cluster_plot {growfunctions}    R Documentation

Plot estimated functions for experimental units faceted by cluster versus data to assess fit.

Description

Takes as input the object returned by gpdpgrow() or gmrfdpgrow().

Usage

cluster_plot(
  object,
  N_clusters = NULL,
  time_points = NULL,
  units_name = "unit",
  units_label = NULL,
  date_field = NULL,
  x.axis.label = NULL,
  y.axis.label = NULL,
  smoother = TRUE,
  sample_rate = 1,
  single_unit = FALSE,
  credible = FALSE,
  num_plot = NULL
)

Arguments

object

A gpdpgrow or gmrfdpgrow object.

N_clusters

Denotes the number of largest sized (in terms of membership) clusters to plot. Defaults to all clusters.

time_points

Inputs a vector of common time points at which the collections of functions were observed (with the possibility of intermittent missingness). The length of time_points should be equal to the number of columns in the data matrix, y. Defaults to time_points = 1:ncol(y).

units_name

The plot label for observation units. Defaults to units_name = "unit".

units_label

A vector of labels to apply to the observation units with length equal to the number of unique units. Defaults to sequential numeric values as input with data, y.

date_field

A vector of Date values for labeling the x-axis tick marks. Defaults to 1:T, where T is the number of time points (columns of y).

x.axis.label

Text label for x-axis. Defaults to "time".

y.axis.label

Text label for y-axis. Defaults to "function values".

smoother

A scalar boolean input indicating whether to co-plot a smoother line through the functions in each cluster.

sample_rate

A numeric value in (0,1] indicating the proportion of functions to randomly sample within each cluster to address over-plotting. Defaults to 1.

single_unit

A scalar boolean indicating whether to plot the fitted vs data curve for only a single experimental unit (versus a random sample of 6). Defaults to single_unit = FALSE.

credible

A scalar boolean indicating whether to plot 95 percent credible intervals for estimated functions, bb, when plotting fitted functions versus data. Defaults to credible = FALSE.

num_plot

A scalar integer indicating how many randomly-selected functions to plot (each in its own plot panel) in the plot of functions versus the observed time series in the case that single_unit == TRUE. Defaults to num_plot = 6.
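
The length requirements stated for time_points and units_label can be checked before calling cluster_plot(). The following is a minimal base R sketch, not taken from the package. The matrix y, its dimensions, the start date and the "unit_" label prefix are all hypothetical stand-ins. It also assumes that date_field should hold one Date per column of y, which the stated default of 1:T suggests but does not confirm.

```r
## Hypothetical N x T data matrix standing in for y (values are illustrative only)
set.seed(1)
N     <- 5
T_len <- 24
y     <- matrix(rnorm(N * T_len), nrow = N, ncol = T_len)

## time_points: one entry per column of y (mirrors the documented default)
time_points <- seq_len(ncol(y))

## date_field: assumed to need one Date per column of y;
## here monthly dates from an assumed start date
date_field <- seq(as.Date("2008-01-01"), by = "month", length.out = ncol(y))

## units_label: one label per unique unit (row of y)
units_label <- paste0("unit_", seq_len(nrow(y)))

## Check the lengths before passing these vectors to cluster_plot()
stopifnot(length(time_points) == ncol(y),
          length(date_field)  == ncol(y),
          length(units_label) == nrow(y))
```

These vectors would then be supplied through the time_points, date_field and units_label arguments, respectively.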

Value

A list object containing the plot of estimated functions, faceted by cluster, and the associated data.frame object.

p.cluster

A ggplot2 plot object.

dat.cluster

A data.frame object used to generate p.cluster.
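
The returned list can be inspected element by element. The helper below, inspect_cluster_plot(), is hypothetical and not part of growfunctions. It is a small sketch that assumes only the two element names documented above, p.cluster and dat.cluster.

```r
## Hypothetical helper (not part of growfunctions): render the plot and
## return the first n rows of the data.frame used to build it.
inspect_cluster_plot <- function(res, n = 6) {
  stopifnot(is.list(res),
            all(c("p.cluster", "dat.cluster") %in% names(res)),
            is.data.frame(res$dat.cluster))
  print(res$p.cluster)             # printing a ggplot2 object draws it
  utils::head(res$dat.cluster, n)  # peek at the underlying plot data
}
```

With the object created in the Examples section, this might be called as inspect_cluster_plot(fit_plots_gmrf).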

Author(s)

Terrance Savitsky tds151@gmail.com

See Also

gpdpgrow, gmrfdpgrow

Examples

{
library(growfunctions)

## load the monthly employment count data for a collection of 
## U.S. states from the Current 
## Population Survey (cps)
data(cps)
## subselect the columns of N x T, y, associated with 
## the years 2008 - 2013
## to examine the state level employment levels 
## during the "great recession"
y_short             <- cps$y[,(cps$yr_label %in% c(2008:2013))]

## Run the DP mixture of iGMRF's to estimate posterior 
## distributions for model parameters
## Under default RW2(kappa) = order 2 trend 
## precision term
res_gmrf            <- gmrfdpgrow(y = y_short, 
                                     n.iter = 40, 
                                     n.burn = 20, 
                                     n.thin = 1) 
                                     
## 2 plots of estimated functions: 1. faceted by cluster;
## 2. fit vs. data for experimental units,
## for a group of randomly-selected functions
fit_plots_gmrf      <- cluster_plot( object = res_gmrf, 
                                     units_name = "state", 
                                     units_label = cps$st, 
                                     single_unit = FALSE, 
                                     credible = TRUE )   
}

[Package growfunctions version 0.16 Index]