gpdpgrow {growfunctions} | R Documentation |
Bayesian non-parametric dependent Gaussian process model for time-indexed functional data
Description
Estimates a collection of time-indexed functions under Gaussian process (GP) formulations, where a Dirichlet process (DP) mixture allows sub-groupings of the functions to share the same GP covariance parameters. The GP formulation supports any number of additive GP covariance terms, which may express multiple trend components, multiple seasonality components, or both.
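To make the data structure concrete, the following base R sketch simulates an N x T matrix of noisy functions in which units assigned to the same cluster share a GP covariance parameter. It is only an illustration under stated assumptions: the rational quadratic parameterization, the cluster labels, the length-scales, the noise level and the jitter value are all hypothetical and are not taken from the package.

```r
set.seed(1)

N     <- 12              # number of experimental units (hypothetical)
T_len <- 24              # number of time points per unit (hypothetical)
time_points <- seq_len(T_len)

# Hypothetical fixed clustering of the N units into 3 groups
s <- rep(1:3, length.out = N)

# Hypothetical length-scale for each cluster; units in the same
# cluster share the same GP covariance parameter
ell <- c(2, 6, 12)

# One common form of the rational quadratic covariance (an assumption;
# the package may parameterize it differently)
rq_cov <- function(t, ell, alpha = 2, kappa = 1) {
  d2 <- outer(t, t, function(a, b) (a - b)^2)
  kappa * (1 + d2 / (2 * alpha * ell^2))^(-alpha)
}

jitter <- 0.01           # small diagonal term for numerical stability
y <- matrix(NA_real_, N, T_len)

for (i in seq_len(N)) {
  K <- rq_cov(time_points, ell[s[i]]) + diag(jitter, T_len)
  # chol() returns upper-triangular U with t(U) %*% U = K,
  # so t(U) %*% z is a draw from N(0, K)
  bb_i <- drop(crossprod(chol(K), rnorm(T_len)))
  y[i, ] <- bb_i + rnorm(T_len, sd = 0.3)   # add observation noise
}

dim(y)
```

A matrix built this way has the N x T layout expected by the y argument, with one row per experimental unit.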
Usage
gpdpgrow(
y,
ipr,
time_points,
gp_cov,
sn_order,
jitter,
gp_shape,
gp_rate,
noise_shape,
noise_rate,
dp_shape,
dp_rate,
M_init,
lower,
upper,
sub_size,
w_star,
w,
n.iter,
n.burn,
n.thin,
n.tune,
progress,
b_move,
cluster,
s
)
Arguments
y |
A multivariate continuous response, specified as an N x T matrix, where |
ipr |
An optional input vector of inclusion probabilities for each observation unit in the case
the observed data were acquired through an informative sampling design, so that unbiased
inference about the population requires adjustments to the observed sample. Defaults to
|
time_points |
Inputs a vector of common time points at which the collections of functions were
observed (with the possibility of intermittent missingness). The length of |
gp_cov |
A vector of length |
sn_order |
A vector of length |
jitter |
A scalar numerical value added to the diagonal elements of the T x T GP covariance
matrix to stabilize computation. Defaults to |
gp_shape |
The shape parameter of the Gamma base distribution for the DP prior on
the P x N matrix of GP covariance parameters (where P
denotes the number of parameters for each of the N experimental units).
Defaults to |
gp_rate |
The rate parameter of the Gamma base distribution on GP covariance parameters.
Defaults to |
noise_shape |
The shape parameter of the Gamma base distribution on |
noise_rate |
The rate parameter of the Gamma base distribution on |
dp_shape |
The shape parameter for the Gamma prior on the DP concentration parameter,
|
dp_rate |
The rate parameter for the Gamma prior on the DP concentration parameter,
|
M_init |
Starting number of clusters of |
lower |
The lower end of the range to be used in conditionally sampling the GP covariance
parameters ( |
upper |
The upper end of the range to be used in conditionally sampling the GP covariance
parameters ( |
sub_size |
Integer vector whose length, |
w_star |
Integer value denoting the number of cluster locations to sample ahead of
observations in the auxiliary Gibbs sampler used to sample the number of clusters
and associated cluster assignments. A higher value reduces sampling auto-correlation, but
increases computational burden. Defaults to |
w |
Numeric value denoting the step width used to construct the interval from
which to draw a sample for each GP covariance parameter in the slice sampler. This
value is adaptively updated in the sampler tuning stage for each parameter to be equal
to the difference in the 0.95 and 0.05 sample quantiles for each of 5 block updates.
Defaults to |
n.iter |
Total number of MCMC iterations. |
n.burn |
Number of MCMC iterations to discard.
|
n.thin |
Gap between successive sampling iterations to save. |
n.tune |
Number of iterations (before ergodic chain instantiated) to adapt |
progress |
A boolean value denoting whether to display a progress bar during model execution.
Defaults to |
b_move |
A boolean value denoting whether to sample the GP function, |
cluster |
A boolean value denoting whether to employ a DP mixture model over the set of GP functions or
to use a GP model with no clustering of the covariance function parameters.
Defaults to |
s |
An N x 1 integer vector that inputs a fixed clustering, rather than sampling it.
Defaults to |
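The description of w states that the step width is tuned to the difference between the 0.95 and 0.05 sample quantiles within each block update. The base R sketch below shows only that quantile calculation. The tuning draws are simulated from a Gamma distribution purely for illustration, and the block size of 100 is a hypothetical choice; how the package collects draws and applies the updated width internally is not shown here.

```r
set.seed(42)

# Hypothetical tuning draws for a single GP covariance parameter
draws <- rgamma(500, shape = 2, rate = 1)

# Split the draws into 5 consecutive blocks of 100 (block size is assumed)
blocks <- split(draws, rep(1:5, each = 100))

# Candidate step width per block: 0.95 quantile minus 0.05 quantile
w_by_block <- vapply(
  blocks,
  function(b) unname(diff(quantile(b, probs = c(0.05, 0.95)))),
  numeric(1)
)

w_by_block
```

A width on the scale of the central 90% range of the draws lets the slice sampler bracket most of the conditional distribution in a single step.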
Value
An S3 gpdpgrow object, for which many methods are available to return and view results. Generic functions that may be applied
to an object, res, of class gpdpgrow include:
samples(res) |
contains ( |
resid(res) |
contains the model residuals. |
Note
The intended focus of this package is data composed of observed noisy functions (each of
length T) for a set of experimental units, where the functions may express dependence
among the experimental units.
Author(s)
Terrance Savitsky tds151@gmail.com, Daniell Toth danielltoth@yahoo.com
References
T. D. Savitsky and D. Toth (2014) Bayesian Non-parametric Models for Collections of Time-indexed Functions. Submitted to: JRSS Series A (Statistics in Society).
T. D. Savitsky (2014) Bayesian Non-parametric Functional Mixture Estimation for Time-indexed data. Submitted to: Annals of Applied Statistics.
T. D. Savitsky (2014) Bayesian Non-Parametric Mixture Estimation for Time-Indexed Functional Data for R. Submitted to: Journal of Statistical Software.
See Also
Examples
{
library(growfunctions)
## load the monthly employment count data for a collection of
## U.S. states from the Current Population Survey (cps)
data(cps)
## subselect the columns of the N x T matrix, y, associated with
## the years 2011 - 2013 to examine the state-level employment
## levels during the "great recession"
y_short <- cps$y[,(cps$yr_label %in% c(2011:2013))]
## uses the default setting of a single "rational quadratic" covariance term.
## The run below is kept very short for illustration only; run for
## 500 iterations, with half discarded as burn-in, to
## obtain a more useful result.
res_gp <- gpdpgrow(y = y_short,
                   n.iter = 4,
                   n.burn = 1,
                   n.thin = 1,
                   n.tune = 0)
## Two plots of estimated functions,
## 1. faceted by cluster
## 2. fitted functions vs noisy observations
## first plot will plot estimated denoised function,
## bb_i, for a single (randomly-selected) "state"
fit_plots_gp <- cluster_plot(object = res_gp,
                             units_name = "state",
                             units_label = cps$st,
                             single_unit = TRUE,
                             credible = TRUE)
## second plot will randomly select 6 states
## and plot their estimated denoised functions, bb_i.
## with setting "single_unit = FALSE".
## (Option "num_plot" may be set to plot
## any integer number of
## randomly-selected units.)
fit_plots_gp <- cluster_plot(object = res_gp,
                             units_name = "state",
                             units_label = cps$st,
                             single_unit = FALSE,
                             credible = TRUE)
}
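The Value section lists samples() and resid() as generics for a gpdpgrow object. The sketch below refits the same short model and inspects what those two calls return. It assumes the growfunctions package is installed, and it uses str() rather than naming specific components, since the exact contents of the returned objects are not described on this page. The four-iteration run is far too short for inference and serves only to show the calls.

```r
# Guarded so the snippet is skipped when growfunctions is not installed
if (requireNamespace("growfunctions", quietly = TRUE)) {
  library(growfunctions)
  data(cps)

  # same subset of years as in the example above
  y_short <- cps$y[, cps$yr_label %in% 2011:2013]

  # very short run, for illustration only
  res_gp <- gpdpgrow(y = y_short,
                     n.iter = 4,
                     n.burn = 1,
                     n.thin = 1,
                     n.tune = 0)

  # sampled parameter values; show only the top-level structure
  draws <- samples(res_gp)
  str(draws, max.level = 1)

  # model residuals
  r <- resid(res_gp)
  str(r)
}
```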