cov_hall {registr} | R Documentation |
Covariance estimation after Hall et al. (2008)
Description
Internal function for the estimation of the covariance matrix of the latent
process using the approach of Hall et al. (2008). Used in the
two-step GFPCA approach implemented in gfpca_twoStep.
This function is an adaptation of the implementation of Jan
Gertheiss and Ana-Maria Staicu for Gertheiss et al. (2017), with a focus on
better memory (RAM) efficiency in large data settings.
Usage
cov_hall(
Y,
index_evalGrid,
Kt = 25,
Kc = 10,
family = "gaussian",
diag_epsilon = 0.01,
make_pd = TRUE
)
Arguments
Y |
Data frame. Should contain the variables id, value, and index. |
index_evalGrid |
Grid for the evaluation of the covariance structure. |
Kt |
Number of P-spline basis functions for the estimation of the marginal mean. Defaults to 25. |
Kc |
Number of marginal P-spline basis functions for smoothing the covariance surface. Defaults to 10. |
family |
One of |
diag_epsilon |
Small constant to which diagonal elements of the covariance matrix are set if they are smaller. Defaults to 0.01. |
make_pd |
Indicator if positive (semi-)definiteness of the returned
latent covariance should be ensured via |
Details
The implementation deviates from the algorithm described in Hall et al. (2008) in one crucial step: we compute the crossproducts of centered observations and smooth the surface of these crossproducts directly, instead of computing and smoothing the surface of crossproducts of uncentered observations and subsequently subtracting the (crossproducts of the) mean function. The former seems to yield smoother eigenfunctions and fewer non-positive-definite covariance estimates.
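The centering step can be illustrated with a minimal base R sketch. It is not the package code: the toy data, the column name centered, and the use of a pointwise average as a stand-in for the smoothed marginal mean are assumptions made for illustration only.

```r
# Toy long-format data with the variables named in the Arguments section.
set.seed(1)
Y <- data.frame(
  id    = rep(1:3, each = 4),
  index = rep(1:4, times = 3),
  value = rnorm(12)
)

# Stand-in for the smoothed marginal mean: the pointwise average per index.
# (Assumption: the real function estimates the mean with a P-spline smooth.)
Y$centered <- Y$value - ave(Y$value, Y$index)

# All within-subject pairs (s, t) and the crossproducts of centered values.
pairs_by_id <- lapply(split(Y, Y$id), function(d) {
  grid <- expand.grid(i = seq_len(nrow(d)), j = seq_len(nrow(d)))
  data.frame(
    id = d$id[1],
    s  = d$index[grid$i],
    t  = d$index[grid$j],
    cp = d$centered[grid$i] * d$centered[grid$j]
  )
})
crossprods <- do.call(rbind, pairs_by_id)

# Raw (unsmoothed) covariance on the observed grid: mean crossproduct per (s, t).
raw_cov <- tapply(crossprods$cp, list(crossprods$s, crossprods$t), mean)
```

In the function itself the crossproduct surface is smoothed rather than averaged cell by cell; the sketch only shows which quantity enters that smoothing step.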
If the data Y or the crossproduct matrix contains more than 100,000 rows or elements, respectively, the estimation of the marginal mean or the smoothing of the covariance surface is performed with the discretization-based estimation algorithm in bam rather than the gam estimation algorithm.
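A size-based switch of this kind might look as follows for the marginal mean. This is a sketch under assumptions: the helper fit_marginal_mean, its threshold argument, the model formula, and the choice of smoothing-parameter methods are hypothetical and are not taken from the registr source. Only mgcv::gam, mgcv::bam, and the discrete = TRUE option of bam are existing mgcv features.

```r
library(mgcv)

# Hypothetical helper (not part of registr): pick the fitting routine by size.
fit_marginal_mean <- function(Y, Kt = 25, threshold = 1e5) {
  if (nrow(Y) > threshold) {
    # Large data: bam with discretized covariates, which requires fREML.
    bam(value ~ s(index, bs = "ps", k = Kt), data = Y,
        method = "fREML", discrete = TRUE)
  } else {
    # Smaller data: standard gam fit.
    gam(value ~ s(index, bs = "ps", k = Kt), data = Y, method = "REML")
  }
}

# Toy data; the threshold is lowered in the second call to force the bam branch.
set.seed(1)
Y <- data.frame(index = runif(300))
Y$value <- sin(2 * pi * Y$index) + rnorm(300, sd = 0.3)

fit_small <- fit_marginal_mean(Y, Kt = 10)
fit_large <- fit_marginal_mean(Y, Kt = 10, threshold = 100)
```

Both fits return objects that work with predict(), so downstream code does not need to know which branch was taken.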
Value
Covariance matrix with dimension length(index_evalGrid) x length(index_evalGrid).
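The roles of diag_epsilon and make_pd can be sketched on a toy matrix in base R. The eigenvalue truncation below is an assumption chosen for illustration; it is not necessarily the method the function uses to ensure positive (semi-)definiteness, and the order of the two steps is likewise assumed.

```r
# Toy symmetric "covariance" estimate that is not positive semi-definite
# and has one diagonal element below the floor.
cov_raw <- matrix(c(0.005, 0.9,
                    0.9,   1.0), nrow = 2, byrow = TRUE)
diag_epsilon <- 0.01

# Step 1: raise diagonal elements that are smaller than diag_epsilon.
cov_floored <- cov_raw
d <- diag(cov_floored)
d[d < diag_epsilon] <- diag_epsilon
diag(cov_floored) <- d

# Step 2 (illustrative stand-in for make_pd = TRUE): truncate negative
# eigenvalues at zero and rebuild the matrix.
e <- eigen(cov_floored, symmetric = TRUE)
vals <- pmax(e$values, 0)
cov_psd <- e$vectors %*% diag(vals, nrow = length(vals)) %*% t(e$vectors)
```

Note that the truncation changes the diagonal again, so a real implementation has to decide in which order to apply the two corrections.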
Author(s)
Alexander Bauer alexander.bauer@stat.uni-muenchen.de and Fabian Scheipl, based on work by Jan Gertheiss and Ana-Maria Staicu
References
Hall, P., Müller, H. G., & Yao, F. (2008). Modelling sparse generalized longitudinal observations with latent Gaussian processes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4), 703–723.
Gertheiss, J., Goldsmith, J., & Staicu, A. M. (2017). A note on modeling sparse exponential-family functional response curves. Computational Statistics & Data Analysis, 105, 46–52.
Examples
data(growth_incomplete)
index_grid = c(1.25, seq(from = 2, to = 18, by = 1))
cov_matrix = registr:::cov_hall(growth_incomplete, index_evalGrid = index_grid)