MFPCA {MFPCA} | R Documentation |
Multivariate functional principal component analysis for functions on different (dimensional) domains
Description
This function calculates a multivariate functional principal component
analysis (MFPCA) based on i.i.d. observations x_1, \ldots, x_N
of a
multivariate functional data-generating process X = (X^{(1)}, \ldots
X^{(p)})
with elements X^{(j)} \in
L^2(\mathcal{T}_j)
defined on a domain
\mathcal{T}_j \subset IR^{d_j}
. In particular, the
elements can be defined on different (dimensional) domains. The results
contain the mean function, the estimated multivariate functional principal
components \hat \psi_1, \ldots, \hat \psi_M
(having the same structure
as x_i
), the associated eigenvalues \hat \nu_1 \geq \ldots \geq
\hat \nu_M > 0
and the individual scores \hat \rho_{im} =
\widehat{<x_i, \psi_m>}
. Moreover,
estimated trajectories for each observation based on the truncated
Karhunen-Loeve representation
\hat x_i = \sum_{m = 1}^M \hat \rho_{im}
\hat \psi_m
are given
if desired (fit = TRUE
). The implementation of the observations
x_i = (x_i^{(1)}, \ldots , x_i^{(p)}),~ i = 1 , \ldots, N
, the mean function and
multivariate functional principal components \hat \psi_1, \ldots, \hat
\psi_M
uses the multiFunData
class, which is defined
in the package funData.
Usage
MFPCA(
mFData,
M,
uniExpansions,
weights = rep(1, length(mFData)),
fit = FALSE,
approx.eigen = FALSE,
bootstrap = FALSE,
nBootstrap = NULL,
bootstrapAlpha = 0.05,
bootstrapStrat = NULL,
verbose = options()$verbose
)
Arguments
mFData |
A |
M |
The number of multivariate functional principal components to calculate. |
uniExpansions |
A list characterizing the (univariate) expansion that is calculated for each element. See Details. |
weights |
An optional vector of weights, defaults to |
fit |
Logical. If |
approx.eigen |
Logical. If |
bootstrap |
Logical. If |
nBootstrap |
The number of bootstrap iterations to use. Defaults to
|
bootstrapAlpha |
A vector of numerics (or a single number) giving the
significance level for bootstrap intervals. Defaults to |
bootstrapStrat |
A stratification variable for bootstrap. Must be a
factor of length |
verbose |
Logical. If |
Details
Weighted MFPCA
If the elements vary considerably in domain,
range or variation, a weight vector w_1 , \ldots, w_p
can be supplied
and the MFPCA is based on the weighted scalar product
<<f,g>>_w =
\sum_{j = 1}^p w_j \int_{\mathcal{T}_j} f^{(j)}(t) g^{(j)}(t) \mathrm{d}
t
and the
corresponding weighted covariance operator \Gamma_w
.
Bootstrap
If bootstrap = TRUE
, pointwise bootstrap
confidence bands are generated for the multivariate eigenvalues \hat
\nu_1, \ldots, \hat \nu_M
as well as for multivariate functional principal
components \hat \psi_1, \ldots, \hat \psi_M
. The parameter nBootstrap
gives the number of bootstrap
iterations. In each iteration, the observations are resampled on the level of
(multivariate) functions and the whole MFPCA is recalculated. In particular,
if the univariate basis depends on the data (FPCA approaches), basis
functions and scores are both re-estimated. If the basis functions are fixed
(e.g. splines), the scores from the original estimate are used to speed up
the calculations. The confidence bands for the eigenfunctions are calculated
separately for each element as pointwise percentile bootstrap confidence
intervals. Analogously, the confidence bands for the eigenvalues are also
percentile bootstrap confidence bands. The significance level(s) can be
defined by the bootstrapAlpha
parameter, which defaults to 5%. As a
result, the MFPCA
function returns a list CI
of the same length
as bootstrapAlpha
, containing the lower and upper bounds of the
confidence bands for the principal components as multiFunData
objects
of the same structure as mFData
. The confidence bands for the
eigenvalues are returned in a list CIvalues
, containing the upper and
lower bounds for each significance level.
Univariate Expansions
The multivariate functional principal
component analysis relies on a univariate basis expansion for each element
X^{(j)}
. The univariate basis representation is calculated using
the univDecomp
function, that passes the univariate functional
observations and optional parameters to the specific function. The univariate
decompositions are specified via the uniExpansions
argument in the
MFPCA
function. It is a list of the same length as the mFData
object, i.e. having one entry for each element of the multivariate functional
data. For each element, uniExpansion
must specify at least the type of
basis functions to use. Additionally, one may add further parameters. The
following basis representations are supported:
Given basis functions. Then
uniExpansions[[j]] = list(type = "given", functions, scores, ortho)
, wherefunctions
is afunData
object on the same domain asmFData
, containing the given basis functions. The parametersscores
andortho
are optional.scores
is anN x K
matrix containing the scores (or coefficients) of the observed functions for the given basis functions, whereN
is the number of observed functions andK
is the number of basis functions. Note that the scores need to be demeaned to give meaningful results. If scores are not supplied, they are calculated using the given basis functions. The parameterortho
specifies whether the given basis functions are orthonormalorhto = TRUE
or notortho = FALSE
. Ifortho
is not supplied, the functions are treated as non-orthogonal.scores
andortho
are not checked for plausibility, use them at your own risk!Univariate functional principal component analysis. Then
uniExpansions[[j]] = list(type = "uFPCA", nbasis, pve, npc, makePD)
, wherenbasis,pve,npc,makePD
are parameters passed to thePACE
function for calculating the univariate functional principal component analysis.Basis functions expansions from the package fda. Then
uniExpansions[[j]] = list(type = "fda", ...)
, where...
are passed tofunData2fd
, which heavily builds oneval.fd
. If fda is not available, a warning is thrown.Spline basis functions (not penalized). Then
uniExpansions[[j]] = list(type = "splines1D", bs, m, k)
, wherebs,m,k
are passed to the functionsunivDecomp
andunivExpansion
. For two-dimensional tensor product splines, usetype = "splines2D"
.Spline basis functions (with smoothness penalty). Then
uniExpansions[[j]] = list(type = "splines1Dpen", bs, m, k)
, wherebs,m,k
are passed to the functionsunivDecomp
andunivExpansion
. Analogously to the unpenalized case, usetype = "splines2Dpen"
for 2D penalized tensor product splines.-
Cosine basis functions. Use
uniExpansions[[j]] = list(type = "DCT2D", qThresh, parallel)
for functions one two-dimensional domains (images) andtype = "DCT3D"
for 3D images. The calculation is based on the discrete cosine transform (DCT) implemented in the C-libraryfftw3
. If this library is not available, the function will throw a warning.qThresh
gives the quantile for hard thresholding the basis coefficients based on their absolute value. Ifparallel = TRUE
, the coefficients for different images are calculated in parallel.
See univDecomp
and univExpansion
for details.
Value
An object of class MFPCAfit
containing the following
components:
values |
A vector of estimated eigenvalues |
functions |
A
|
scores |
A matrix of dimension |
vectors |
A matrix representing the eigenvectors associated with the combined univariate score vectors. This might be helpful for calculating predictions. |
normFactors |
The normalizing factors used for calculating the multivariate eigenfunctions and scores. This might be helpful when calculation predictions. |
meanFunction |
A multivariate functional
data object, corresponding to the mean function. The MFPCA is applied to
the de-meaned functions in |
fit |
A
|
CI |
A
list of the same length as |
CIvalues |
A list of the same
length as |
References
C. Happ, S. Greven (2018): Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains. Journal of the American Statistical Association, 113(522): 649-659. DOI: doi:10.1080/01621459.2016.1273115
C. Happ-Kurz (2020): Object-Oriented Software for Functional Data. Journal of Statistical Software, 93(5): 1-38. DOI: doi:10.18637/jss.v093.i05
See Also
See Happ-Kurz (2020. doi:10.18637/jss.v093.i05) for a general
introduction to the funData package and it's interplay with
MFPCA. This file also includes a case study on how to use
MFPCA
. Useful functions: multiFunData
,
PACE
, univDecomp
, univExpansion
,
summary
,
plot
,
scoreplot
Examples
oldPar <- par(no.readonly = TRUE)
set.seed(1)
### simulate data (one-dimensional domains)
sim <- simMultiFunData(type = "split", argvals = list(seq(0,1,0.01), seq(-0.5,0.5,0.02)),
M = 5, eFunType = "Poly", eValType = "linear", N = 100)
# MFPCA based on univariate FPCA
uFPCA <- MFPCA(sim$simData, M = 5, uniExpansions = list(list(type = "uFPCA"),
list(type = "uFPCA")))
summary(uFPCA)
plot(uFPCA) # plot the eigenfunctions as perturbations of the mean
scoreplot(uFPCA) # plot the scores
# MFPCA based on univariate spline expansions
splines <- MFPCA(sim$simData, M = 5, uniExpansions = list(list(type = "splines1D", k = 10),
list(type = "splines1D", k = 10)),
fit = TRUE) # calculate reconstruction, too
summary(splines)
plot(splines) # plot the eigenfunctions as perturbations of the mean
scoreplot(splines) # plot the scores
### Compare estimates to true eigenfunctions
# flip to make results more clear
uFPCA$functions <- flipFuns(sim$trueFuns, uFPCA$functions)
splines$functions <- flipFuns(sim$trueFuns, splines$functions)
par(mfrow = c(1,2))
plot(sim$trueFuns[[1]], main = "Eigenfunctions\n1st Element", lwd = 2)
plot(uFPCA$functions[[1]], lty = 2, add = TRUE)
plot(splines$functions[[1]], lty = 3, add = TRUE)
plot(sim$trueFuns[[2]], main = "Eigenfunctions\n2nd Element", lwd = 2)
plot(uFPCA$functions[[2]], lty = 2, add = TRUE)
plot(splines$functions[[2]], lty = 3, add = TRUE)
legend("bottomleft", c("True", "uFPCA", "splines"), lty = 1:3, lwd = c(2,1,1))
# Test reconstruction for the first 10 observations
plot(sim$simData[[1]], obs = 1:10, main = "Reconstruction\n1st Element", lwd = 2)
plot(splines$fit[[1]], obs = 1:10, lty = 2, col = 1, add = TRUE)
plot(sim$simData[[2]], obs = 1:10, main = "Reconstruction\n2nd Element", lwd = 2)
plot(splines$fit[[2]], obs = 1:10, lty = 2, col = 1, add = TRUE)
legend("bottomleft", c("True", "Reconstruction"), lty = c(1,2), lwd = c(2,1))
# MFPCA with Bootstrap-CI for the first 2 eigenfunctions
### ATTENTION: Takes long
splinesBoot <- MFPCA(sim$simData, M = 2, uniExpansions = list(list(type = "splines1D", k = 10),
list(type = "splines1D", k = 10)),
bootstrap = TRUE, nBootstrap = 100, bootstrapAlpha = c(0.05, 0.1), verbose = TRUE)
summary(splinesBoot)
plot(splinesBoot$functions[[1]], ylim = c(-2,1.5))
plot(splinesBoot$CI$alpha_0.05$lower[[1]], lty = 2, add = TRUE)
plot(splinesBoot$CI$alpha_0.05$upper[[1]], lty = 2, add = TRUE)
plot(splinesBoot$CI$alpha_0.1$lower[[1]], lty = 3, add = TRUE)
plot(splinesBoot$CI$alpha_0.1$upper[[1]], lty = 3, add = TRUE)
abline(h = 0, col = "gray")
plot(splinesBoot$functions[[2]], ylim = c(-1,2.5))
plot(splinesBoot$CI$alpha_0.05$lower[[2]], lty = 2, add = TRUE)
plot(splinesBoot$CI$alpha_0.05$upper[[2]], lty = 2, add = TRUE)
plot(splinesBoot$CI$alpha_0.1$lower[[2]], lty = 3, add = TRUE)
plot(splinesBoot$CI$alpha_0.1$upper[[2]], lty = 3, add = TRUE)
abline(h = 0, col = "gray")
legend("topleft", c("Estimate", "95% CI", "90% CI"), lty = 1:3, lwd = c(2,1,1))
# Plot 95% confidence bands for eigenvalues
plot(1:2, splinesBoot$values, pch = 20, ylim = c(0, 1.5),
main = "Estimated eigenvalues with 95% CI",
xlab = "Eigenvalue no.", ylab = "")
arrows(1:2, splinesBoot$CIvalues$alpha_0.05$lower,
1:2, splinesBoot$CIvalues$alpha_0.05$upper,
length = 0.05, angle = 90, code = 3)
points(1:2, sim$trueVals[1:2], pch = 20, col = 4)
legend("topright", c("Estimate", "True value"), pch = 20, col = c(1,4))
### simulate data (two- and one-dimensional domains)
### ATTENTION: Takes long
set.seed(2)
sim <- simMultiFunData(type = "weighted",
argvals = list(list(seq(0,1,0.01), seq(-1,1,0.02)), list(seq(-0.5,0.5,0.01))),
M = list(c(4,5), 20), eFunType = list(c("Fourier", "Fourier"), "Poly"),
eValType = "exponential", N = 150)
# MFPCA based on univariate spline expansions (for images) and univariate FPCA (for functions)
pca <- MFPCA(sim$simData, M = 10,
uniExpansions = list(list(type = "splines2D", k = c(10,12)),
list(type = "uFPCA")))
summary(pca)
plot(pca) # plot the eigenfunctions as perturbations of the mean
scoreplot(pca) # plot the scores
### Compare to true eigenfunctions
# flip to make results more clear
pca$functions <- flipFuns(sim$trueFuns[1:10], pca$functions)
par(mfrow = c(5,2), mar = rep(2,4))
for(m in 2:6) # for m = 1, image.plot (used in plot(funData)) produces an error...
{
plot(sim$trueFuns[[1]], main = paste("True, m = ", m), obs = m)
plot(pca$functions[[1]], main = paste("Estimate, m = ", m), obs = m)
}
par(mfrow = c(1,1))
plot(sim$trueFuns[[2]], main = "Eigenfunctions (2nd element)", lwd = 2, obs= 1:5)
plot(pca$functions[[2]], lty = 2, add = TRUE, obs= 1:5)
legend("bottomleft", c("True", "MFPCA"), lty = 1:2, lwd = c(2,1))
par(oldPar)