| flm_est {goffda} | R Documentation |
Estimation of functional linear models
Description
Estimation of the linear operator relating a
functional predictor X with a functional response Y in the
linear model
Y(t) = \int_a^b \beta(s, t) X(s)\,\mathrm{d}s + \varepsilon(t),
where X is a random variable in the Hilbert space of
square-integrable functions on [a, b], L^2([a, b]),
Y and \varepsilon are random variables
in L^2([c, d]), and s \in [a, b] and t \in [c, d].
The linear, Hilbert–Schmidt, integral operator is parametrized by
the bivariate kernel \beta \in L^2([a, b]) \otimes
L^2([c, d]). Its estimation is done through the truncated expansion
of \beta in the tensor product of the data-driven
bases of the Functional Principal Components (FPC) of
X and Y, and through the fitting of the resulting multivariate
linear model. The FPC basis for X is truncated to p
components, while the FPC basis for Y is truncated to q
components. Automatic selection of p and q is detailed below.
The particular cases in which either X or Y is a
constant function give a scalar predictor or a scalar response, respectively.
The simple linear model arises if both X and Y are scalar,
for which \beta is a constant.
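As a worked sketch of how the truncated expansion leads to a multivariate linear model, assume X and Y are centered and write \Psi_j and \Phi_k for the FPC of X and Y, b_{jk} for the coefficients of \beta, and x_{ij}, y_{ik}, e_{ik} for the scores. These symbols are notation assumed here for illustration and are not taken from the package.

```latex
\beta(s, t) \approx \sum_{j=1}^{p} \sum_{k=1}^{q} b_{jk}\, \Psi_j(s)\, \Phi_k(t),
\qquad
x_{ij} = \int_a^b X_i(s)\, \Psi_j(s)\,\mathrm{d}s,
\qquad
y_{ik} = \int_c^d Y_i(t)\, \Phi_k(t)\,\mathrm{d}t.

% Plugging the expansion into the model and projecting on \Phi_k
% (the \Phi_k are orthonormal) gives, for i = 1, \ldots, n and k = 1, \ldots, q,
y_{ik} \approx \sum_{j=1}^{p} x_{ij}\, b_{jk} + e_{ik},
\qquad
e_{ik} = \int_c^d \varepsilon_i(t)\, \Phi_k(t)\,\mathrm{d}t,

% that is, in matrix form, with \mathbf{B} = (b_{jk}) of size p \times q:
\mathbf{Y}_{n \times q} \approx \mathbf{X}_{n \times p}\, \mathbf{B}_{p \times q} + \mathbf{E}_{n \times q}.
```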
Usage
flm_est(X, Y, est_method = "fpcr_l1s", p = NULL, q = NULL,
thre_p = 0.99, thre_q = 0.99, lambda = NULL, X_fpc = NULL,
Y_fpc = NULL, compute_residuals = TRUE, centered = FALSE,
int_rule = "trapezoid", cv_verbose = FALSE, ...)
Arguments
X, Y |
samples of functional/scalar predictors and functional/scalar
response. Either |
est_method |
either |
p, q |
index vectors indicating the specific FPC to be
considered for the truncated bases expansions of |
thre_p, thre_q |
thresholds for the proportion of variance
that is explained, at least, by the first |
lambda |
regularization parameter |
X_fpc, Y_fpc |
FPC decompositions of |
compute_residuals |
whether to compute the fitted values |
centered |
flag to indicate if |
int_rule |
quadrature rule for approximating the definite
unidimensional integral: trapezoidal rule ( |
cv_verbose |
flag to display information about the estimation procedure
(passed to |
... |
further parameters to be passed to |
Details
flm_est deals seamlessly with either functional or scalar inputs
for the predictor and response. In the case of scalar inputs, the
corresponding dimension-related arguments (p, q,
thre_p or thre_q) are ignored since, in these cases, either
p = 1 or q = 1.
The function translates the functional linear model into a multivariate
model with multivariate response and then estimates the
p \times q matrix of coefficients of \beta in the
tensor basis of the FPC of X and Y. The following estimation
methods are implemented:
- "fpcr": Functional Principal Components Regression (FPCR); see details in Ramsay and Silverman (2005).
- "fpcr_l2": FPCR, with ridge penalty on the associated multivariate linear model.
- "fpcr_l1": FPCR, with lasso penalty on the associated multivariate linear model.
- "fpcr_l1s": FPCR, with FPC selected by lasso regression on the associated multivariate linear model.
The last three methods are explained in García-Portugués et al. (2021).
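The following base R sketch illustrates the idea behind "fpcr" and "fpcr_l2" on discretized curves. It is not the goffda implementation. It assumes an equispaced grid, uses prcomp (which ignores quadrature weights) as a stand-in for the FPC decomposition, and uses a plain ridge penalty whose scaling may differ from the one used by the package. All object names are hypothetical.

```r
set.seed(1)
n <- 60   # sample size
m <- 101  # grid points per curve (equispaced grid assumed)

# Hypothetical discretized curves: rows are observations
X <- t(replicate(n, cumsum(rnorm(m, sd = 0.1))))
Y <- 2 * X + t(replicate(n, cumsum(rnorm(m, sd = 0.02))))

# Truncation levels, fixed by hand in this sketch
p <- 3
q <- 2

# Principal components of the centered curves (stand-in for the FPC)
pc_X <- prcomp(X, center = TRUE)
pc_Y <- prcomp(Y, center = TRUE)
X_scores <- pc_X$x[, 1:p, drop = FALSE]  # n x p
Y_scores <- pc_Y$x[, 1:q, drop = FALSE]  # n x q

# FPCR idea: least squares in the multivariate model Y_scores ~ X_scores
B_ols <- solve(crossprod(X_scores), crossprod(X_scores, Y_scores))

# Ridge idea: same model with an L2 penalty (lambda chosen arbitrarily here)
lambda <- 1
B_ridge <- solve(crossprod(X_scores) + lambda * diag(p),
                 crossprod(X_scores, Y_scores))

# Coefficients mapped back to the grid through the tensor basis. This is the
# kernel of a plain matrix product; an integral kernel would need rescaling
# by the grid step
Beta_grid <- pc_X$rotation[, 1:p] %*% B_ols %*% t(pc_Y$rotation[, 1:q])

# Fitted scores of the response
Y_hat_scores <- X_scores %*% B_ols
```

Because the principal component scores are uncorrelated, the ridge estimate in this sketch shrinks each row of B_ols by a factor that depends on the variance of the corresponding predictor score.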
The p FPC of X and q FPC of Y are determined
as follows:
- If p = NULL, then p is set as p_thre <- 1:j_thre, where j_thre is the index j of the first FPC of X for which the cumulative proportion of explained variance is greater than thre_p. If p != NULL, then p_thre <- p.
- If q = NULL, then the same procedure is followed with thre_q, resulting in q_thre.
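The threshold rule above can be sketched in base R as follows. This is an illustration and not the package code: prcomp on discretized curves stands in for the FPC decomposition, and the simulated curves and object names are hypothetical.

```r
set.seed(1)
n <- 50
m <- 101

# Hypothetical discretized predictor curves (rows are observations)
X <- t(replicate(n, cumsum(rnorm(m, sd = 0.1))))

# Variance explained by each principal component
pc <- prcomp(X, center = TRUE)
cum_var <- cumsum(pc$sdev^2) / sum(pc$sdev^2)

# First index whose cumulative proportion of explained variance exceeds thre_p
thre_p <- 0.99
j_thre <- which(cum_var > thre_p)[1]
p_thre <- 1:j_thre
```

The same computation applied to the response curves with thre_q would give q_thre.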
Once p_thre and q_thre have been obtained, the methods
"fpcr_l1" and "fpcr_l1s" perform a second selection
of the FPC that are effectively considered in the estimation of \beta.
This subset of FPC (of p_thre) is encoded in p_hat. No further
selection of FPC is done for the methods "fpcr" and "fpcr_l2".
The flag compute_residuals controls if Y_hat,
Y_hat_scores, residuals, and residuals_scores are
computed. If FALSE, they are set to NULL. Y_hat equals
\hat Y_i(t) = \int_a^b \hat \beta(s, t) X_i(s) \,\mathrm{d}s and residuals
stands for \hat \varepsilon_i(t) = Y_i(t) - \hat Y_i(t), both for
i = 1, \ldots, n. Y_hat_scores and
residuals_scores
are the n\times q matrices of coefficients (or scores) of these
functions in the FPC of Y.
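As a minimal sketch of the fitted-values formula, the integral defining \hat Y_i(t) can be approximated on a grid with the trapezoidal rule. The kernel, curves, and object names below are hypothetical and chosen so that the result can be checked by hand; this is not how goffda computes Y_hat internally.

```r
set.seed(1)
n <- 5
s_grid <- seq(0, 1, length.out = 201)  # grid on [a, b]
t_grid <- seq(0, 1, length.out = 101)  # grid on [c, d]

# Trapezoidal weights on s_grid (also valid for non-equispaced grids)
h <- diff(s_grid)
w <- c(h[1], h[-1] + h[-length(h)], h[length(h)]) / 2

# Hypothetical kernel beta(s, t) = s * t evaluated on the grid
Beta <- outer(s_grid, t_grid)

# Hypothetical constant predictor curves X_i(s) = a_i
a <- rnorm(n)
X <- outer(a, rep(1, length(s_grid)))

# Y_hat_i(t) = int beta(s, t) X_i(s) ds, approximated by a weighted sum over s
Y_hat <- X %*% (w * Beta)

# Responses with noise, and the corresponding residuals
Y <- Y_hat + matrix(rnorm(n * length(t_grid), sd = 0.1), nrow = n)
residuals <- Y - Y_hat
```

For this choice of kernel and curves, the exact integral is a_i * t / 2, which the trapezoidal rule reproduces up to rounding because the integrand is linear in s.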
Missing values in X and Y are automatically removed.
Value
A list with the following entries:
Beta_hat |
estimated |
Beta_hat_scores |
the matrix of coefficients of |
H_hat |
hat matrix of the associated fitted multivariate
linear model, a matrix of size |
p_thre |
index vector indicating the FPC of |
p_hat |
index vector of the FPC considered by the methods
|
q_thre |
index vector indicating the FPC of |
est_method |
the estimation method employed. |
Y_hat |
fitted values, either an |
Y_hat_scores |
the matrix of coefficients of |
residuals |
residuals of the fitted model, either an
|
residuals_scores |
the matrix of coefficients of
|
X_fpc, Y_fpc |
FPC of |
lambda |
regularization parameter |
cv |
cross-validation object returned by
|
Author(s)
Eduardo García-Portugués and Javier Álvarez-Liébana.
References
García-Portugués, E., Álvarez-Liébana, J., Álvarez-Pérez, G. and González-Manteiga, W. (2021). A goodness-of-fit test for the functional linear model with functional response. Scandinavian Journal of Statistics, 48(2):502–528. doi:10.1111/sjos.12486
Ramsay, J. and Silverman, B. W. (2005). Functional Data Analysis. Springer-Verlag, New York.
Examples
## Quick example of functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# Lasso-selection FPCR (p and q are estimated)
flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
## Functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# FPCR (p and q are estimated)
fpcr_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr")
fpcr_1$Beta_hat_scores
fpcr_1$p_thre
fpcr_1$q_thre
# FPCR (p and q are provided)
fpcr_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
p = c(1, 5, 2, 7), q = 2:1)
fpcr_2$Beta_hat_scores
fpcr_2$p_thre
fpcr_2$q_thre
# Ridge FPCR (p and q are estimated)
l2_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2")
l2_1$Beta_hat_scores
l2_1$p_hat
# Ridge FPCR (p and q are provided)
l2_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
p = c(1, 5, 2, 7), q = 2:1)
l2_2$Beta_hat_scores
l2_2$p_hat
# Lasso FPCR (p and q are estimated)
l1_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1")
l1_1$Beta_hat_scores
l1_1$p_thre
l1_1$p_hat
# Lasso FPCR (p and q are provided)
l1_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
p = c(1, 5, 2, 7), q = 2:1)
l1_2$Beta_hat_scores
l1_2$p_thre
l1_2$p_hat
# Lasso-selection FPCR (p and q are estimated)
l1s_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
l1s_1$Beta_hat_scores
l1s_1$p_thre
l1s_1$p_hat
# Lasso-selection FPCR (p and q are provided)
l1s_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
p = c(1, 5, 2, 7), q = 1:4)
l1s_2$Beta_hat_scores
l1s_2$p_thre
l1s_2$p_hat
## Scalar response
# Generate data
set.seed(12345)
n <- 50
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
X_fdata <- fdata_cen(r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2))
epsilon <- rnorm(n, sd = 0.25)
Y <- drop(inprod_fdata(X_fdata1 = X_fdata, X_fdata2 = beta)) + epsilon
# FPCR
fpcr_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr")
fpcr_4$p_hat
# Ridge FPCR
l2_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l2")
l2_4$p_hat
# Lasso FPCR
l1_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1")
l1_4$p_hat
# Lasso-selection FPCR
l1s_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1s")
l1s_4$p_hat
## Scalar predictor
# Generate data
set.seed(12345)
n <- 50
X <- rnorm(n)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
byrow = TRUE)
Y_fdata <- beta * X + epsilon
# FPCR
fpcr_5 <- flm_est(X = X, Y = Y_fdata, est_method = "fpcr")
plot(beta, col = 2)
lines(beta$argvals, drop(fpcr_5$Beta_hat))