flm_est {goffda}    R Documentation
Estimation of functional linear models
Description
Estimation of the linear operator relating a functional predictor X with a functional response Y in the linear model

  Y(t) = ∫_a^b X(s) β(s, t) ds + ε(t),

where X is a random variable in the Hilbert space of square-integrable functions on [a, b], L^2([a, b]); Y and ε are random variables in L^2([c, d]); and s ∈ [a, b], t ∈ [c, d].

The linear, Hilbert–Schmidt, integral operator is parametrized by the bivariate kernel β ∈ L^2([a, b]) ⊗ L^2([c, d]). Its estimation is done through the truncated expansion of β in the tensor product of the data-driven bases of the Functional Principal Components (FPC) of X and Y, and through the fitting of the resulting multivariate linear model. The FPC basis for X is truncated in p components, while the FPC basis for Y is truncated in q components. Automatic selection of p and q is detailed below.

The particular cases in which either X or Y is a constant function give either a scalar predictor or a scalar response. The simple linear model arises when both X and Y are scalar, in which case β is a constant.
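Schematically, and with illustrative notation only (ψ_j and φ_k stand for the FPC of X and Y, respectively, and b_jk for the coefficients in the tensor basis; these symbols are not objects of the package), the truncated expansion underlying the estimator reads

  β̂(s, t) = Σ_{j = 1}^{p} Σ_{k = 1}^{q} b̂_{jk} ψ_j(s) φ_k(t).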
Usage
flm_est(X, Y, est_method = "fpcr_l1s", p = NULL, q = NULL,
thre_p = 0.99, thre_q = 0.99, lambda = NULL, X_fpc = NULL,
Y_fpc = NULL, compute_residuals = TRUE, centered = FALSE,
int_rule = "trapezoid", cv_verbose = FALSE, ...)
Arguments
X, Y: samples of functional/scalar predictors and functional/scalar responses. Either fdata objects (for functional variables) or vectors (for scalar variables).

est_method: either "fpcr" (Functional Principal Components Regression), "fpcr_l2" (FPCR with ridge penalty), "fpcr_l1" (FPCR with lasso penalty), or "fpcr_l1s" (FPCR with lasso-selected FPC). See details below. Defaults to "fpcr_l1s".

p, q: index vectors indicating the specific FPC to be considered for the truncated bases expansions of X and Y, respectively. If NULL (default), they are selected automatically; see details below.

thre_p, thre_q: thresholds for the proportion of variance that is explained, at least, by the first p and q FPC of X and Y, respectively. Used for the automatic selection of p and q. Default to 0.99. Ignored if p or q are provided.

lambda: regularization parameter for the penalized estimation methods. If NULL (default), it is selected by cross-validation via cv_glmnet.

X_fpc, Y_fpc: FPC decompositions of X and Y. Computed if not provided.

compute_residuals: whether to compute the fitted values Y_hat and their scores Y_hat_scores, as well as the residuals of the fit and their scores residuals_scores. Defaults to TRUE.

centered: flag to indicate if X and Y have already been centered, in order to avoid their recentering. Defaults to FALSE.

int_rule: quadrature rule for approximating the definite unidimensional integral: trapezoidal rule (int_rule = "trapezoid") or extended Simpson rule (int_rule = "Simpson"). Defaults to "trapezoid".

cv_verbose: flag to display information about the estimation procedure (passed to cv_glmnet). Defaults to FALSE.

...: further parameters to be passed to cv_glmnet.
Details
flm_est deals seamlessly with either functional or scalar inputs for the predictor and the response. In the case of scalar inputs, the corresponding dimension-related arguments (p, q, thre_p, or thre_q) will be ignored, since in these cases either p = 1 or q = 1.

The function translates the functional linear model into a multivariate linear model with multivariate response and then estimates the matrix of coefficients of β in the tensor basis of the FPC of X and Y. The following estimation methods are implemented:
- "fpcr": Functional Principal Components Regression (FPCR); see details in Ramsay and Silverman (2005).
- "fpcr_l2": FPCR, with ridge penalty on the associated multivariate linear model.
- "fpcr_l1": FPCR, with lasso penalty on the associated multivariate linear model.
- "fpcr_l1s": FPCR, with FPC selected by lasso regression on the associated multivariate linear model.
The last three methods are explained in García-Portugués et al. (2021).
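For intuition only, the unpenalized FPCR fit amounts to a multivariate least-squares regression of the response scores on the predictor scores. Below is a minimal sketch with made-up score matrices (x_scores and y_scores are illustrative stand-ins, not objects produced by the package):

# Illustrative only: FPCR as a multivariate linear model on FPC scores
n <- 50; p <- 3; q <- 2
x_scores <- matrix(rnorm(n * p), n, p)  # stand-in for the FPC scores of X
y_scores <- matrix(rnorm(n * q), n, q)  # stand-in for the FPC scores of Y
fit <- lm(y_scores ~ x_scores - 1)      # multivariate response, no intercept (centered scores)
B_hat <- coef(fit)                      # p x q matrix of coefficients of beta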
The FPC of X and the FPC of Y are determined as follows:

- If p = NULL, then p is set as p_thre <- 1:j_thre, where j_thre is the smallest number of FPC of X for which the cumulated proportion of explained variance is greater than thre_p. If p != NULL, then p_thre <- p.
- If q = NULL, then the same procedure is followed with thre_q, resulting in q_thre.
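For instance, the thresholding step for p can be mimicked as follows (var_prop contains made-up proportions of explained variance; the actual quantities are computed internally from the FPC of X):

# Illustrative only: thresholding rule yielding p_thre
var_prop <- c(0.70, 0.20, 0.06, 0.03, 0.01)     # made-up explained variances
thre_p <- 0.99
j_thre <- which(cumsum(var_prop) >= thre_p)[1]  # first FPC reaching the threshold
p_thre <- 1:j_thre                              # here, 1:4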
Once p_thre and q_thre have been obtained, the methods "fpcr_l1" and "fpcr_l1s" perform a second selection of the FPC that are effectively considered in the estimation of β. This subset of FPC (of p_thre) is encoded in p_hat. No further selection of FPC is done for the methods "fpcr" and "fpcr_l2".
The flag compute_residuals controls if Y_hat, Y_hat_scores, residuals, and residuals_scores are computed. If FALSE, they are set to NULL. Y_hat equals Ŷ_i(t) = ∫_a^b X_i(s) β̂(s, t) ds and residuals stands for ε̂_i = Y_i - Ŷ_i, both for i = 1, ..., n. Y_hat_scores and residuals_scores are the matrices of coefficients (or scores) of these functions in the FPC of Y.
Missing values on X
and Y
are automatically removed.
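The following is a minimal, illustrative sketch (not part of the function's interface) of how the fitted values and residuals above can be inspected; it mirrors the simulation scheme of the Examples section:

# Illustrative only: inspect fitted values and residuals of an "fpcr" fit
library(goffda)
set.seed(1)
X_fdata <- r_ou(n = 20, t = seq(0, 1, l = 101), sigma = 2)
Y_fdata <- 2 * X_fdata + r_ou(n = 20, t = seq(0, 1, l = 101), sigma = 0.5)
fit <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr")
dim(fit$Y_hat_scores)      # n rows, one column per FPC in q_thre
dim(fit$residuals_scores)  # same dimensions as Y_hat_scores
plot(fit$residuals)        # functional residuals (an fdata object)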
Value
A list with the following entries:
Beta_hat: estimated kernel β, a matrix with the evaluations of β̂(s, t) at the grid points of X and Y.

Beta_hat_scores: the matrix of coefficients of β̂ in the tensor product of the FPC bases of X and Y.

H_hat: hat matrix of the associated fitted multivariate linear model, a matrix of size c(n, n).

p_thre: index vector indicating the FPC of X considered after thresholding with thre_p (or those specified in p).

p_hat: index vector of the FPC considered by the methods "fpcr_l1" and "fpcr_l1s" after the lasso selection; for "fpcr" and "fpcr_l2" it coincides with p_thre.

q_thre: index vector indicating the FPC of Y considered after thresholding with thre_q (or those specified in q).

est_method: the estimation method employed.

Y_hat: fitted values, either an fdata object or a vector, depending on Y.

Y_hat_scores: the matrix of coefficients of Y_hat in the FPC of Y, of size c(n, length(q_thre)).

residuals: residuals of the fitted model, either an fdata object or a vector, depending on Y.

residuals_scores: the matrix of coefficients of residuals in the FPC of Y, of size c(n, length(q_thre)).

X_fpc, Y_fpc: FPC of X and Y, respectively.

lambda: regularization parameter employed by the penalized estimation methods ("fpcr_l2", "fpcr_l1", and "fpcr_l1s").

cv: cross-validation object returned by cv_glmnet.
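As a further hedged sketch (illustrative only; it reuses the simulation scheme of the Examples), the returned Beta_hat can be inspected and visualized on the grids of X and Y described above:

# Illustrative only: visualize the estimated surface beta_hat(s, t)
library(goffda)
set.seed(1)
X_fdata <- r_ou(n = 50, t = seq(0, 1, l = 101), sigma = 2)
Y_fdata <- 2 * X_fdata + r_ou(n = 50, t = seq(0, 1, l = 101), sigma = 0.5)
fit <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr")
dim(fit$Beta_hat)  # evaluations on the grids of X_fdata and Y_fdata
image(x = X_fdata$argvals, y = Y_fdata$argvals, z = fit$Beta_hat,
      xlab = "s", ylab = "t")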
Author(s)
Eduardo García-Portugués and Javier Álvarez-Liébana.
References
García-Portugués, E., Álvarez-Liébana, J., Álvarez-Pérez, G. and González-Manteiga, W. (2021). A goodness-of-fit test for the functional linear model with functional response. Scandinavian Journal of Statistics, 48(2):502–528. doi:10.1111/sjos.12486
Ramsay, J. and Silverman, B. W. (2005). Functional Data Analysis. Springer-Verlag, New York.
Examples
## Quick example of functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# Lasso-selection FPCR (p and q are estimated)
flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
## Functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# FPCR (p and q are estimated)
fpcr_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr")
fpcr_1$Beta_hat_scores
fpcr_1$p_thre
fpcr_1$q_thre
# FPCR (p and q are provided)
fpcr_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
p = c(1, 5, 2, 7), q = 2:1)
fpcr_2$Beta_hat_scores
fpcr_2$p_thre
fpcr_2$q_thre
# Ridge FPCR (p and q are estimated)
l2_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2")
l2_1$Beta_hat_scores
l2_1$p_hat
# Ridge FPCR (p and q are provided)
l2_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
p = c(1, 5, 2, 7), q = 2:1)
l2_2$Beta_hat_scores
l2_2$p_hat
# Lasso FPCR (p and q are estimated)
l1_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1")
l1_1$Beta_hat_scores
l1_1$p_thre
l1_1$p_hat
# Lasso FPCR (p and q are provided)
l1_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
p = c(1, 5, 2, 7), q = 2:1)
l1_2$Beta_hat_scores
l1_2$p_thre
l1_2$p_hat
# Lasso-selection FPCR (p and q are estimated)
l1s_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
l1s_1$Beta_hat_scores
l1s_1$p_thre
l1s_1$p_hat
# Lasso-selection FPCR (p and q are provided)
l1s_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
p = c(1, 5, 2, 7), q = 1:4)
l1s_2$Beta_hat_scores
l1s_2$p_thre
l1s_2$p_hat
## Scalar response
# Generate data
set.seed(12345)
n <- 50
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
X_fdata <- fdata_cen(r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2))
epsilon <- rnorm(n, sd = 0.25)
Y <- drop(inprod_fdata(X_fdata1 = X_fdata, X_fdata2 = beta)) + epsilon
# FPCR
fpcr_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr")
fpcr_4$p_hat
# Ridge FPCR
l2_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l2")
l2_4$p_hat
# Lasso FPCR
l1_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1")
l1_4$p_hat
# Lasso-selection FPCR
l1s_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1s")
l1s_4$p_hat
## Scalar predictor
# Generate data
set.seed(12345)
n <- 50
X <- rnorm(n)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
byrow = TRUE)
Y_fdata <- beta * X + epsilon
# FPCR
fpcr_4 <- flm_est(X = X, Y = Y_fdata, est_method = "fpcr")
plot(beta, col = 2)
lines(beta$argvals, drop(fpcr_4$Beta_hat))