flm_est {goffda} | R Documentation |
Estimation of functional linear models
Description
Estimation of the linear operator relating a
functional predictor X
with a functional response Y
in the
linear model
Y(t) = \int_a^b \beta(s, t) X(s)\,\mathrm{d}s + \varepsilon(t),
where X
is a random variable in the Hilbert space of
square-integrable functions in [a, b]
, L^2([a, b])
,
Y
and \varepsilon
are random variables
in L^2([c, d])
, and s \in [a, b]
and t \in [c, d]
.
The linear, Hilbert–Schmidt, integral operator is parametrized by
the bivariate kernel \beta \in L^2([a, b]) \otimes
L^2([c, d])
. Its estimation is done through the truncated expansion
of \beta
in the tensor product of the data-driven
bases of the Functional Principal Components (FPC) of
X
and Y
, and through the fitting of the resulting multivariate
linear model. The FPC basis for X
is truncated in p
components, while the FPC basis for Y
is truncated in q
components. Automatic selection of p
and q
is detailed below.
The particular cases in which either X
or Y
are
constant functions give either a scalar predictor or response.
The simple linear model arises if both X
and Y
are scalar,
for which \beta
is a constant.
Usage
flm_est(X, Y, est_method = "fpcr_l1s", p = NULL, q = NULL,
thre_p = 0.99, thre_q = 0.99, lambda = NULL, X_fpc = NULL,
Y_fpc = NULL, compute_residuals = TRUE, centered = FALSE,
int_rule = "trapezoid", cv_verbose = FALSE, ...)
Arguments
X , Y |
samples of functional/scalar predictors and functional/scalar
response. Either |
est_method |
either |
p , q |
index vectors indicating the specific FPC to be
considered for the truncated bases expansions of |
thre_p , thre_q |
thresholds for the proportion of variance
that is explained, at least, by the first |
lambda |
regularization parameter |
X_fpc , Y_fpc |
FPC decompositions of |
compute_residuals |
whether to compute the fitted values |
centered |
flag to indicate if |
int_rule |
quadrature rule for approximating the definite
unidimensional integral: trapezoidal rule ( |
cv_verbose |
flag to display information about the estimation procedure
(passed to |
... |
further parameters to be passed to |
Details
flm_est
deals seamlessly with either functional or scalar inputs
for the predictor and response. In the case of scalar inputs, the
corresponding dimension-related arguments (p
, q
,
thre_p
or thre_q
) will be ignored as in these cases either
p = 1
or q = 1
.
The function translates the functional linear model into a multivariate
model with multivariate response and then estimates the
p \times q
matrix of coefficients of \beta
in the
tensor basis of the FPC of X
and Y
. The following estimation
methods are implemented:
-
"fpcr"
: Functional Principal Components Regression (FPCR); see details in Ramsay and Silverman (2005). -
"fpcr_l2"
: FPCR, with ridge penalty on the associated multivariate linear model. -
"fpcr_l1"
: FPCR, with lasso penalty on the associated multivariate linear model. -
"fpcr_l1s"
: FPCR, with FPC selected by lasso regression on the associated multivariate linear model.
The last three methods are explained in García-Portugués et al. (2021).
The p
FPC of X
and q
FPC of Y
are determined
as follows:
If
p = NULL
, thenp
is set asp_thre <- 1:j_thre
, wherej_thre
is thej
-th FPC ofX
for which the cumulated proportion of explained variance is greater thanthre_p
. Ifp != NULL
, thenp_thre <- p
.If
q = NULL
, then the same procedure is followed withthre_q
, resultingq_thre
.
Once p_thre
and q_thre
have been obtained, the methods
"fpcr_l1"
and "fpcr_l1s"
perform a second selection
of the FPC that are effectively considered in the estimation of \beta
.
This subset of FPC (of p_thre
) is encoded in p_hat
. No further
selection of FPC is done for the methods "fpcr"
and "fpcr_l2"
.
The flag compute_residuals
controls if Y_hat
,
Y_hat_scores
, residuals
, and residuals_scores
are
computed. If FALSE
, they are set to NULL
. Y_hat
equals
\hat Y_i(t) = \int_a^b \hat \beta(s, t) X_i(s) \,\mathrm{d}s
and residuals
stands for \hat \varepsilon_i(t) = Y_i(t) - \hat Y_i(t)
, both for
i = 1, \ldots, n
. Y_hat_scores
and
residuals_scores
are the n\times q
matrices of coefficients (or scores) of these
functions in the FPC of Y
.
Missing values on X
and Y
are automatically removed.
Value
A list with the following entries:
Beta_hat |
estimated |
Beta_hat_scores |
the matrix of coefficients of |
H_hat |
hat matrix of the associated fitted multivariate
linear model, a matrix of size |
p_thre |
index vector indicating the FPC of |
p_hat |
index vector of the FPC considered by the methods
|
q_thre |
index vector indicating the FPC of |
est_method |
the estimation method employed. |
Y_hat |
fitted values, either an |
Y_hat_scores |
the matrix of coefficients of |
residuals |
residuals of the fitted model, either an
|
residuals_scores |
the matrix of coefficients of
|
X_fpc , Y_fpc |
FPC of |
lambda |
regularization parameter |
cv |
cross-validation object returned by
|
Author(s)
Eduardo García-Portugués and Javier Álvarez-Liébana.
References
García-Portugués, E., Álvarez-Liébana, J., Álvarez-Pérez, G. and Gonzalez-Manteiga, W. (2021). A goodness-of-fit test for the functional linear model with functional response. Scandinavian Journal of Statistics, 48(2):502–528. doi:10.1111/sjos.12486
Ramsay, J. and Silverman, B. W. (2005). Functional Data Analysis. Springer-Verlag, New York.
Examples
## Quick example of functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# Lasso-selection FPCR (p and q are estimated)
flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
## Functional response and functional predictor
# Generate data
set.seed(12345)
n <- 50
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
Y_fdata <- 2 * X_fdata + epsilon
# FPCR (p and q are estimated)
fpcr_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr")
fpcr_1$Beta_hat_scores
fpcr_1$p_thre
fpcr_1$q_thre
# FPCR (p and q are provided)
fpcr_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
p = c(1, 5, 2, 7), q = 2:1)
fpcr_2$Beta_hat_scores
fpcr_2$p_thre
fpcr_2$q_thre
# Ridge FPCR (p and q are estimated)
l2_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2")
l2_1$Beta_hat_scores
l2_1$p_hat
# Ridge FPCR (p and q are provided)
l2_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
p = c(1, 5, 2, 7), q = 2:1)
l2_2$Beta_hat_scores
l2_2$p_hat
# Lasso FPCR (p and q are estimated)
l1_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1")
l1_1$Beta_hat_scores
l1_1$p_thre
l1_1$p_hat
# Lasso estimator (p and q are provided)
l1_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
p = c(1, 5, 2, 7), q = 2:1)
l1_2$Beta_hat_scores
l1_2$p_thre
l1_2$p_hat
# Lasso-selection FPCR (p and q are estimated)
l1s_1 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s")
l1s_1$Beta_hat_scores
l1s_1$p_thre
l1s_1$p_hat
# Lasso-selection FPCR (p and q are provided)
l1s_2 <- flm_est(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
p = c(1, 5, 2, 7), q = 1:4)
l1s_2$Beta_hat_scores
l1s_2$p_thre
l1s_2$p_hat
## Scalar response
# Generate data
set.seed(12345)
n <- 50
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
X_fdata <- fdata_cen(r_ou(n = n, t = seq(0, 1, l = 201), sigma = 2))
epsilon <- rnorm(n, sd = 0.25)
Y <- drop(inprod_fdata(X_fdata1 = X_fdata, X_fdata2 = beta)) + epsilon
# FPCR
fpcr_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr")
fpcr_4$p_hat
# Ridge FPCR
l2_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l2")
l2_4$p_hat
# Lasso FPCR
l1_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1")
l1_4$p_hat
# Lasso-selection FPCR
l1s_4 <- flm_est(X = X_fdata, Y = Y, est_method = "fpcr_l1s")
l1s_4$p_hat
## Scalar predictor
# Generate data
set.seed(12345)
n <- 50
X <- rnorm(n)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 201), sigma = 0.5)
beta <- r_ou(n = 1, t = seq(0, 1, l = 201), sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
byrow = TRUE)
Y_fdata <- beta * X + epsilon
# FPCR
fpcr_4 <- flm_est(X = X, Y = Y_fdata, est_method = "fpcr")
plot(beta, col = 2)
lines(beta$argvals, drop(fpcr_4$Beta_hat))