flm_test {goffda} | R Documentation |
Goodness-of-fit test for functional linear models
Description
Goodness-of-fit test of a functional linear model with
functional response Y \in L^2([c, d])
and functional predictor
X \in L^2([a, b])
, where L^2([a, b])
is the Hilbert space of
square-integrable functions in [a, b]
.
The goodness-of-fit test checks the linearity of the regression model
m:L^2([a, b])\rightarrow L^2([c, d])
that relates Y
and X
by
Y(t) = m(X) + \varepsilon(t),
where \varepsilon
is a random variable in
L^2([c, d])
and t \in [c, d]
. The check is formalized as the
test of the composite hypothesis
H_0: m \in \{m_\beta : \beta \in L^2([a, b]) \otimes L^2([c, d])\},
where
m_\beta(X(s))(t) = \int_a^b \beta(s, t) X(s)\,\mathrm{d}s
is the linear, Hilbert–Schmidt, integral operator parametrized by
the bivariate kernel \beta
. Its estimation is done by the
truncated expansion of \beta
in the tensor product of the
data-driven bases of Functional Principal Components (FPC) of
X
and Y
. The FPC basis for X
is truncated in p
components, while the FPC basis for Y
is truncated in q
components.
The particular cases in which either X
or Y
are
constant functions give either a scalar predictor or response.
The simple linear model arises if both X
and Y
are scalar,
for which \beta
is a constant.
Usage
flm_test(X, Y, beta0 = NULL, B = 500, est_method = "fpcr", p = NULL,
q = NULL, thre_p = 0.99, thre_q = 0.99, lambda = NULL,
boot_scores = TRUE, verbose = TRUE, plot_dens = TRUE,
plot_proc = TRUE, plot_max_procs = 100, plot_max_p = 2,
plot_max_q = 2, save_fit_flm = TRUE, save_boot_stats = TRUE,
int_rule = "trapezoid", refit_lambda = FALSE, ...)
Arguments
X , Y |
samples of functional/scalar predictors and functional/scalar
response. Either |
beta0 |
if provided (defaults to |
B |
number of bootstrap replicates. Defaults to |
est_method |
either |
p , q |
either index vectors indicating the specific FPC to be
considered for the truncated bases expansions of |
thre_p , thre_q |
thresholds for the proportion of variance
that is explained, at least, by the first |
lambda |
regularization parameter |
boot_scores |
flag to indicate if the bootstrap shall be applied to the
scores of the residuals, rather than to the functional residuals. This
improves the computational expediency notably. Defaults to |
verbose |
flag to show information about the testing progress. Defaults
to |
plot_dens |
flag to indicate if a kernel density estimation of the
bootstrap statistics shall be plotted. Defaults to |
plot_proc |
whether to display a graphical tool to identify the
degree of departure from the null hypothesis. If |
plot_max_procs |
maximum number of bootstrapped processes to plot in
the graphical tool. Set as the minimum of |
plot_max_p , plot_max_q |
maximum number of FPC directions to be
considered in the graphical tool. They limit the resulting plot to be at
most of size |
save_fit_flm , save_boot_stats |
flag to return |
int_rule |
quadrature rule for approximating the definite
unidimensional integral: trapezoidal rule ( |
refit_lambda |
flag to reselect |
... |
further parameters to be passed to |
Details
The function implements the bootstrap-based goodness-of-fit test for the functional linear model with functional/scalar response and functional/scalar predictor, as described in Algorithm 1 in García-Portugués et al. (2021). The specifics are detailed there.
By default cv_1se = TRUE
for cv_glmnet
is
considered, unless it is changed via ...
. This is the recommended
choice for conducting the goodness-of-fit test based on regularized
estimators, as the oversmoothed estimate of the regression model under the
null hypothesis notably facilitates the calibration of the test (see
García-Portugués et al., 2021).
The graphical tool obtained with plot_proc = TRUE
is based on
an extension of the tool described in García-Portugués et al. (2014).
Repeated observations on X
are internally removed, as otherwise they
would cause NaN
s in Adot
. Missing values on X
and
Y
are also automatically removed.
Value
An object of the htest
class with the following elements:
statistic |
test statistic. |
p.value |
|
boot_statistics |
the bootstrapped test statistics, a vector
of length |
method |
information on the type of test performed. |
parameter |
a vector with the dimensions |
p |
the index of the FPC considered for |
q |
the index of the FPC considered for |
fit_flm |
the output resulted from calling |
boot_lambda |
bootstrapped |
boot_p |
a list with the bootstrapped indexes of the FPC considered
for |
data.name |
name of the value of |
Author(s)
Eduardo García-Portugués.
References
García-Portugués, E., Álvarez-Liébana, J., Álvarez-Pérez, G. and Gonzalez-Manteiga, W. (2021). A goodness-of-fit test for the functional linear model with functional response. Scandinavian Journal of Statistics, 48(2):502–528. doi:10.1111/sjos.12486
García-Portugués, E., González-Manteiga, W. and Febrero-Bande, M. (2014). A goodness-of-fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3):761–778. doi:10.1080/10618600.2013.812519
Examples
## Quick example for functional response and predictor
# Generate data under H0
n <- 100
set.seed(987654321)
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 101), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 101), sigma = 0.5)
Y_fdata <- epsilon
# Test the FLMFR
flm_test(X = X_fdata, Y = Y_fdata)
# Simple hypothesis
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0)
# Generate data under H1
n <- 100
set.seed(987654321)
sample_frm_fr <- r_frm_fr(n = n, scenario = 3, s = seq(0, 1, l = 101),
t = seq(0, 1, l = 101), nonlinear = "quadratic")
X_fdata <- sample_frm_fr[["X_fdata"]]
Y_fdata <- sample_frm_fr[["Y_fdata"]]
# Test the FLMFR
flm_test(X = X_fdata, Y = Y_fdata)
## Functional response and predictor
# Generate data under H0
n <- 50
B <- 100
set.seed(987654321)
t <- seq(0, 1, l = 201)
X_fdata <- r_ou(n = n, t = t, sigma = 2)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- epsilon
# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s", B = B)
# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
boot_scores = FALSE, B = B)
# Simple hypothesis
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 2, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0, est_method = "fpcr_l1s", B = B)
# Generate data under H1
n <- 50
B <- 100
set.seed(987654321)
sample_frm_fr <- r_frm_fr(n = n, scenario = 3, s = t, t = t,
nonlinear = "quadratic")
X_fdata <- sample_frm_fr$X_fdata
Y_fdata <- sample_frm_fr$Y_fdata
# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s", B = B)
# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
boot_scores = FALSE, B = B)
## Scalar response and functional predictor
# Generate data under H0
n <- 50
B <- 100
set.seed(987654321)
t <- seq(0, 1, l = 201)
X_fdata <- r_ou(n = n, t = t, sigma = 2)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 2)
epsilon <- rnorm(n = n)
Y <- drop(inprod_fdata(X_fdata1 = X_fdata, X_fdata2 = beta) + epsilon)
# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s", B = B)
# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s",
boot_scores = FALSE, B = B)
# Simple hypothesis
flm_test(X = X_fdata, Y = Y, beta0 = beta, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, beta0 = 0, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, beta0 = 0, est_method = "fpcr_l1s", B = B)
# Generate data under H1
n <- 50
B <- 100
set.seed(987654321)
X_fdata <- r_ou(n = n, t = t, sigma = 2)
beta <- r_ou(n = 1, t = t, sigma = 0.5)
epsilon <- rnorm(n = n)
Y <- drop(exp(inprod_fdata(X_fdata1 = X_fdata^2, X_fdata2 = beta)) + epsilon)
# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s", B = B)
# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1",
boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s",
boot_scores = FALSE, B = B)
## Functional response and scalar predictor
# Generate data under H0
n <- 50
B <- 100
set.seed(987654321)
X <- rnorm(n)
t <- seq(0, 1, l = 201)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
byrow = TRUE)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- X * beta + epsilon
# With boot_scores = TRUE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", B = B)
# With boot_scores = FALSE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", boot_scores = FALSE, B = B)
# Simple hypothesis
flm_test(X = X, Y = Y_fdata, beta0 = beta[1], est_method = "fpcr", B = B)
flm_test(X = X, Y = Y_fdata, beta0 = 0, est_method = "fpcr", B = B)
# Generate data under H1
n <- 50
B <- 100
set.seed(987654321)
X <- rexp(n)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
byrow = TRUE)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- log(X * beta) + epsilon
# With boot_scores = TRUE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", B = B)
# With boot_scores = FALSE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", boot_scores = FALSE, B = B)