flm_test {goffda}    R Documentation

Goodness-of-fit test for functional linear models


Description

Goodness-of-fit test of a functional linear model with functional response $Y \in L^2([c, d])$ and functional predictor $X \in L^2([a, b])$, where $L^2([a, b])$ is the Hilbert space of square-integrable functions in $[a, b]$.

The goodness-of-fit test checks the linearity of the regression model $m: L^2([a, b]) \rightarrow L^2([c, d])$ that relates $Y$ and $X$ by

$$Y(t) = m(X) + \varepsilon(t),$$

where $\varepsilon$ is a random variable in $L^2([c, d])$ and $t \in [c, d]$. The check is formalized as the test of the composite hypothesis

$$H_0: m \in \{m_\beta : \beta \in L^2([a, b]) \otimes L^2([c, d])\},$$

where

$$m_\beta(X(s))(t) = \int_a^b \beta(s, t) X(s)\,\mathrm{d}s$$

is the linear, Hilbert–Schmidt, integral operator parametrized by the bivariate kernel $\beta$. Its estimation is done by the truncated expansion of $\beta$ in the tensor product of the data-driven bases of Functional Principal Components (FPC) of $X$ and $Y$. The FPC basis for $X$ is truncated in $p$ components, while the FPC basis for $Y$ is truncated in $q$ components.

The particular cases in which either $X$ or $Y$ is a constant function give a scalar predictor or a scalar response, respectively. The simple linear model arises if both $X$ and $Y$ are scalar, for which $\beta$ is a constant.
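
As a rough illustration of the integral operator in the description, the following base-R sketch evaluates $m_\beta(X)(t)$ on a grid using trapezoidal quadrature weights. The grids, the kernel $\beta(s, t) = s t$, the toy curves, and the helper `trapezoid_weights` are assumptions made for this example; they are not part of goffda, and the package's internal computations may differ.

```r
# Hypothetical helper (not from goffda): trapezoidal quadrature weights on grid s
trapezoid_weights <- function(s) {
  h <- diff(s)
  (c(h, 0) + c(0, h)) / 2
}

s <- seq(0, 1, length.out = 101)  # grid on [a, b] = [0, 1]
t <- seq(0, 1, length.out = 51)   # grid on [c, d] = [0, 1]

# Assumed kernel beta(s, t) = s * t, stored as a length(s) x length(t) matrix
beta <- outer(s, t)

# Two toy predictor curves stored by rows: X_1(s) = s and X_2(s) = s^2
X <- rbind(s, s^2)

# Discretized operator: sum_j w_j * X(s_j) * beta(s_j, t)
w <- trapezoid_weights(s)
m_beta_X <- X %*% (w * beta)

# Exact values are t / 3 and t / 4; the discretization error is small
max(abs(m_beta_X[1, ] - t / 3))
max(abs(m_beta_X[2, ] - t / 4))
```

Storing the curves by rows and the kernel as a matrix indexed by `(s, t)` matches the `c(length(X$argvals), length(Y$argvals))` layout required for `beta0`.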


Usage

flm_test(X, Y, beta0 = NULL, B = 500, est_method = "fpcr", p = NULL,
  q = NULL, thre_p = 0.99, thre_q = 0.99, lambda = NULL,
  boot_scores = TRUE, verbose = TRUE, plot_dens = TRUE,
  plot_proc = TRUE, plot_max_procs = 100, plot_max_p = 2,
  plot_max_q = 2, save_fit_flm = TRUE, save_boot_stats = TRUE,
  int_rule = "trapezoid", refit_lambda = FALSE, ...)


Arguments

X, Y

samples of functional/scalar predictors and functional/scalar response. Either fdata objects (for functional variables) or vectors of length n (for scalar variables).


beta0

if provided (defaults to NULL), the simple null hypothesis $H_0: m = m_{\beta_0}$ is tested. beta0 must be a matrix of size c(length(X$argvals), length(Y$argvals)). If X or Y is scalar, beta0 can also be an fdata object with the same argvals as X or Y. It can also be a constant (understood as shorthand for a matrix with all its entries equal to the constant).


B

number of bootstrap replicates. Defaults to 500.


est_method

either "fpcr" (Functional Principal Components Regression; FPCR), "fpcr_l2" (FPCR with ridge penalty), "fpcr_l1" (FPCR with lasso penalty) or "fpcr_l1s" (FPCR with lasso-selected FPC). If X is scalar, flm_est only considers "fpcr" as estimation method. See details below. Defaults to "fpcr", as shown in the usage.

p, q

index vectors indicating the specific FPC to be considered for the truncated bases expansions of X and Y, respectively. If a single number for p is provided, then p <- 1:max(p) internally (analogously for q) and the first max(p) FPC are considered. If NULL (default), then a data-driven selection of p and q is done. See details below.

thre_p, thre_q

thresholds for the proportion of variance that is explained, at least, by the first $p$ and $q$ FPC of X and Y, respectively. These thresholds are employed for an (initial) automatic selection of $p$ and $q$. Default to 0.99. thre_p (thre_q) is ignored if p (q) is provided.


lambda

regularization parameter $\lambda$ for the estimation methods "fpcr_l2", "fpcr_l1", and "fpcr_l1s". If NULL (default), it is chosen with cv_glmnet.


boot_scores

flag to indicate if the bootstrap shall be applied to the scores of the residuals, rather than to the functional residuals. This notably reduces the computation time. Defaults to TRUE.


verbose

flag to show information about the testing progress. Defaults to TRUE.


plot_dens

flag to indicate if a kernel density estimation of the bootstrap statistics shall be plotted. Defaults to TRUE.


plot_proc

whether to display a graphical tool to identify the degree of departure from the null hypothesis. If TRUE (default), the residual marked empirical process, projected in several FPC directions of X and Y, is shown, together with bootstrap analogues. The FPC directions are the ones selected at the estimation stage.


plot_max_procs

maximum number of bootstrapped processes to plot in the graphical tool. Set as the minimum of plot_max_procs and B. Defaults to 100.

plot_max_p, plot_max_q

maximum number of FPC directions to be considered in the graphical tool. They limit the resulting plot to be at most of size c(plot_max_p, plot_max_q). Default to 2.

save_fit_flm, save_boot_stats

flags indicating whether to return fit_flm and boot_*. If FALSE, these memory-expensive objects are set to NA. Default to TRUE.


int_rule

quadrature rule for approximating the definite unidimensional integral: trapezoidal rule (int_rule = "trapezoid") and extended Simpson rule (int_rule = "Simpson") are available. Defaults to "trapezoid".


refit_lambda

flag to reselect $\lambda$ in each bootstrap replicate, incorporating its variability in the bootstrap calibration. Much more time consuming. Defaults to FALSE.


...

further parameters to be passed to cv_glmnet (and then to cv.glmnet) such as cv_1se, cv_nlambda or cv_parallel, among others.
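
To make the role of thre_p concrete, the sketch below picks the smallest number of principal components whose cumulative proportion of explained variance reaches the threshold, and then builds the index vector as p <- 1:max(p). It works on a toy matrix of discretized curves with base R's prcomp, which ignores quadrature weights; both the toy data and the selection rule as coded here are assumptions for illustration, not the package's actual implementation.

```r
set.seed(1)
n <- 50
s <- seq(0, 1, length.out = 101)

# Toy discretized curves (one per row): three smooth components plus small noise
X <- outer(rnorm(n, sd = 3), sin(2 * pi * s)) +
  outer(rnorm(n, sd = 2), cos(2 * pi * s)) +
  outer(rnorm(n), s) +
  matrix(rnorm(n * length(s), sd = 0.05), nrow = n)

# Principal components of the discretized curves
pca <- prcomp(X, center = TRUE)

# Cumulative proportion of variance explained by the first k components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components explaining at least thre_p of the variance
thre_p <- 0.99
p_max <- which(cum_var >= thre_p)[1]

# Index vector of the retained components
p <- 1:p_max
```

The same rule applied to the response curves with thre_q would give q.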


Details

The function implements the bootstrap-based goodness-of-fit test for the functional linear model with functional/scalar response and functional/scalar predictor, as described in Algorithm 1 in García-Portugués et al. (2021). The specifics are detailed there.

By default cv_1se = TRUE for cv_glmnet is considered, unless it is changed via .... This is the recommended choice for conducting the goodness-of-fit test based on regularized estimators, as the oversmoothed estimate of the regression model under the null hypothesis notably facilitates the calibration of the test (see García-Portugués et al., 2021).

The graphical tool obtained with plot_proc = TRUE is based on an extension of the tool described in García-Portugués et al. (2014).

Repeated observations on X are internally removed, as otherwise they would cause NaNs in Adot. Missing values on X and Y are also automatically removed.
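
The removal of repeated and missing observations can be pictured with a small base-R sketch on a matrix of discretized predictors and a scalar response. The toy data and the variable names are assumptions for illustration; flm_test performs its own cleaning internally, possibly in a different way.

```r
# Toy discretized predictors (one observation per row) and scalar responses
X <- rbind(c(1, 2, 3),
           c(1, 2, 3),   # repeated observation of the first row
           c(4, NA, 6),  # missing value in X
           c(7, 8, 9))
Y <- c(0.5, 0.7, 1.1, NA)  # missing value in the last response

# Keep rows that are not repeated in X and are complete in both X and Y
keep <- !duplicated(X) & complete.cases(X) & !is.na(Y)
X_clean <- X[keep, , drop = FALSE]
Y_clean <- Y[keep]
```

In this toy sample only the first observation survives, since the second repeats it and the last two contain missing values.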


Value

An object of the htest class with the following elements:


statistic

test statistic.

p.value

$p$-value of the test.

boot_statistics

the bootstrapped test statistics, a vector of length B.

method

information on the type of test performed.

parameter

a vector with the dimensions $p$ and $q$ considered in the test statistic. These are the lengths of the outputs p and q.

p

the index of the FPC considered for X.

q

the index of the FPC considered for Y.

fit_flm

the output resulted from calling flm_est.

boot_lambda

bootstrapped $\lambda$.

boot_p

a list with the bootstrapped indexes of the FPC considered for X.

data.name

name of the value of data.
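
As a hedged illustration of how a test statistic and its bootstrapped analogues relate to a $p$-value, the sketch below uses one common convention: the proportion of bootstrapped statistics that exceed the observed one. The numbers are simulated stand-ins, and the convention itself is an assumption; goffda's exact computation is the one specified in García-Portugués et al. (2021).

```r
set.seed(42)
B <- 500

# Stand-ins for the bootstrapped statistics and the observed statistic
boot_stats <- rchisq(B, df = 3)
stat <- 9.5

# Bootstrap p-value: share of bootstrapped statistics above the observed one
p_value <- mean(boot_stats > stat)
p_value
```

A small value indicates that the observed statistic is extreme relative to its bootstrap distribution under the null hypothesis.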


Author(s)

Eduardo García-Portugués.


References

García-Portugués, E., Álvarez-Liébana, J., Álvarez-Pérez, G. and González-Manteiga, W. (2021). A goodness-of-fit test for the functional linear model with functional response. Scandinavian Journal of Statistics, 48(2):502–528. doi:10.1111/sjos.12486

García-Portugués, E., González-Manteiga, W. and Febrero-Bande, M. (2014). A goodness-of-fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3):761–778. doi:10.1080/10618600.2013.812519


Examples

## Quick example for functional response and predictor

# Generate data under H0
n <- 100
X_fdata <- r_ou(n = n, t = seq(0, 1, l = 101), sigma = 2)
epsilon <- r_ou(n = n, t = seq(0, 1, l = 101), sigma = 0.5)
Y_fdata <- epsilon

# Test the FLMFR
flm_test(X = X_fdata, Y = Y_fdata)

# Simple hypothesis
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0)

# Generate data under H1
n <- 100
sample_frm_fr <- r_frm_fr(n = n, scenario = 3, s = seq(0, 1, l = 101),
                          t = seq(0, 1, l = 101), nonlinear = "quadratic")
X_fdata <- sample_frm_fr[["X_fdata"]]
Y_fdata <- sample_frm_fr[["Y_fdata"]]

# Test the FLMFR
flm_test(X = X_fdata, Y = Y_fdata)

## Functional response and predictor

# Generate data under H0
n <- 50
B <- 100
t <- seq(0, 1, l = 201)
X_fdata <- r_ou(n = n, t = t, sigma = 2)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- epsilon

# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s", B = B)

# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
         boot_scores = FALSE, B = B)

# Simple hypothesis
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 2, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, beta0 = 0, est_method = "fpcr_l1s", B = B)

# Generate data under H1
n <- 50
B <- 100
sample_frm_fr <- r_frm_fr(n = n, scenario = 3, s = t, t = t,
                          nonlinear = "quadratic")
X_fdata <- sample_frm_fr$X_fdata
Y_fdata <- sample_frm_fr$Y_fdata

# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s", B = B)

# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l2",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y_fdata, est_method = "fpcr_l1s",
         boot_scores = FALSE, B = B)

## Scalar response and functional predictor

# Generate data under H0
n <- 50
B <- 100
t <- seq(0, 1, l = 201)
X_fdata <- r_ou(n = n, t = t, sigma = 2)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 2)
epsilon <- rnorm(n = n)
Y <- drop(inprod_fdata(X_fdata1 = X_fdata, X_fdata2 = beta) + epsilon)

# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s", B = B)

# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s",
         boot_scores = FALSE, B = B)

# Simple hypothesis
flm_test(X = X_fdata, Y = Y, beta0 = beta, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, beta0 = 0, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, beta0 = 0, est_method = "fpcr_l1s", B = B)

# Generate data under H1
n <- 50
B <- 100
X_fdata <- r_ou(n = n, t = t, sigma = 2)
beta <- r_ou(n = 1, t = t, sigma = 0.5)
epsilon <- rnorm(n = n)
Y <- drop(exp(inprod_fdata(X_fdata1 = X_fdata^2, X_fdata2 = beta)) + epsilon)

# With boot_scores = TRUE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2", B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s", B = B)

# With boot_scores = FALSE
flm_test(X = X_fdata, Y = Y, est_method = "fpcr",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l2",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1",
         boot_scores = FALSE, B = B)
flm_test(X = X_fdata, Y = Y, est_method = "fpcr_l1s",
         boot_scores = FALSE, B = B)

## Functional response and scalar predictor

# Generate data under H0
n <- 50
B <- 100
X <- rnorm(n)
t <- seq(0, 1, l = 201)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
                    byrow = TRUE)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- X * beta + epsilon

# With boot_scores = TRUE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", B = B)

# With boot_scores = FALSE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", boot_scores = FALSE, B = B)

# Simple hypothesis
flm_test(X = X, Y = Y_fdata, beta0 = beta[1], est_method = "fpcr", B = B)
flm_test(X = X, Y = Y_fdata, beta0 = 0, est_method = "fpcr", B = B)

# Generate data under H1
n <- 50
B <- 100
X <- rexp(n)
beta <- r_ou(n = 1, t = t, sigma = 0.5, x0 = 3)
beta$data <- matrix(beta$data, nrow = n, ncol = ncol(beta$data),
                    byrow = TRUE)
epsilon <- r_ou(n = n, t = t, sigma = 0.5)
Y_fdata <- log(X * beta) + epsilon

# With boot_scores = TRUE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", B = B)

# With boot_scores = FALSE
flm_test(X = X, Y = Y_fdata, est_method = "fpcr", boot_scores = FALSE, B = B)

[Package goffda version 0.1.2 Index]