
rob.sf.reg {robflreg}

R Documentation

Robust scalar-on-function regression

Description

This function fits the scalar-on-function regression model, using either classical or robust estimation:

Y = \sum_{m=1}^M \int X_m(s) \beta_m(s) ds + X.scl \gamma + \epsilon,

where Y denotes the scalar response, X_m(s) denotes the m-th functional predictor, \beta_m(s) denotes the m-th regression coefficient function, X.scl denotes the matrix of scalar predictors, \gamma denotes the vector of coefficients for the scalar predictors, and \epsilon is the error term, which is assumed to follow a standard normal distribution.
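
As a rough illustration of how one term of this model can be evaluated when a curve is only observed on a grid, the sketch below approximates \int X(s) \beta(s) ds with the trapezoidal rule and adds a scalar-predictor part. The curve, the coefficient function, the scalar values, and the helper `trapz` are all assumptions made up for this example; this is not necessarily how the package evaluates the integral.

```r
# Grid on [0, 1] and illustrative (made-up) curve and coefficient function
s <- seq(0, 1, length.out = 101)
x <- sin(2 * pi * s)       # one observed curve X(s)
beta <- sin(2 * pi * s)    # coefficient function beta(s)

# Hypothetical helper: trapezoidal rule for the integral of f over the grid s
trapz <- function(s, f) {
  k <- length(f)
  sum(diff(s) * (f[-k] + f[-1]) / 2)
}

# Functional part: int X(s) beta(s) ds, whose exact value here is 0.5
fun.part <- trapz(s, x * beta)

# Scalar part: X.scl %*% gamma for one observation (made-up values)
x.scl <- c(1.5, -0.5)
gamma <- c(2, 1)
scl.part <- sum(x.scl * gamma)

# Noise-free linear predictor for this observation (approximately 3)
lin.pred <- fun.part + scl.part
```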

Usage

rob.sf.reg(Y, X, X.scl = NULL, emodel = c("classical", "robust"),
           fmodel = c("LTS", "MM", "S", "tau"), nbasis = NULL, gp = NULL, ncomp = NULL)

Arguments

`Y`	An `n \times 1`-dimensional matrix containing the observations of the scalar response `Y`, where `n` denotes the sample size.
`X`	A list consisting of `M` functional predictors `X_m(s), 1\le m\le M`. Each element of `X` is an `n \times p_m`-dimensional matrix containing the observations of the `m`-th functional predictor `X_m(s)`, where `n` is the sample size and `p_m` denotes the number of grid points for `X_m(s)`.
`X.scl`	An `n \times R`-dimensional matrix consisting of scalar predictors `X_r, 1\le r\le R`.
`emodel`	Method to be used for functional principal component decomposition. Possibilities are "classical" and "robust".
`fmodel`	Fitting model used to estimate the scalar-on-function regression model. Possibilities are "LTS", "MM", "S", and "tau".
`nbasis`	A vector of length `M`. Its `m`-th element is the number of B-spline basis functions used to approximate the functional principal components of the `m`-th functional predictor `X_m(s)`. If `NULL`, `min(20, p_m/4)` B-spline basis functions are used for each functional predictor, where `p_m` denotes the number of grid points for `X_m(s)`.
`gp`	A list of length `M`. The `m`-th element of `gp` is a vector containing the grid points of the `m`-th functional predictor `X_m(s)`. If `NULL`, `p_m` equally spaced points in the interval [0, 1] are used for the `m`-th functional predictor.
`ncomp`	A vector of length `M`. Its `m`-th element is the number of functional principal components to be computed for the `m`-th functional predictor `X_m(s)`. If `NULL`, the number of components needed to explain at least 95% of the variation is used for each functional predictor.
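
The argument shapes described above can be sketched in base R as follows. The sizes (`n = 50`, two functional predictors with 101 and 51 grid points, two scalar predictors) and the simulated values are assumptions chosen purely for illustration. The final call is left commented out because it requires the robflreg package to be installed.

```r
set.seed(10)
n <- 50              # sample size
p <- c(101, 51)      # p_1 and p_2: grid sizes for M = 2 functional predictors

# gp: a list with one vector of grid points per functional predictor
gp <- lapply(p, function(pm) seq(0, 1, length.out = pm))

# X: a list of n x p_m matrices; row i holds curve i evaluated on gp[[m]]
X <- lapply(gp, function(s) {
  t(replicate(n, rnorm(1) * sin(2 * pi * s) + rnorm(1) * cos(2 * pi * s)))
})

# X.scl: an n x R matrix of scalar predictors (here R = 2)
X.scl <- matrix(rnorm(n * 2), nrow = n, ncol = 2)

# Y: an n x 1 matrix holding the scalar response (placeholder values)
Y <- matrix(rnorm(n), nrow = n, ncol = 1)

# Basic shape checks before passing the objects on
stopifnot(is.matrix(Y), nrow(Y) == n, ncol(Y) == 1)
stopifnot(all(sapply(X, nrow) == n), all(sapply(X, ncol) == p))
stopifnot(nrow(X.scl) == n, length(gp) == length(X))

# With robflreg installed, the objects could then be passed as:
# fit <- rob.sf.reg(Y, X, X.scl = X.scl, emodel = "robust", fmodel = "MM", gp = gp)
```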

Details

To fit a scalar-on-function regression model based on functional principal component analysis, the functional predictors X_m(s), 1\le m\le M are first decomposed by the functional principal component analysis method:

X_m(s) = \bar{X}_m(s) + \sum_{l=1}^{K_m} \xi_{ml} \psi_{ml}(s),

where \bar{X}_m(s) is the mean function, \psi_{ml}(s) is the l-th weight function, and \xi_{ml} = \int (X_m(s) - \bar{X}_m(s)) \psi_{ml}(s) ds is the corresponding principal component score for the m-th functional predictor. Assume that the m-th regression coefficient function admits the expansion

\beta_m(s) = \sum_{l=1}^{K_m} b_{ml} \psi_{ml}(s),

where b_{ml} = \int \beta_m(s) \psi_{ml}(s) ds. Then, the following multiple regression model is obtained for the scalar response:

\hat{Y} = \bar{Y} + \sum_{m=1}^M \sum_{l=1}^{K_m} b_{ml} \xi_{ml} + X.scl \gamma.
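
The three steps above (decompose, regress on the scores, map back to \beta(s)) can be sketched in base R for the classical case. This is a simplified illustration under stated assumptions, not the package's implementation: it uses a single functional predictor, no scalar predictors, simulated data, and `prcomp` applied directly to the discretized curves in place of a B-spline-based functional principal component analysis.

```r
set.seed(1)
n <- 200; p <- 101
s <- seq(0, 1, length.out = p)   # common grid on [0, 1]
h <- s[2] - s[1]                 # grid spacing, used for Riemann sums

# Simulated curves (illustrative): two smooth shapes with random weights plus noise
a <- matrix(rnorm(n * 2), n, 2)
X <- outer(a[, 1], sin(2 * pi * s)) + outer(a[, 2], cos(2 * pi * s)) +
  matrix(rnorm(n * p, sd = 0.05), n, p)

beta <- 2 * sin(2 * pi * s)                           # true beta(s)
Y <- as.vector(X %*% beta) * h + rnorm(n, sd = 0.1)   # integral as a Riemann sum

# Step 1: principal components of the centred curves
pca <- prcomp(X, center = TRUE, scale. = FALSE)
expl <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
K <- which(expl >= 0.95)[1]                           # smallest K reaching 95% variation
psi <- pca$rotation[, 1:K, drop = FALSE] / sqrt(h)    # weight functions, int psi^2 ds ~ 1
xi <- pca$x[, 1:K, drop = FALSE] * sqrt(h)            # scores: int (X - Xbar) psi ds

# Step 2: least-squares regression of Y on the scores gives the b_l
fit <- lm(Y ~ xi)
b <- coef(fit)[-1]

# Step 3: map the score coefficients back to the coefficient function
beta.hat <- as.vector(psi %*% b)
```

Because the scores are centred, the intercept of `fit` plays the role of \bar{Y} in the regression model above.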

If emodel = "classical", the least-squares method is used to estimate the scalar-on-function regression model.

If emodel = "robust", the robust functional principal component analysis of Bali et al. (2011), along with the method specified in fmodel, is used to estimate the scalar-on-function regression model.

If fmodel = "LTS", the least trimmed squares robust regression of Rousseeuw (1984) is used to estimate the scalar-on-function regression model.

If fmodel = "MM", the MM-type regression estimator described in Yohai (1987) and Koller and Stahel (2011) is used to estimate the scalar-on-function regression model.

If fmodel = "S", the S estimator is used to estimate the scalar-on-function regression model.

If fmodel = "tau", the tau estimator proposed by Salibian-Barrera et al. (2008) is used to estimate the scalar-on-function regression model.
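
To give a feel for why a robust fit matters at the score-regression stage, the sketch below contrasts least squares with LTS and S estimates on contaminated data. It uses `lqs` from the MASS package as a stand-in; whether the package itself relies on these routines is not stated on this page, and the simulated "score" variable, the outlier pattern, and the sample size are assumptions made for the example.

```r
library(MASS)  # provides lqs(); MASS ships with standard R distributions

set.seed(42)
n <- 100
xi <- rnorm(n)                          # stand-in for a principal component score
y <- 1 + 2 * xi + rnorm(n, sd = 0.2)    # true slope is 2

# Contaminate 15% of the sample with bad leverage points
out <- 1:15
xi[out] <- runif(15, 4, 6)
y[out] <- -10 + rnorm(15)

fit.ls <- lm(y ~ xi)                    # least squares (classical)
fit.lts <- lqs(y ~ xi, method = "lts")  # least trimmed squares
fit.s <- lqs(y ~ xi, method = "S")      # S estimator

# Slopes side by side: the robust fits should stay near 2,
# while the least-squares slope is pulled towards the outliers
slopes <- c(LS = unname(coef(fit.ls)[2]),
            LTS = unname(coef(fit.lts)[2]),
            S = unname(coef(fit.s)[2]))
print(round(slopes, 2))
```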

Value

A list object with the following components:

`data`	A list of matrices including the original scalar response and both the scalar and functional predictors.
`fitted.values`	An `n \times 1`-dimensional matrix containing the fitted values of the scalar response.
`residuals`	An `n \times 1`-dimensional matrix containing the residuals.
`fpca.results`	A list object containing the functional principal component analysis results of the functional predictor variables.
`model.details`	A list object containing model details, such as number of basis functions, number of principal components, and grid points used for each functional predictor variable.

Author(s)

Ufuk Beyaztas and Han Lin Shang

References

J. L. Bali and G. Boente and D. E. Tyler and J.-L. Wang (2011), "Robust functional principal components: A projection-pursuit approach", The Annals of Statistics, 39(6), 2852-2882.

P. J. Rousseeuw (1984), "Least median of squares regression", Journal of the American Statistical Association, 79(388), 871-881.

P. J. Rousseeuw and K. van Driessen (1999), "A fast algorithm for the minimum covariance determinant estimator", Technometrics, 41(3), 212-223.

V. J. Yohai (1987), "High breakdown-point and high efficiency estimates for regression", The Annals of Statistics, 15(2), 642-656.

M. Koller and W. A. Stahel (2011), "Sharpening Wald-type inference in robust regression for small samples", Computational Statistics & Data Analysis, 55(8), 2504-2515.

M. Salibian-Barrera and G. Willems and R. Zamar (2008), "The fast-tau estimator for regression", Journal of Computational and Graphical Statistics, 17(3), 659-682.

Examples

library(robflreg)
sim.data <- generate.sf.data(n = 400, n.pred = 5, n.gp = 101)
Y <- sim.data$Y
X <- sim.data$X
gp <- rep(list(seq(0, 1, length.out = 101)), 5) # grid points of Xs
model.tau <- rob.sf.reg(Y, X, emodel = "robust", fmodel = "tau", gp = gp)

[Package robflreg version 1.2 Index]