R: Goodness-of fit test for the functional linear model using...

rp.flm.test {fda.usc}

R Documentation

Goodness-of fit test for the functional linear model using random projections

Description

Tests the composite null hypothesis of a Functional Linear Model with scalar response (FLM),

H_0:\,Y=\langle X,\beta\rangle+\epsilon\quad\mathrm{vs}\quad H_1:\,Y\neq\langle X,\beta\rangle+\epsilon.

If \beta=\beta_0 is provided, then the simple hypothesis H_0:\,Y=\langle X,\beta_0\rangle+\epsilon is tested. The way of testing the null hypothesis is via a norm (Cramer-von Mises or Kolmogorov-Smirnov) in the empirical process indexed by the projections.

No NA's are allowed neither in the functional covariate nor in the scalar response.

Usage

rp.flm.test(
  X.fdata,
  Y,
  beta0.fdata = NULL,
  B = 1000,
  n.proj = 10,
  est.method = "pc",
  p = NULL,
  p.criterion = "SICc",
  pmax = 20,
  type.basis = "bspline",
  projs = 0.95,
  verbose = TRUE,
  same.rwild = FALSE,
  ...
)

Arguments

`X.fdata`	functional observations in the class `fdata`.
`Y`	scalar responses for the FLM. Must be a vector with the same number of elements as functions are in `X.fdata`.
`beta0.fdata`	functional parameter for the simple null hypothesis, in the `fdata` class. The `argvals` and `rangeval` arguments of `beta0.fdata` must be the same of `X.fdata`. If `beta0.fdata=NULL` (default), the function will test for the composite null hypothesis.
`B`	number of bootstrap replicates to calibrate the distribution of the test statistic.
`n.proj`	vector with the number of projections to consider.
`est.method`	estimation method for `\beta`, only used in the composite case. There are three methods: list("\"pc\"") if `p` is given, then `\beta` is estimated by `fregre.pc`. Otherwise, `p` is chosen using `fregre.pc.cv` and the `p.criterion` criterion. list("\"pls\"") if `p` is given, `\beta` is estimated by `fregre.pls`. Otherwise, `p` is chosen using `fregre.pls.cv` and the `p.criterion` criterion. list("\"basis\"") if `p` is given, `\beta` is estimated by `fregre.basis`. Otherwise, `p` is' chosen using `fregre.basis.cv` and the `p.criterion` criterion. Both in `fregre.basis` and `fregre.basis.cv`, the same basis for `basis.x` and `basis.b` is considered.
`p`	number of elements for the basis representation of `beta0.fdata` and `X.fdata` with the `est.method` (only composite hypothesis). If not supplied, it is estimated from the data.
`p.criterion`	for `est.method` equal to `"pc"` or `"pls"`, either `"SIC"`, `"SICc"` or one of the criterions described in `fregre.pc.cv`. For `"basis"` a value for `type.CV` in `fregre.basis.cv` such as `GCV.S`.
`pmax`	maximum size of the basis expansion to consider in when using `p.criterion`.
`type.basis`	type of basis if `est.method = "basis"`.
`projs`	a `fdata` object containing the random directions employed to project `X.fdata`. If numeric, the convenient value for `ncomp` in `rdir.pc`.
`verbose`	whether to show or not information about the testing progress.
`same.rwild`	wether to employ the same wild bootstrap residuals for different projections or not.
`...`	further arguments passed to create.basis (not `rangeval` that is taken as the `rangeval` of `X.fdata`).

Value

An object with class "htest" whose underlying structure is a list containing the following components:

list("p.values.fdr") a matrix of size c(n.proj, 2), containing in each row the FDR p-values of the CvM and KS tests up to that projection.
list("proj.statistics") a matrix of size c(max(n.proj), 2) with the value of the test statistic on each projection.
list("boot.proj.statistics") an array of size c(max(n.proj), 2, B) with the values of the bootstrap test statistics for each projection.
list("proj.p.values") a matrix of size c(max(n.proj), 2)
list("method") information about the test performed and the kind of estimation performed.
list("B") number of bootstrap replicates used.
list("n.proj") number of projections specified
list("projs") random directions employed to project X.fdata.
list("type.basis") type of basis for est.method = "basis".
list("beta.est") estimated functional parameter \hat \beta in the composite hypothesis. For the simple hypothesis, beta0.fdata.
list("p") number of basis elements considered for estimation of \beta.
list("p.criterion") criterion employed for selecting p.
list("data.name") the character string "Y = <X, b> + e"

Author(s)

Eduardo Garcia-Portugues (edgarcia@est-econ.uc3m.es) and Manuel Febrero-Bande (manuel.febrero@usc.es).

References

Cuesta-Albertos, J.A., Garcia-Portugues, E., Febrero-Bande, M. and Gonzalez-Manteiga, W. (2017). Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes. arXiv:1701.08363. https://arxiv.org/abs/1701.08363

Garcia-Portugues, E., Gonzalez-Manteiga, W. and Febrero-Bande, M. (2014). A goodness-of-fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3), 761–778. doi:10.1080/10618600.2013.812519

Examples

## Not run: 
# Simulated example

set.seed(345678)
t <- seq(0, 1, l = 101)
n <- 100
X <- r.ou(n = n, t = t, alpha = 2, sigma = 0.5)
beta0 <- fdata(mdata = cos(2 * pi * t) - (t - 0.5)^2, argvals = t,
               rangeval = c(0,1))
Y <- inprod.fdata(X, beta0) + rnorm(n, sd = 0.1)

# Test all cases
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls")
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis", 
            p.criterion = fda.usc.devel::GCV.S)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis", p = 5)
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)

# Composite hypothesis: do not reject FLM
rp.test <- rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.test$p.values.fdr
pcvm.test <- flm.test(X.fdata = X, Y = Y, est.method = "pc", B = 1e3,
                      plot.it = FALSE)
pcvm.test

# Estimation of beta
par(mfrow = c(1, 3))
plot(X, main = "X")
plot(beta0, main = "beta")
lines(rp.test$beta.est, col = 2)
lines(pcvm.test$beta.est, col = 3)
plot(density(Y), main = "Density of Y", xlab = "Y", ylab = "Density")
rug(Y)

# Simple hypothesis: do not reject beta = beta0
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0, B = 1e3, plot.it = FALSE)

# Simple hypothesis: reject beta = beta0^2
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2, B = 1e3, plot.it = FALSE)

# Tecator dataset

# Load data
data(tecator)
absorp <- tecator$absorp.fdata
ind <- 1:129 # or ind <- 1:215
x <- absorp[ind, ]
y <- tecator$y$Fat[ind]

# Composite hypothesis
rp.tecat <- rp.flm.test(X.fdata = x, Y = y, est.method = "pc")
pcvm.tecat <- flm.test(X.fdata = x, Y = y, est.method = "pc", B = 1e3,
                       plot.it = FALSE)
rp.tecat$p.values.fdr[c(5, 10), ]
pcvm.tecat

# Simple hypothesis
zero <- fdata(mdata = rep(0, length(x$argvals)), argvals = x$argvals,
              rangeval = x$rangeval)
rp.flm.test(X.fdata = x, Y = y, beta0.fdata = zero)
flm.test(X.fdata = x, Y = y, beta0.fdata = zero, B = 1e3)

# With derivatives
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 1), Y = y, est.method = "pc")
rp.tecat$p.values.fdr
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 2), Y = y, est.method = "pc")
rp.tecat$p.values.fdr

# AEMET dataset

# Load data
data(aemet)
wind.speed <- apply(aemet$wind.speed$data, 1, mean)
temp <- aemet$temp

# Remove the 5% of the curves with less depth (i.e. 4 curves)
par(mfrow = c(1, 1))
res.FM <- depth.FM(temp, draw = TRUE)
qu <- quantile(res.FM$dep, prob = 0.05)
l <- which(res.FM$dep <= qu)
lines(aemet$temp[l], col = 3)

# Data without outliers
wind.speed <- wind.speed[-l]
temp <- temp[-l]

# Composite hypothesis
rp.aemet <- rp.flm.test(X.fdata = temp, Y = wind.speed, est.method = "pc")
pcvm.aemet <- flm.test(X.fdata = temp, Y = wind.speed, B = 1e3,
                       est.method = "pc", plot.it = FALSE)
rp.aemet$p.values.fdr
apply(rp.aemet$p.values.fdr, 2, range)
pcvm.aemet

# Simple hypothesis
zero <- fdata(mdata = rep(0, length(temp$argvals)), argvals = temp$argvals,
              rangeval = temp$rangeval)
flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero, B = 1e3,
         plot.it = FALSE)
rp.flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero)

## End(Not run)

[Package fda.usc version 2.1.0 Index]