flm.test {fda.usc} | R Documentation |
Goodness-of-fit test for the Functional Linear Model with scalar response
Description
The function flm.test
tests the composite null hypothesis of
a Functional Linear Model with scalar response (FLM),
H_0:\,Y=\big<X,\beta\big>+\epsilon,
versus
a general alternative. If \beta=\beta_0
is provided, then the
simple hypothesis H_0:\,Y=\big<X,\beta_0\big>+\epsilon
is tested.
The testing of the null hypothesis is done by a Projected Cramer-von Mises statistic (see Details).
Usage
flm.test(
X.fdata,
Y,
beta0.fdata = NULL,
B = 5000,
est.method = "pls",
p = NULL,
type.basis = "bspline",
verbose = TRUE,
plot.it = TRUE,
B.plot = 100,
G = 200,
...
)
Arguments
X.fdata |
Functional covariate for the FLM. The object must be in the class
|
Y |
Scalar response for the FLM. Must be a vector with the same number of elements
as functions are in |
beta0.fdata |
Functional parameter for the simple null hypothesis, in the |
B |
Number of bootstrap replicates to calibrate the distribution of the test statistic.
|
est.method |
Estimation method for the unknown parameter
|
p |
Number of elements of the basis considered. If it is not given, an optimal |
type.basis |
Type of basis used to represent the functional process. Depending on the hypothesis it will have a different interpretation:
|
verbose |
Either to show or not information about computing progress. |
plot.it |
Either to show or not a graph of the observed trajectory,
and the bootstrap trajectories under the null composite hypothesis, of the
process |
B.plot |
Number of bootstrap trajectories to show in the resulting plot of the test.
As the trajectories shown are the first |
G |
Number of projections used to compute the trajectories of the process
|
... |
Further arguments passed to |
Details
The Functional Linear Model with scalar response (FLM), is defined as
Y=\big<X,\beta\big>+\epsilon
, for a functional process
X
such that E[X(t)]=0
, E[X(t)\epsilon]=0
for all t
and for a scalar variable Y
such that E[Y]=0
.
Then, the test assumes that Y
and X.fdata
are centred and will automatically
center them. So, bear in mind that when you apply the test for Y
and X.fdata
,
actually, you are applying it to Y-mean(Y)
and fdata.cen(X.fdata)$Xcen
.
The test statistic corresponds to the Cramer-von Mises norm of the Residual Marked
empirical Process based on Projections R_n(u,\gamma)
defined in
Garcia-Portugues et al. (2014).
The expression of this process in a p
-truncated basis of the space L^2[0,T]
leads to the p
-multivariate process R_{n,p}\big(u,\gamma^{(p)}\big)
,
whose Cramer-von Mises norm is computed.
The choice of an appropriate p
to represent the functional process X
,
in case that is not provided, is done via the estimation of \beta
for the composite
hypothesis. For the simple hypothesis, as no estimation of \beta
is done, the choice
of p
depends only on the functional process X
. As the result of the test may
change for different p
's, we recommend to use an automatic criterion to select p
instead of provide a fixed one.
The distribution of the test statistic is approximated by a wild bootstrap resampling on the
residuals, using the golden section bootstrap.
Finally, the graph shown if plot.it=TRUE
represents the observed trajectory, and the
bootstrap trajectories under the null, of the process RMPP integrated on the projections:
R_n(u)\approx\frac{1}{G}\sum_{g=1}^G R_n(u,\gamma_g),
where \gamma_g
are simulated as Gaussians processes. This gives a graphical idea of
how distant is the observed trajectory from the null hypothesis.
Value
An object with class "htest"
whose underlying structure is a list containing
the following components:
-
statistic The value of the test statistic.
-
boot.statistics A vector of length
B
with the values of the bootstrap test statistics. -
p.value The p-value of the test.
-
method The method used.
-
B The number of bootstrap replicates used.
-
type.basis The type of basis used.
-
beta.est The estimated functional parameter
\beta
in the composite hypothesis. For the simple hypothesis, the givenbeta0.fdata
. -
p The number of basis elements passed or automatically chosen.
-
ord The optimal order for PC and PLS given by
fregre.pc.cv
andfregre.pls.cv
. For other methods is setted to1:p
. -
data.name The character string "Y=<X,b>+e"
Note
No NA's are allowed neither in the functional covariate nor in the scalar response.
Author(s)
Eduardo Garcia-Portugues. Please, report bugs and suggestions to edgarcia@est-econ.uc3m.es
References
Escanciano, J. C. (2006). A consistent diagnostic test for regression models using projections. Econometric Theory, 22, 1030-1051. doi:10.1017/S0266466606060506
Garcia-Portugues, E., Gonzalez-Manteiga, W. and Febrero-Bande, M. (2014). A goodness–of–fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3), 761-778. doi:10.1080/10618600.2013.812519
See Also
Adot
, PCvM.statistic
, rwild
,
flm.Ftest
, dfv.test
,
fregre.pc
, fregre.pls
, fregre.basis
,
fregre.pc.cv
, fregre.pls.cv
,
fregre.basis.cv
, optim.basis
,
create.basis
Examples
# Simulated example #
X=rproc2fdata(n=100,t=seq(0,1,l=101),sigma="OU")
beta0=fdata(mdata=cos(2*pi*seq(0,1,l=101))-(seq(0,1,l=101)-0.5)^2+
rnorm(101,sd=0.05),argvals=seq(0,1,l=101),rangeval=c(0,1))
Y=inprod.fdata(X,beta0)+rnorm(100,sd=0.1)
dev.new(width=21,height=7)
par(mfrow=c(1,3))
plot(X,main="X")
plot(beta0,main="beta0")
plot(density(Y),main="Density of Y",xlab="Y",ylab="Density")
rug(Y)
## Not run:
# Composite hypothesis: do not reject FLM
pcvm.sim=flm.test(X,Y,B=50,B.plot=50,G=100,plot.it=TRUE)
pcvm.sim
flm.test(X,Y,B=5000)
# Estimated beta
dev.new()
plot(pcvm.sim$beta.est)
# Simple hypothesis: do not reject beta=beta0
flm.test(X,Y,beta0.fdata=beta0,B=50,B.plot=50,G=100)
flm.test(X,Y,beta0.fdata=beta0,B=5000)
# AEMET dataset #
data(aemet)
# Remove the 5\
dev.new()
res.FM=depth.FM(aemet$temp,draw=TRUE)
qu=quantile(res.FM$dep,prob=0.05)
l=which(res.FM$dep<=qu)
lines(aemet$temp[l],col=3)
aemet$df$name[l]
# Data without outliers
wind.speed=apply(aemet$wind.speed$data,1,mean)[-l]
temp=aemet$temp[-l]
# Exploratory analysis: accept the FLM
pcvm.aemet=flm.test(temp,wind.speed,est.method="pls",B=100,B.plot=50,G=100)
pcvm.aemet
# Estimated beta
dev.new()
plot(pcvm.aemet$beta.est,lwd=2,col=2)
# B=5000 for more precision on calibration of the test: also accept the FLM
flm.test(temp,wind.speed,est.method="pls",B=5000)
# Simple hypothesis: rejection of beta0=0? Limiting p-value...
dat=rep(0,length(temp$argvals))
flm.test(temp,wind.speed, beta0.fdata=fdata(mdata=dat,argvals=temp$argvals,
rangeval=temp$rangeval),B=100)
flm.test(temp,wind.speed, beta0.fdata=fdata(mdata=dat,argvals=temp$argvals,
rangeval=temp$rangeval),B=5000)
# Tecator dataset #
data(tecator)
names(tecator)
absorp=tecator$absorp.fdata
ind=1:129 # or ind=1:215
x=absorp[ind,]
y=tecator$y$Fat[ind]
tt=absorp[["argvals"]]
# Exploratory analysis for composite hypothesis with automatic choose of p
pcvm.tecat=flm.test(x,y,B=100,B.plot=50,G=100)
pcvm.tecat
# B=5000 for more precision on calibration of the test: also reject the FLM
flm.test(x,y,B=5000)
# Distribution of the PCvM statistic
plot(density(pcvm.tecat$boot.statistics),lwd=2,xlim=c(0,10),
main="PCvM distribution", xlab="PCvM*",ylab="Density")
rug(pcvm.tecat$boot.statistics)
abline(v=pcvm.tecat$statistic,col=2,lwd=2)
legend("top",legend=c("PCvM observed"),lwd=2,col=2)
# Simple hypothesis: fixed p
dat=rep(0,length(x$argvals))
flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals,
rangeval=x$rangeval),B=100,p=11)
# Simple hypothesis, automatic choose of p
flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals,
rangeval=x$rangeval),B=100)
flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals,
rangeval=x$rangeval),B=5000)
## End(Not run)