fregre.gsam.vs {fda.usc} | R Documentation |
Variable Selection using Functional Additive Models
Description
Computes a functional GAM model between functional covariates (X^1(t_1),\cdots,X^{q}(t_q)) and non-functional covariates (Z^1,...,Z^p) with a scalar response Y.
Usage
fregre.gsam.vs(
data = list(),
y,
include = "all",
exclude = "none",
family = gaussian(),
weights = NULL,
basis.x = NULL,
numbasis.opt = FALSE,
kbs,
dcor.min = 0.1,
alpha = 0.05,
par.model,
xydist,
trace = FALSE
)
Arguments
data |
List containing the variables in the model.
The "df" element is a data.frame containing the response and scalar covariates
(numeric and factor variables are allowed). Functional covariates of class
|
y |
Character string with the name of the scalar response variable. |
include |
Vector with the names of the variables to use. By default |
exclude |
Vector with the names of the variables not to use. By default |
family |
a description of the error distribution and link function to
be used in the model. This can be a character string naming a family
function, a family function or the result of a call to a family function.
(See |
weights |
weights |
basis.x |
Basis parameter options
|
numbasis.opt |
Logical, if |
kbs |
The dimension of the basis used to represent the smooth term. The default depends on the number of variables that the smooth is a function of. |
dcor.min |
Threshold for a variable to be entered into the model. X is discarded
if the distance correlation |
alpha |
Significance level for testing independence between the covariate X and the
residual e from previous steps. By default is |
par.model |
Model parameters. |
xydist |
List with the inner distance matrices of each variable (all potential covariates and the response). |
trace |
Interactive Tracing and Debugging of Call. |
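The dcor.min and xydist arguments both refer to distance correlation computed from pairwise distance matrices. As a rough illustration of that quantity (not the package's internal code), the sketch below computes the sample distance correlation from two distance matrices in base R. The helper names dcenter and dcor_dist are hypothetical, and the scalar toy data are made up for the example.

```r
# Hypothetical helpers for illustration only; they are not part of fda.usc.

# Double-center a distance matrix: subtract row and column means, add the grand mean.
dcenter <- function(D) {
  D <- as.matrix(D)
  sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
}

# Sample distance correlation computed from two distance matrices.
dcor_dist <- function(Dx, Dy) {
  A <- dcenter(Dx)
  B <- dcenter(Dy)
  den <- sqrt(mean(A * A) * mean(B * B))
  if (den <= 0) return(0)
  sqrt(max(mean(A * B), 0) / den)
}

set.seed(1)
x <- rnorm(100)
e <- rnorm(100)                      # generated independently of x
d_dep <- dcor_dist(dist(x), dist(x^2))   # nonlinear dependence on x
d_ind <- dcor_dist(dist(x), dist(e))     # no dependence on x
d_self <- dcor_dist(dist(x), dist(x))    # a variable against itself
c(dependent = d_dep, independent = d_ind, self = d_self)
```

Because the function only needs distance matrices, the same computation applies to functional covariates once a suitable distance between curves has been chosen.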
Details
This function is an extension of the functional generalized spectral additive
regression model fregre.gsam, where E[Y|X,Z] is related to the linear predictor
\eta via a link function g(\cdot), with integrated smoothness estimation by the
smooth functions f(\cdot).
E[Y|X,Z]=g^{-1}(\eta), \quad \eta=\alpha+\sum_{i=1}^{p}f_{i}(Z^{i})+\sum_{k=1}^{q}\sum_{j=1}^{k_q}{f_{j}^{k}(\xi_j^k)}
where \xi_j^k is the coefficient of the basis function expansion of X^k
(in PCA analysis, \xi_j^k is the score of the j-th functional PC of X^k).
The smooth functions f(\cdot) can be added to the right-hand side of the formula
to specify that the linear predictor depends on smooth functions of the predictors,
using the smooth terms s and te as in gam (or linear functionals of these, such as
Z\beta and \big<X(t),\beta\big> in fregre.glm).
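To make the additive structure concrete, here is a minimal sketch that fits a model of this form directly with mgcv::gam on simulated data. It is an assumption-laden illustration, not what fregre.gsam does internally: the curves are simulated on a common grid, the basis coefficients \xi_j^k are approximated by ordinary prcomp scores, and all variable names (X, z, xi1, xi2, sim) are made up for the example.

```r
library(mgcv)  # provides gam() and the smooth term s()

set.seed(42)
n  <- 150
tt <- seq(0, 1, length.out = 50)
a  <- rnorm(n)
b  <- rnorm(n)

# Simulated curves X_i(t) on a common grid, one row per curve.
X <- outer(a, sin(2 * pi * tt)) + outer(b, cos(2 * pi * tt)) +
  matrix(rnorm(n * length(tt), sd = 0.05), nrow = n)
z <- runif(n)                                   # scalar covariate Z
y <- sin(a) + b^2 + 2 * z + rnorm(n, sd = 0.1)  # scalar response Y

# Stand-in for the basis expansion: scores on the first two principal components.
xi  <- prcomp(X, center = TRUE)$x[, 1:2]
sim <- data.frame(y = y, z = z, xi1 = xi[, 1], xi2 = xi[, 2])

# One smooth term per scalar covariate and per score; k plays the role of kbs.
fit <- gam(y ~ s(z, k = 4) + s(xi1, k = 4) + s(xi2, k = 4),
           family = gaussian(), data = sim)
summary(fit)
```

Each s() term corresponds to one f in the linear predictor, and the family argument supplies the link function g.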
Value
Returns an object corresponding to the estimated additive model using
the selected variables (same output as the fregre.gsam function), plus the
following elements:
gof, the goodness of fit for each step of the VS algorithm.
i.predictor, vector with 1 if the variable is selected, 0 otherwise.
ipredictors, vector with the names of the selected variables (in order of selection).
dcor, the value of the distance correlation between each potential covariate and the residual of the model at each step.
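The per-step gof and dcor elements suggest a forward selection driven by the distance correlation between the candidates and the current residual. The loop below is only a simplified sketch of that idea under stated assumptions: it handles scalar covariates only, uses a plain lm fit as a stand-in for the additive model, and applies a fixed threshold in place of the independence test controlled by alpha. forward_dcor and its helpers are hypothetical names, not fda.usc functions.

```r
# Hypothetical helpers for illustration only; they are not part of fda.usc.
dcenter <- function(D) {
  D <- as.matrix(D)
  sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
}
dcor_dist <- function(Dx, Dy) {
  A <- dcenter(Dx)
  B <- dcenter(Dy)
  den <- sqrt(mean(A * A) * mean(B * B))
  if (den <= 0) return(0)
  sqrt(max(mean(A * B), 0) / den)
}

# Forward selection: at each step, add the candidate whose distance
# correlation with the current residual is largest, if it reaches dcor.min.
forward_dcor <- function(df, y, dcor.min = 0.1) {
  candidates <- setdiff(names(df), y)
  selected   <- character(0)
  res        <- df[[y]] - mean(df[[y]])   # residual of the intercept-only model
  steps      <- list()
  while (length(candidates) > 0) {
    Dres <- dist(res)
    dc <- sapply(candidates, function(v) dcor_dist(dist(df[[v]]), Dres))
    steps[[length(steps) + 1]] <- dc      # keep the per-step values
    best <- names(dc)[which.max(dc)]
    if (dc[[best]] < dcor.min) break
    selected   <- c(selected, best)
    candidates <- setdiff(candidates, best)
    # Assumption: a linear fit replaces the additive model in this sketch.
    fit <- lm(reformulate(selected, response = y), data = df)
    res <- residuals(fit)
  }
  list(ipredictors = selected, dcor = steps)
}

set.seed(123)
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
toy$y <- 2 * toy$x1 + rnorm(200, sd = 0.5)
sel <- forward_dcor(toy, "y", dcor.min = 0.3)
sel$ipredictors
```

The sample distance correlation is positive even for independent variables, so the toy call uses a stricter threshold than the default of 0.1.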
Note
If the formula contains only non-functional explanatory variables (multivariate covariates),
the function computes a standard gam procedure.
Author(s)
Manuel Febrero-Bande, Manuel Oviedo de la Fuente manuel.oviedo@udc.es
References
Febrero-Bande, M., González-Manteiga, W. and Oviedo de la Fuente, M. (2018). Variable selection in functional additive regression models. Computational Statistics, 1-19. DOI: 10.1007/s00180-018-0844-5
See Also
See also: predict.fregre.gsam and summary.gam.
Alternative methods: fregre.glm, fregre.gsam and fregre.gkam.
Examples
## Not run:
data(tecator)
x <- tecator$absorp.fdata
x1 <- fdata.deriv(x)
x2 <- fdata.deriv(x,nderiv=2)
y <- tecator$y$Fat
xcat0 <- cut(rnorm(length(y)),4)
xcat1 <- cut(tecator$y$Protein,4)
xcat2 <- cut(tecator$y$Water,4)
ind <- 1:165
# xcat0 is added to the data frame so that the calls naming it can find it
dat <- data.frame("Fat"=y, x1$data, xcat0, xcat1, xcat2)
ldat <- ldata("df"=dat[ind,],"x"=x[ind,],"x1"=x1[ind,],"x2"=x2[ind,])
# 3 functionals (x,x1,x2), 3 factors (xcat0, xcat1, xcat2)
# and 100 scalars (impact points of x1)
# Time consuming
res.gam0 <- fregre.gsam.vs(data=ldat,y="Fat"
,exclude="x2",numbasis.opt=TRUE) # All the covariates except x2
summary(res.gam0)
res.gam0$ipredictors
res.gam1 <- fregre.gsam.vs(data=ldat,y="Fat") # All the covariates
summary(res.gam1)
res.gam1$ipredictors
covar <- c("xcat0","xcat1","xcat2","x","x1","x2")
res.gam2 <- fregre.gsam.vs(data=ldat, y="Fat", include=covar)
summary(res.gam2)
res.gam2$ipredictors
res.gam2$i.predictor
res.gam3 <- fregre.gsam.vs(data=ldat,y="Fat",
basis.x=c("type.basis"="pc","numbasis"=10))
summary(res.gam3)
res.gam3$ipredictors
res.gam4 <- fregre.gsam.vs(data=ldat,y="Fat",include=c("x","x1","x2"),
basis.x=c("type.basis"="pc","numbasis"=5),numbasis.opt=TRUE)
summary(res.gam4)
res.gam4$ipredictors
lpc <- list("x"=create.pc.basis(ldat$x,1:4)
,"x1"=create.pc.basis(ldat$x1,1:3)
,"x2"=create.pc.basis(ldat$x2,1:12))
res.gam5 <- fregre.gsam.vs(data=ldat,y="Fat",basis.x=lpc)
summary(res.gam5)
res.gam6 <- fregre.gsam.vs(data=ldat,y="Fat",basis.x=lpc,numbasis.opt=TRUE)
summary(res.gam6)
bsp <- create.fourier.basis(ldat$x$rangeval,7)
lbsp <- list("x"=bsp,"x1"=bsp,"x2"=bsp)
res.gam7 <- fregre.gsam.vs(data=ldat,y="Fat",basis.x=lbsp,kbs=4)
summary(res.gam7)
# Prediction like fregre.gsam()
newldat <- ldata("df"=dat[-ind,],"x"=x[-ind,],"x1"=x1[-ind,],
"x2"=x2[-ind,])
pred.gam1 <- predict(res.gam1,newldat)
pred.gam2 <- predict(res.gam2,newldat)
pred.gam3 <- predict(res.gam3,newldat)
pred.gam4 <- predict(res.gam4,newldat)
pred.gam5 <- predict(res.gam5,newldat)
pred.gam6 <- predict(res.gam6,newldat)
pred.gam7 <- predict(res.gam7,newldat)
plot(dat[-ind,"Fat"],pred.gam1)
points(dat[-ind,"Fat"],pred.gam2,col=2)
points(dat[-ind,"Fat"],pred.gam3,col=3)
points(dat[-ind,"Fat"],pred.gam4,col=4)
points(dat[-ind,"Fat"],pred.gam5,col=5)
points(dat[-ind,"Fat"],pred.gam6,col=6)
points(dat[-ind,"Fat"],pred.gam7,col=7)
pred2meas(newldat$df$Fat,pred.gam1)
pred2meas(newldat$df$Fat,pred.gam2)
pred2meas(newldat$df$Fat,pred.gam3)
pred2meas(newldat$df$Fat,pred.gam4)
pred2meas(newldat$df$Fat,pred.gam5)
pred2meas(newldat$df$Fat,pred.gam6)
pred2meas(newldat$df$Fat,pred.gam7)
## End(Not run)