LMDC.select {fda.usc}

R Documentation

Impact points selection of functional predictor and regression using local maxima distance correlation (LMDC)

Description

LMDC.select selects the impact points of a functional predictor for a given scalar response, using local maxima distance correlation (LMDC).
LMDC.regre fits a multivariate regression method to a scalar response, using the selected impact points as covariates.
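
As background for the statistic that the selection is built on, the sample distance correlation between two variables can be computed with base R alone. The block below is a minimal sketch of the plain (not bias-corrected) sample version; the function name `dcor_sketch` and the simulated data are hypothetical and are not part of fda.usc, which may compute the statistic differently.

```r
# Hypothetical helper: plain sample distance correlation (no bias correction)
dcor_sketch <- function(x, y) {
  center <- function(D) {
    m <- rowMeans(D)                 # D is symmetric, so row means equal column means
    D - outer(m, m, "+") + mean(D)   # double centering of the distance matrix
  }
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  dcov2 <- mean(A * B)               # squared sample distance covariance
  sqrt(dcov2 / sqrt(mean(A * A) * mean(B * B)))
}

set.seed(1)
x <- rnorm(100)
d.nonlin <- dcor_sketch(x, x^2)        # nonlinear dependence is still detected
d.lin    <- dcor_sketch(x, 2 * x + 1)  # exact linear dependence gives 1
```

Unlike Pearson correlation, the statistic is not restricted to linear association, which is why it can be evaluated point by point along a curve to look for impact points.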

Usage

LMDC.select(
  y,
  covar,
  data,
  tol = 0.06,
  pvalue = 0.05,
  plot = FALSE,
  local.dc = TRUE,
  smo = FALSE,
  verbose = FALSE
)

LMDC.regre(
  y,
  covar,
  data,
  newdata,
  pvalue = 0.05,
  method = "lm",
  par.method = NULL,
  plot = FALSE,
  verbose = FALSE
)

Arguments

`y`	name of the response variable.
`covar`	vector with the names of the covariates (or points of impact), of length `p`.
`data`	data frame with n rows and at least p + 1 columns, containing the scalar response and the p potential covariates (or points of impact) in the model.
`tol`	Tolerance value for distance correlation and impact point.
`pvalue`	p-value of the bias-corrected distance correlation t-test.
`plot`	logical. If TRUE, plots the distance correlation curve for each covariate in the multivariate case and at each discretization point (argvals) in the functional case.
`local.dc`	logical. If TRUE, compute the local distance correlation.
`smo`	logical. If TRUE, the distance correlation curve computed at the impact points is smoothed using a B-spline representation with a suitable number of basis elements.
`verbose`	logical. If TRUE, print the iterative and relevant steps of the procedure.
`newdata`	An optional data frame in which to look for variables with which to predict.
`method`	Name of the regression method used, see Details. This argument is passed to do.call as the `what` argument.
`par.method`	List of parameters used to call the method. This argument is passed to do.call as the `args` argument.
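
The way `method` and `par.method` map onto the `what` and `args` arguments of do.call can be illustrated with base R alone. The block below is a minimal sketch, not the package's internal code: the helper `fit_with` and the simulated data are hypothetical, and only `lm` from the stats package is called.

```r
# Hypothetical helper: calls the function named by `method` through do.call,
# appending any extra arguments supplied in `par.method`.
fit_with <- function(method, formula, data, par.method = NULL) {
  args <- c(list(formula = formula, data = data), par.method)
  do.call(what = method, args = args)
}

set.seed(1)
dat <- data.frame(X1 = rnorm(50), X2 = rnorm(50))
dat$y <- 2 * dat$X1 + rnorm(50, sd = 0.1)

# "lm" plays the role of `what`; list(x = TRUE) is forwarded as extra `args`
fit <- fit_with("lm", y ~ X1 + X2, dat, par.method = list(x = TRUE))
class(fit)
```

Passing the method as a string keeps one calling convention for every supported regression function, with method-specific options collected in a single list.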

Details

The `method` argument is a character string with the name of the regression method to call. Available models:

"lm": Step-wise lm regression model (uses lm function, stats package). Recommended for linear models; linearity can be tested with the flm.test function.
"gam": Step-wise gam regression model (uses gam function, mgcv package). Recommended for non-linear models.

Models that use the indicated function of the required package:

"svm": Support vector machine (svm function, e1071 package).
"knn": k-nearest neighbor regression (knn.reg function, FNN package).
"glmnet": Lasso and Elastic-Net Regularized Generalized Linear Models (glmnet and cv.glmnet function, glmnet package).
"rpart": Recursive partitioning for regression (rpart function, rpart package).
"flam": Fit the Fused Lasso Additive Model for a Sequence of Tuning Parameters (flam function, flam package).
"novas": NOnparametric VAriable Selection (code available in https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NOVAS/novas-routines.R).
"cosso": Fit Regularized Nonparametric Regression Models Using COSSO Penalty (cosso function, cosso package).
"npreg": kernel regression estimate of a one (1) dimensional dependent variable on p-variate explanatory data (npreg function, np package).
"mars": Multivariate adaptive regression splines (mars function, mda package).
"nnet": Fit Neural Networks (nnet function, nnet package).
"lars": Fits Least Angle Regression, Lasso and Infinitesimal Forward Stagewise regression models (lars function, lars package).

Value

LMDC.select returns a list with two elements:

`cor`	the value of distance correlation for each covariate.
`maxLocal`	index or locations of the local maxima of the distance correlation.
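
To make the idea of `maxLocal` concrete, the block below sketches one plausible reading of "local maxima" on a distance correlation curve: interior points that are strictly higher than both neighbours and exceed a threshold. This is an assumption for illustration only; the helper `local_maxima`, the use of `tol` as a lower threshold, and the toy curve are hypothetical and do not reproduce the package's algorithm.

```r
# Hypothetical helper: indices of interior points that are strictly higher
# than both neighbours and larger than `tol` (assumed here to be a threshold)
local_maxima <- function(dc, tol = 0.06) {
  n <- length(dc)
  if (n < 3) return(integer(0))
  i <- 2:(n - 1)
  i[dc[i] > dc[i - 1] & dc[i] > dc[i + 1] & dc[i] > tol]
}

# Toy distance correlation curve over 10 discretization points
dc <- c(0.02, 0.10, 0.30, 0.12, 0.04, 0.05, 0.03, 0.20, 0.45, 0.40)
idx <- local_maxima(dc)
idx                           # positions of the retained peaks
sel <- paste0("X", idx)       # column names in the style used in Examples
```

The small bump at position 6 is a local maximum but falls below the threshold, so only the two pronounced peaks are kept.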

LMDC.regre returns a list with the following elements:

`model`	object corresponding to the estimated method using the selected variables.
`xvar`	names of the selected variables (impact points).
`edf`	effective degrees of freedom.
`nvar`	number of selected variables (impact points).

Author(s)

Manuel Oviedo de la Fuente manuel.oviedo@udc.es

References

Ordonez, C., Oviedo de la Fuente, M., Roca-Pardinas, J., Rodriguez-Perez, J. R. (2018). Determining optimum wavelengths for leaf water content estimation from reflectance: A distance correlation approach. Chemometrics and Intelligent Laboratory Systems, 173, 41-50. doi:10.1016/j.chemolab.2017.12.001

Examples

## Not run: 
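library(fda.usc)
data(tecator)
# Second derivative of the absorbance curves
absorp <- fdata.deriv(tecator$absorp.fdata, 2)
# First 129 curves for training, the remaining curves for testing
ind <- 1:129
x <- absorp[ind,]
y <- tecator$y$Fat[ind]
newx <- absorp[-ind,]
newy <- tecator$y$Fat[-ind]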

## Functional PC regression
res.pc=fregre.pc(x,y,1:6)
pred.pc=predict(res.pc,newx)

# Functional regression with basis representation
res.basis=fregre.basis.cv(x,y)
pred.basis=predict(res.basis[[1]],newx)

# Functional nonparametric regression
res.np=fregre.np.cv(x,y)
pred.np=predict(res.np,newx)

dat    <- data.frame("y"=y,x$data)
newdat <- data.frame("y"=newy,newx$data)

res.gam=fregre.gsam(y~s(x),data=list("df"=dat,"x"=x))
pred.gam=predict(res.gam,list("x"=newx))

dc.raw <- LMDC.select("y", data=dat, tol=0.05, pvalue=0.05,
                      plot=FALSE, smo=TRUE, verbose=FALSE)
covar <- paste("X", dc.raw$maxLocal, sep="")
# Preselected design/impact points 
covar
ftest <- flm.test(dat[,-1], dat[,"y"], B=500, verbose=FALSE,
    plot.it=FALSE, type.basis="pc", est.method="pc", p=4, G=50)

if (ftest$p.value > 0.05) {
  # Linear relationship, step-wise lm is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue=.05,
              method="lm", plot=FALSE, verbose=FALSE)
} else {
  # Non-linear relationship, step-wise gam is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue=.05,
              method="gam", plot=FALSE, verbose=FALSE)
}
             
# Final  design/impact points
out$xvar

# Mean squared prediction errors on the test set
mean((newy-pred.pc)^2)                
mean((newy-pred.basis)^2) 
mean((newy-pred.np)^2)              
mean((newy-pred.gam)^2) 
mean((newy-out$pred)^2)

## End(Not run)

[Package fda.usc version 2.1.0 Index]