plsim.vs.hard {PLSiMCpp}    R Documentation

Variable Selection for Partial Linear Single Index Models

Description

Selects variables using AIC, BIC, SCAD, LASSO or Elastic Net. The SCAD, LASSO and Elastic Net methods are implemented with the penalized profile least squares estimator, while AIC and BIC are implemented with stepwise regression.

Usage

plsim.vs.hard(...)

## S3 method for class 'formula'
plsim.vs.hard(formula, data, ...)

## Default S3 method:
plsim.vs.hard(xdat=NULL, zdat, ydat, h=NULL, zeta_i=NULL, 
lambdaList=NULL, l1RatioList=NULL, lambda_selector="BIC", threshold=0.05,
Method="SCAD", verbose=TRUE, ParmaSelMethod="SimpleValidation", seed=0, ...)

Arguments

...

additional arguments.

formula

a symbolic description of the model to be fitted.

data

an optional data frame, list or environment containing the variables in the model.

xdat

input matrix (linear covariates). The model reduces to a single index model when xdat is NULL.

zdat

input matrix (nonlinear covariates). zdat must not be NULL.

ydat

input vector (response variable).

h

a numeric value or a vector of candidate bandwidths. If h is NULL, the default vector c(0.01,0.02,0.05,0.1,0.5) is used. plsim.bw is employed to select the optimal bandwidth when h is a vector or NULL.

zeta_i

initial coefficients, optional (default: NULL). They can be obtained with the function plsim.ini. zeta_i[1:ncol(zdat)] is the initial coefficient vector \alpha_0, and zeta_i[(ncol(zdat)+1):(ncol(zdat)+ncol(xdat))] is the initial coefficient vector \beta_0.

verbose

logical, default: TRUE. Enables verbose output.

Method

variable selection method, default: "SCAD". It could be "SCAD", "LASSO", "ElasticNet", "AIC" or "BIC".

lambdaList

a parameter passed to the function plsim.lam, default: NULL.

l1RatioList

a parameter passed to the function plsim.lam, default: NULL.

lambda_selector

a parameter passed to the function plsim.lam, default: "BIC".

threshold

the threshold used to select important variables according to the estimated coefficients, default: 0.05.

ParmaSelMethod

a parameter passed to the function plsim.bw, default: "SimpleValidation".

seed

int, default: 0.
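
As an illustration of the zeta_i layout described above, the following sketch builds an initial coefficient vector by hand using base R only. The dimensions and the starting values alpha0 and beta0 are made up for the illustration (in practice they could come from plsim.ini), and the sketch assumes that zeta_i is a plain numeric vector with the zdat coefficients first.

```r
# Hypothetical dimensions and starting values, chosen only for illustration.
dz <- 5   # number of columns in zdat
dx <- 10  # number of columns in xdat

alpha0 <- rep(1, dz) / sqrt(dz)  # unit-norm starting value for alpha (an assumption)
beta0  <- rep(0, dx)             # starting value for beta (an assumption)

# zeta_i stacks alpha_0 first, then beta_0, as described for the zeta_i argument.
zeta_i <- c(alpha0, beta0)

# Recover the two blocks using the documented index ranges.
alpha_part <- zeta_i[1:dz]
beta_part  <- zeta_i[(dz + 1):(dz + dx)]
```

A vector built this way has length ncol(zdat) + ncol(xdat), which is the length the index ranges in the zeta_i description imply.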

Value

alpha_varSel

selected variables in z.

beta_varSel

selected variables in x.

fit_plsimest

fit_plsimest is not NULL when h is a vector or NULL. For each bandwidth, plsim.est is employed to integrate the selected variables. Finally, the optimal fitted model is selected according to BIC.
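
The relationship between threshold and the returned alpha_varSel and beta_varSel can be pictured with a small base R sketch. It assumes that a variable counts as selected when the absolute value of its estimated coefficient exceeds threshold; the package's internal rule is not spelled out on this page and may differ. The helper select_by_threshold and the coefficient values are hypothetical.

```r
# Hypothetical helper: positions of coefficients whose magnitude exceeds the threshold.
# Assumption: selection compares |coefficient| with threshold (not confirmed by this page).
select_by_threshold <- function(coef, threshold = 0.05) {
  which(abs(as.numeric(coef)) > threshold)
}

# Made-up coefficient estimates, for illustration only.
alpha_hat <- c(0.28, 0.83, 0.42, 0.14, 0.01)
beta_hat  <- c(2.9, 2.1, 0.02, 0, -0.01, 1.4, 0.03, 0.21, 0.33, 0.12)

alpha_sel <- select_by_threshold(alpha_hat)  # positions kept among the z columns
beta_sel  <- select_by_threshold(beta_hat)   # positions kept among the x columns
```

Raising threshold in this sketch drops more of the small coefficients, which is the intuition behind tuning the threshold argument.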

Examples


# EXAMPLE 1 (INTERFACE=FORMULA)
# To select variables with Penalized Profile Least Squares Estimation based on 
# the penalty LASSO.

n = 50
dx = 10
dz = 5
sigma = 0.2
alpha = matrix(c(1,3,1.5,0.5,0),dz,1)
alpha = alpha/norm(alpha,"2")
beta = matrix(c(3,2,0,0,0,1.5,0,0.2,0.3,0.15),dx,1)

A = sqrt(3)/2-1.645/sqrt(12)
B = sqrt(3)/2+1.645/sqrt(12)
z = matrix(runif(n*dz),n,dz)
x = matrix(runif(n*dx),n,dx)
y = sin( (z%*%alpha - A) * 3.1415926 * (B-A) ) + x%*%beta + sigma*matrix(rnorm(n),n,1)

# Variable Selection Based on LASSO
res_varSel_LASSO = plsim.vs.hard(y~x|z,h=0.1,Method="LASSO")


# EXAMPLE 2 (INTERFACE=DATA FRAME)
# To select variables with Penalized Profile Least Squares Estimation based on 
# the penalty LASSO.

n = 50
dx = 10
dz = 5
sigma = 0.2
alpha = matrix(c(1,3,1.5,0.5,0),dz,1)
alpha = alpha/norm(alpha,"2")
beta = matrix(c(3,2,0,0,0,1.5,0,0.2,0.3,0.15),dx,1)

A = sqrt(3)/2-1.645/sqrt(12)
B = sqrt(3)/2+1.645/sqrt(12)
z = matrix(runif(n*dz),n,dz)
x = matrix(runif(n*dx),n,dx)
y = sin( (z%*%alpha - A) * 3.1415926 * (B-A) ) + x%*%beta + sigma*matrix(rnorm(n),n,1)

Z = data.frame(z)
X = data.frame(x)

# Variable Selection Based on LASSO
res_varSel_LASSO = plsim.vs.hard(xdat=X,zdat=Z,ydat=y,h=0.1,Method="LASSO")


[Package PLSiMCpp version 1.0.4 Index]