R: Compute adaptive weights by fitting a SS-ANOVA model

SSANOVAwt {cosso}

R Documentation

Compute adaptive weights by fitting a SS-ANOVA model

Description

A preliminary estimate \tilde{\eta} is first obtained by fitting a smoothing spline ANOVA model. The inverse L_2-norm, ||\tilde{\eta}_j||^{-\gamma}, is then used as the initial weight for the j-th functional component.

Usage

 SSANOVAwt(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),mscale=rep(1,ncol(x)),
               gamma=1,scale=FALSE,nbasis,basis.id,cpus)

Arguments

`x`	input matrix; the number of rows is sample size, the number of columns is the data dimension. The range of input variables is scaled to [0,1] for continuous variables.
`y`	response vector. Quantitative for `family="Gaussian"` or `family="Quantile"`. For `family="Binomial"`, y should be a vector with two levels. For `family="Cox"`, y should be a two-column matrix (or data frame) with columns named 'time' and 'status'.
`tau`	the quantile to be estimated, a number strictly between 0 and 1. Argument required when `family="Quantile"`.
`family`	response type. Abbreviations are allowed.
`mscale`	scale parameter for the Gram matrix associated with each function component. Default is `rep(1,ncol(x))`
`gamma`	power of inverse `L_2`-norm. Default is `1`.
`scale`	if `TRUE`, continuous predictors will be rescaled to [0,1] interval. Default is `FALSE`.
`nbasis`	number of "knots" to be selected. Ignored when `basis.id` is provided.
`basis.id`	index designating selected "knots". Argument is not valid if `family="Quantile"`.
`cpus`	number of available processor units. Default is `1`. If `cpus`>=2, the task is parallelized using the "parallel" package. Recommended when either the sample size or the number of covariates is large. Argument is not valid if `family="Gaussian"` or `family="Binomial"`.
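The response formats described above can be sketched with simulated data. This is only an illustration: the object names (`y_gauss`, `y_binom`, `y_cox`, and so on) and the simulation design are assumptions, and the commented calls simply follow the Usage signature rather than showing verified output.

```r
## Illustrative inputs for each family (simulated data; all object names are hypothetical)
set.seed(1)
n <- 100
x <- cbind(rbinom(n, 1, 0.5), matrix(runif(n * 3), ncol = 3))

## Gaussian / Quantile: quantitative response vector
y_gauss <- x[, 1] + sin(2 * pi * x[, 2]) + rnorm(n)
tau <- 0.5   # needed only when family = "Quantile"; must lie strictly in (0, 1)

## Binomial: vector with two levels
y_binom <- rbinom(n, 1, 1 / (1 + exp(-y_gauss)))

## Cox: two-column matrix with columns named 'time' and 'status'
event_time  <- rexp(n, rate = exp(x[, 1]))
censor_time <- rexp(n, rate = 0.5)
y_cox <- cbind(time   = pmin(event_time, censor_time),
               status = as.numeric(event_time <= censor_time))

## Not run: calls following the Usage signature (require the cosso package)
# wt_q   <- SSANOVAwt(x, y_gauss, tau = tau, family = "Quantile")
# wt_cox <- SSANOVAwt(x, y_cox, family = "Cox")
```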

Details

The initial mean function is estimated via a smoothing spline objective function. In the SS-ANOVA model framework, the regression function is assumed to have an additive form

\eta(x)=b+\sum_{j=1}^p\eta_j(x^{(j)}),

where b denotes intercept and \eta_j denotes the main effect of the j-th covariate.

For "Gaussian" response, the mean regression function is estimated by minimizing the objective function:

\sum_i(y_i-\eta(x_i))^2/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2,

where the first term is the residual sum of squares (RSS) divided by the number of observations, nobs.

For "Binomial" response, the regression function is estimated by minimizing the objective function:

-log-likelihood/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.

For the "Quantile" regression model, the quantile function is estimated by minimizing the objective function:

\sum_i\rho(y_i-\eta(x_i))/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.
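The loss \rho is not defined on this page. Assuming it denotes the standard check (pinball) loss at quantile level \tau, \rho_\tau(r) = r(\tau - I(r<0)), a minimal sketch is given below; the function name `check_loss` is hypothetical and not part of the package.

```r
## Check (pinball) loss at quantile level tau (assumed form of rho)
check_loss <- function(r, tau) {
  stopifnot(tau > 0, tau < 1)
  r * (tau - (r < 0))   # weight tau on positive residuals, (1 - tau) on negative ones
}

r <- c(-2, -1, 0, 1, 2)
loss <- check_loss(r, tau = 0.25)
loss
## 1.50 0.75 0.00 0.25 0.50

## Data-fit term of the objective for residuals r: average check loss
mean(loss)
```

With \tau = 0.25, negative residuals are penalized three times as heavily as positive residuals of the same size, which is what pulls the fit toward the lower quartile.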

For the "Cox" regression model, the log-hazard function is estimated by minimizing the objective function:

-log-Partial Likelihood/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.

The smoothing parameter \lambda_0 is tuned by 5-fold cross-validation if family="Gaussian", "Binomial" or "Quantile", and by approximate cross-validation if family="Cox". The smoothing parameters \alpha_j are not tuned; they are supplied through the argument mscale.
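The idea of tuning a single smoothing parameter by 5-fold cross-validation can be sketched in base R. This is not the package's internal routine: it uses `smooth.spline` on one covariate as a stand-in smoother, and the grid, the data and all object names are assumptions made for illustration.

```r
## 5-fold cross-validation over a grid of smoothing parameters
## (stand-in smoother: stats::smooth.spline on a single covariate)
set.seed(2)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.5)

K <- 5
fold <- sample(rep(seq_len(K), length.out = n))   # random fold labels, 40 per fold
lambda_grid <- 10^seq(-6, -1, length.out = 6)

cv_error <- sapply(lambda_grid, function(lam) {
  fold_mse <- sapply(seq_len(K), function(k) {
    train <- fold != k
    fit   <- smooth.spline(x[train], y[train], lambda = lam)
    pred  <- predict(fit, x[!train])$y             # predictions on the held-out fold
    mean((y[!train] - pred)^2)
  })
  mean(fold_mse)                                   # average held-out error for this lambda
})

lambda_hat <- lambda_grid[which.min(cv_error)]     # value with the smallest CV error
```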

The adaptive weights are then given by ||\tilde{\eta}_j||^{-\gamma}.
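The weight formula can be illustrated as follows. The sketch assumes that the L_2-norm is approximated by the empirical norm of each fitted component over the sample, which may differ from the package's internal computation; the matrix `eta_tilde` holds made-up component values, not output from a fitted model.

```r
## Adaptive weights from (hypothetical) fitted component values
set.seed(3)
n <- 200
## One column per covariate: values of eta_tilde_j at the n data points
eta_tilde <- cbind(sin(2 * pi * runif(n)),   # strong effect
                   0.1 * rnorm(n),           # weak effect
                   2 * (runif(n) - 0.5))     # moderate effect

gamma   <- 1
l2_norm <- sqrt(colMeans(eta_tilde^2))       # empirical L2 norm per component (assumption)
wt      <- l2_norm^(-gamma)                  # inverse-norm adaptive weights
wt
```

Components with a small estimated norm receive a large weight and are therefore penalized more heavily in the subsequent adaptive COSSO fit; here the second (weak) component gets the largest weight.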

Value

`wt`	The adaptive weights.

Author(s)

Hao Helen Zhang and Chen-Yen Lin

References

Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679–705.

Examples

## Adaptive COSSO Model
## Binomial
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7))
trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2))
y=rbinom(200,1,trueProb)

Binomial.wt=SSANOVAwt(x,y,family="Bin")
ada.B.Obj=cosso(x,y,wt=Binomial.wt,family="Bin")

## Not run: 
## Gaussian
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),nc=7))
y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1)
Gaussian.wt=SSANOVAwt(x,y,family="Gau")
ada.G.Obj=cosso(x,y,wt=Gaussian.wt,family="Gaussian")

## End(Not run)

[Package cosso version 2.1-2 Index]