SSANOVAwt {cosso} R Documentation

Compute adaptive weights by fitting an SS-ANOVA model

Description

A preliminary estimate \tilde{\eta} is first obtained by fitting a smoothing spline ANOVA model; the inverse L_2-norm, ||\tilde{\eta}_j||^{-\gamma}, is then used as the initial weight for the j-th functional component.

Usage

 SSANOVAwt(x,y,tau,family=c("Gaussian","Binomial","Cox","Quantile"),mscale=rep(1,ncol(x)),
gamma=1,scale=FALSE,nbasis,basis.id,cpus) 

Arguments

 x input matrix; the number of rows is the sample size and the number of columns is the data dimension. The range of continuous input variables is scaled to [0,1].

 y response vector. Quantitative for family="Gaussian" or family="Quantile". For family="Binomial", y should be a vector with two levels. For family="Cox", y should be a two-column matrix (or data frame) with columns named 'time' and 'status'.

 tau the quantile to be estimated, a number strictly between 0 and 1. Argument required when family="Quantile".

 family response type. Abbreviations are allowed.

 mscale scale parameter for the Gram matrix associated with each function component. Default is rep(1,ncol(x)).

 gamma power of the inverse L_2-norm. Default is 1.

 scale if TRUE, continuous predictors will be rescaled to the [0,1] interval. Default is FALSE.

 nbasis number of "knots" to be selected. Ignored when basis.id is provided.

 basis.id index designating selected "knots". Argument is not valid if family="Quantile".

 cpus number of available processor units. Default is 1. If cpus>=2, the task is parallelized using the "parallel" package. Recommended when either the sample size or the number of covariates is large. Argument is not valid if family="Gaussian" or family="Binomial".
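As a minimal sketch of the input shapes described above, the following base R code rescales continuous predictors to [0,1] and builds a two-column response of the kind expected for family="Cox". The data are simulated, the names x_raw, rescale01, x01 and y_cox are hypothetical, and min-max rescaling is an assumption about what "scaled to [0,1]" means here. The code only prepares inputs and does not call SSANOVAwt.

```r
set.seed(1)
n <- 50

# Hypothetical raw predictors on arbitrary scales (illustrative data only)
x_raw <- cbind(rnorm(n, 10, 3), runif(n, -5, 5))

# Min-max rescaling of each column to [0,1] (assumed form of the scaling)
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
x01 <- apply(x_raw, 2, rescale01)

# Two-column response for family="Cox", with columns named 'time' and 'status'
y_cox <- data.frame(time = rexp(n, rate = 0.1), status = rbinom(n, 1, 0.8))

range(x01)
colnames(y_cox)
```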

Details

The initial mean function is estimated via a smoothing spline objective function. In the SS-ANOVA model framework, the regression function is assumed to have an additive form

\eta(x)=b+\sum_{j=1}^p\eta_j(x^{(j)}),

where b denotes the intercept and \eta_j denotes the main effect of the j-th covariate.

For "Gaussian" response, the mean regression function is estimated by minimizing the objective function:

\sum_i(y_i-\eta(x_i))^2/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2,

where the first term is the residual sum of squares (RSS) divided by the number of observations, nobs.

For "Binomial" response, the regression function is estimated by minimizing the objective function:

-log-likelihood/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.

For the "Quantile" regression model, the quantile function is estimated by minimizing the objective function:

\sum_i\rho(y_i-\eta(x_i))/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.
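The page does not define \rho in the objective above. Assuming it is the usual check loss of quantile regression at level tau, \rho_\tau(r) = r(\tau - I(r<0)), a minimal base R sketch is given below. The function name check_loss is hypothetical and is not part of the cosso package.

```r
# Check loss rho_tau(r) = r * (tau - I(r < 0)); assumed form, not taken from cosso
check_loss <- function(r, tau) r * (tau - (r < 0))

# Negative residuals are weighted by (1 - tau), positive residuals by tau
check_loss(c(-2, 0, 2), tau = 0.25)
```

With tau = 0.25, under-predictions (positive residuals) are penalized less than over-predictions of the same size, which is what pulls the fit toward the lower quartile.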

For the "Cox" regression model, the log-hazard function is estimated by minimizing the objective function:

-log-Partial Likelihood/nobs+\lambda_0\sum_{j=1}^p\alpha_j||\eta_j||^2.

The smoothing parameter \lambda_0 is tuned by 5-fold cross-validation if family is "Gaussian", "Binomial" or "Quantile", and by approximate cross-validation if family="Cox". The smoothing parameters \alpha_j are not tuned; they are supplied through the argument mscale.

The adaptive weights are then given by ||\tilde{\eta}_j||^{-\gamma}.
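To illustrate the weight formula, here is a rough base R sketch that turns component norms into weights. It assumes the L_2-norm is approximated by the root mean square of the fitted component values at the observed points; how the package evaluates the norm internally is not stated here. The matrix eta_tilde holds made-up component values standing in for a preliminary SS-ANOVA fit, and all object names are hypothetical.

```r
set.seed(1)
n <- 100
p <- 3
gamma <- 1
x <- matrix(runif(n * p), n, p)

# Hypothetical fitted component values eta_j(x_ij), one column per covariate;
# in practice these would come from the preliminary SS-ANOVA fit
eta_tilde <- cbind(sin(2 * pi * x[, 1]),
                   0.1 * (x[, 2] - 0.5),
                   2 * (x[, 3] - 0.5)^2)

# Empirical L2 norm: square root of the mean squared component value
# (an assumed approximation of the functional norm)
l2norm <- sqrt(colMeans(eta_tilde^2))

# Adaptive weights: inverse norm raised to the power gamma
wt <- l2norm^(-gamma)
wt
```

Components with a small estimated norm, such as the second one here, receive a large weight and are therefore penalized more heavily in a subsequent adaptive fit.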

Value

 wt The adaptive weights.

Author(s)

Hao Helen Zhang and Chen-Yen Lin

References

Storlie, C. B., Bondell, H. D., Reich, B. J. and Zhang, H. H. (2011) "Surface Estimation, Variable Selection, and the Nonparametric Oracle Property", Statistica Sinica, 21, 679–705.

Examples

## Adaptive COSSO Model
## Binomial
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),ncol=7))
trueProb=1/(1+exp(-x[,1]-sin(2*pi*x[,2])-5*(x[,4]-0.4)^2))
y=rbinom(200,1,trueProb)

Binomial.wt=SSANOVAwt(x,y,family="Bin")

## Not run:
## Gaussian
set.seed(20130310)
x=cbind(rbinom(200,1,.7),matrix(runif(200*7,0,1),ncol=7))
y=x[,1]+sin(2*pi*x[,2])+5*(x[,4]-0.4)^2+rnorm(200,0,1)
## Use the x and y simulated above
Gaussian.wt=SSANOVAwt(x,y,family="Gau")