msgps {msgps}

R Documentation

msgps (Degrees of Freedom of Elastic Net, Adaptive Lasso and Generalized Elastic Net)

Description

This package computes the degrees of freedom of the lasso, elastic net, generalized elastic net and adaptive lasso based on the generalized path seeking algorithm. The optimal model can be selected by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc), generalized cross validation (GCV) and BIC.
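
As a rough illustration of how such criteria trade goodness of fit against degrees of freedom, the sketch below computes common textbook forms of the four criteria from a residual sum of squares. The helper `selection_criteria` is hypothetical and not part of msgps; the package's own definitions may differ by additive constants or in how the degrees of freedom are counted.

```r
# Textbook forms of the four criteria for a Gaussian linear model
# (assumed forms; not taken from the msgps source code).
# rss: residual sum of squares, df: degrees of freedom,
# n: sample size, tau2: error-variance estimate (used by Cp only).
selection_criteria <- function(rss, df, n, tau2) {
  c(
    Cp   = rss / tau2 - n + 2 * df,
    AICc = n * log(rss / n) + 2 * n * df / (n - df - 1),
    GCV  = (rss / n) / (1 - df / n)^2,
    BIC  = n * log(rss / n) + log(n) * df
  )
}

# Check on an ordinary least squares fit, where df is the number of coefficients.
set.seed(1)
n <- 100
p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- c(X %*% c(2, 0, -1) + rnorm(n))
ols  <- lm(y ~ X - 1)              # no intercept, so df = p
rss  <- sum(residuals(ols)^2)
tau2 <- rss / (n - p)              # unbiased error-variance estimate
crit <- selection_criteria(rss, df = p, n = n, tau2 = tau2)
print(crit)
```

With `tau2` taken from the same full model, this form of Cp equals the number of coefficients, which is a convenient sanity check. For a penalized fit, `df` would be replaced by the degrees of freedom estimated along the solution path.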

Usage

msgps(X,y,penalty="enet", alpha=0, gamma=1, lambda=0.001, tau2, STEP=20000, 
STEP.max=200000,  DFtype="MODIFIED",  p.max=300, intercept=TRUE, stand.coef=FALSE)

Arguments

`X`	predictor matrix
`y`	response vector
`penalty`	The penalty term. `"enet"` is the elastic net: `\alpha/2\|\beta\|_2^2+(1-\alpha)\|\beta\|_1`. Note that `alpha=0` gives the lasso penalty. `"genet"` is the generalized elastic net: `\log(\alpha+(1-\alpha)\|\beta\|_1)`. `"alasso"` is the adaptive lasso, a weighted version of the lasso given by `\sum_i w_i|\beta_i|`, where `w_i = 1/|\hat{\beta}_i|^{\gamma}`. Here `\gamma>0` is a tuning parameter, and `\hat{\beta}_i` is the ridge estimate with regularization parameter `\lambda \ge 0`.
`alpha`	The value of `\alpha` for the `"enet"` and `"genet"` penalties.
`gamma`	The value of `\gamma` for the `"alasso"` penalty.
`lambda`	The regularization parameter `\lambda \ge 0` of the ridge regression used to compute the weight vector for the `"alasso"` penalty. When `lambda=0`, the ridge estimates are the ordinary least squares estimates.
`tau2`	Estimator of the error variance for Mallows' Cp. The default is the unbiased estimator of the error variance of the most complex model. When that estimator is not available (e.g., the number of variables exceeds the number of samples), `tau2` is the variance of the response vector.
`STEP`	The approximate number of steps.
`STEP.max`	The number of steps in this algorithm can often exceed `STEP`. When the number of steps exceeds `STEP.max`, this algorithm stops.
`DFtype`	`"MODIFIED"` or `"NAIVE"`. The `"MODIFIED"` update is much more efficient than the `"NAIVE"` update.
`p.max`	If the number of selected variables exceeds `p.max`, the algorithm stops.
`intercept`	When `intercept` is `TRUE`, the intercept is included in the result.
`stand.coef`	When `stand.coef` is `TRUE`, the standardized coefficients are displayed.
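
The penalty formulas and the ridge-based adaptive lasso weights described above can be sketched in base R as follows. All function names here are hypothetical helpers, not part of msgps, and the sketch assumes no standardization of `X` and no rescaling of `lambda`, which may differ from what the package does internally.

```r
# Elastic net penalty: alpha/2 * ||beta||_2^2 + (1 - alpha) * ||beta||_1
enet_penalty <- function(beta, alpha) {
  alpha / 2 * sum(beta^2) + (1 - alpha) * sum(abs(beta))
}

# Generalized elastic net penalty, as written in the argument description:
# log(alpha + (1 - alpha) * ||beta||_1)
genet_penalty <- function(beta, alpha) {
  log(alpha + (1 - alpha) * sum(abs(beta)))
}

# Ridge estimate (X'X + lambda * I)^{-1} X'y; lambda = 0 gives least squares.
ridge_estimate <- function(X, y, lambda) {
  drop(solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y)))
}

# Adaptive lasso weights w_i = 1 / |beta_hat_i|^gamma from the ridge estimate.
alasso_weights <- function(X, y, gamma = 1, lambda = 0.001) {
  1 / abs(ridge_estimate(X, y, lambda))^gamma
}

# Adaptive lasso penalty: sum_i w_i * |beta_i|
alasso_penalty <- function(beta, w) sum(w * abs(beta))

# Small illustration on simulated data.
set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)
beta <- c(3, 0, 2)
y <- c(X %*% beta + rnorm(100))

enet_penalty(beta, alpha = 0)      # lasso case: 5
enet_penalty(beta, alpha = 0.5)    # 5.75
w <- alasso_weights(X, y, gamma = 1, lambda = 0)
alasso_penalty(beta, w)
```

Coefficients whose ridge estimate is close to zero receive large weights, so the adaptive lasso penalizes them more heavily than coefficients with large initial estimates.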

Author(s)

Kei Hirose
mail@keihirose.com

References

Friedman, J. (2008). Fast sparse regression and classification. Technical report, Stanford University.
Hirose, K., Tateishi, S. and Konishi, S. (2011). Efficient algorithm to select tuning parameters in sparse regression modeling with regularization. arXiv:1109.2411.

Examples

library(msgps)

#data (simulated: 100 observations, 8 predictors)
X <- matrix(rnorm(100*8),100,8)
beta0 <- c(3,1.5,0,0,2,0,0,0)
epsilon <- rnorm(100,sd=3)
y <- X %*% beta0 + epsilon
y <- c(y)

#lasso
fit <- msgps(X,y)
summary(fit) 
coef(fit) #extract coefficients at t selected by model selection criteria
coef(fit,c(0, 0.5, 2.5)) #extract coefficients at some values of t
predict(fit,X[1:10,]) #predict values at t selected by model selection criteria
predict(fit,X[1:10,],c(0, 0.5, 2.5)) #predict values at some values of t
plot(fit,criterion="cp") #plot the solution path with a model selected by Cp criterion

#elastic net
fit2 <- msgps(X,y,penalty="enet",alpha=0.5)
summary(fit2) 

#generalized elastic net
fit3 <- msgps(X,y,penalty="genet",alpha=0.5)
summary(fit3)

#adaptive lasso
fit4 <- msgps(X,y,penalty="alasso",gamma=1,lambda=0)
summary(fit4)

[Package msgps version 1.3.5 Index]