sdwd {sdwd} | R Documentation |
Fit the sparse DWD
Description
Fits the sparse distance weighted discrimination (SDWD) model with an L1, elastic-net, or adaptive elastic-net penalty. The solution path is computed over a grid of values of the tuning parameter lambda. This function is adapted from the glmnet and gcdnet packages.
Usage
sdwd(x, y, nlambda=100,
lambda.factor=ifelse(nobs < nvars, 0.01, 1e-04),
lambda=NULL, lambda2=0, pf=rep(1, nvars),
pf2=rep(1, nvars), exclude, dfmax=nvars + 1,
pmax=min(dfmax * 1.2, nvars), standardize=TRUE,
eps=1e-8, maxit=1e6, strong=TRUE)
Arguments
x |
A matrix with |
y |
A vector of length |
nlambda |
The number of |
lambda.factor |
The ratio of the smallest to the largest |
lambda |
An optional user-supplied |
lambda2 |
The L2 tuning parameter |
pf |
A vector of length |
pf2 |
A vector of length |
exclude |
Whether to exclude some predictors from the model. This is equivalent to adopting an infinite penalty factor when excluding some predictor. Default is none. |
dfmax |
Restricts at most how many predictors can be incorporated in the model. Default is |
pmax |
Restricts the maximum number of variables ever to be nonzero; e.g., once some |
standardize |
Whether to standardize the data. If |
eps |
The algorithm stops when (i.e. |
maxit |
Restricts how many outer-loop iterations are allowed. Default is 1e6. Consider increasing |
strong |
If |
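The way nlambda and lambda.factor shape the lambda grid can be sketched in base R. This is only an illustration: lambda_grid is a hypothetical helper, not a function in sdwd, and it assumes the log-spaced layout used by glmnet-style solvers. The largest value, lambda_max, is taken as given here rather than computed from the data.

```r
# Hypothetical helper, not part of sdwd: a log-spaced grid of nlambda values
# running from lambda_max down to lambda.factor * lambda_max.
lambda_grid <- function(lambda_max, nlambda = 100, lambda.factor = 1e-4) {
  exp(seq(log(lambda_max), log(lambda_max * lambda.factor),
          length.out = nlambda))
}

# Default ratio from the Usage section: 0.01 when nobs < nvars, else 1e-04.
nobs <- 62; nvars <- 2000          # illustrative sizes only
ratio <- ifelse(nobs < nvars, 0.01, 1e-04)
grid <- lambda_grid(lambda_max = 1, nlambda = 100, lambda.factor = ratio)
range(grid)
```

A larger lambda.factor shortens the path at the small-lambda end, which is typically where the fits are least sparse and slowest to compute.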
Details
The sdwd function minimizes the sparse penalized DWD loss function,
L(y, X, \beta)/N + \lambda_1||\beta||_1 + 0.5\lambda_2||\beta||_2^2,
where L(u)=1-u if u \le 1/2 and L(u)=1/(4u) if u > 1/2 is the DWD loss. The value of lambda2 is user-specified.
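The loss and objective described above can be written out in base R as a rough check on the formula. The names dwd_loss and sdwd_objective are hypothetical and are not exported by sdwd. The sketch assumes that y is coded as -1/+1, that the loss is averaged over the margins y_i * (b0 + x_i' beta), and that all penalty factors equal 1; the page itself does not spell these details out.

```r
# DWD loss as defined above: 1 - u for u <= 1/2, 1/(4u) for u > 1/2.
dwd_loss <- function(u) {
  ifelse(u <= 0.5, 1 - u, 1 / (4 * u))
}

# Penalized objective for a given intercept b0 and coefficient vector beta.
# Assumption: y is coded as -1/+1 and the loss is averaged over the margins.
sdwd_objective <- function(x, y, b0, beta, lambda1, lambda2) {
  margin <- y * (b0 + as.vector(x %*% beta))
  mean(dwd_loss(margin)) +
    lambda1 * sum(abs(beta)) +
    0.5 * lambda2 * sum(beta^2)
}

# Tiny worked example with two observations and two predictors.
x <- diag(2)
y <- c(1, -1)
sdwd_objective(x, y, b0 = 0, beta = c(1, 1), lambda1 = 0.1, lambda2 = 1)
```

The two branches of the loss meet at u = 1/2, where both equal 1/2, so the loss is continuous there.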
To use the L1 penalty (lasso), set lambda2=0. To use the elastic net, set lambda2 to a nonzero value. To use the adaptive L1, set lambda2=0 and specify pf and pf2. To use the adaptive elastic net, set lambda2 to a nonzero value and specify pf and pf2 as well.
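The four penalty settings above might be set up as follows. This is a sketch under stated assumptions: adaptive_weights is a hypothetical helper, and building the weights from an initial elastic-net fit is one common choice for adaptive penalties, not something this page prescribes. pf2 is left at its default of rep(1, nvars). The sdwd calls run only if the package is installed.

```r
# Hypothetical helper: adaptive weights from an initial coefficient vector.
# Larger initial coefficients get smaller penalty factors.
adaptive_weights <- function(beta_init, n) {
  1 / (abs(beta_init) + 1 / n)
}

if (requireNamespace("sdwd", quietly = TRUE)) {
  library(sdwd)
  data(colon)
  n <- nrow(colon$x)

  fit_lasso <- sdwd(colon$x, colon$y, lambda2 = 0)   # L1 penalty
  fit_enet  <- sdwd(colon$x, colon$y, lambda2 = 1)   # elastic net

  # Initial estimate: elastic-net coefficients at the last lambda on the path.
  beta_init <- as.numeric(fit_enet$beta[, ncol(fit_enet$beta)])
  w <- adaptive_weights(beta_init, n)

  fit_alasso <- sdwd(colon$x, colon$y, lambda2 = 0, pf = w)  # adaptive L1
  fit_aenet  <- sdwd(colon$x, colon$y, lambda2 = 1, pf = w)  # adaptive elastic net
}
```

The 1/n term keeps the weights finite for predictors whose initial coefficient is exactly zero.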
When the algorithm does not converge or runs slowly, consider increasing eps, decreasing nlambda, or increasing lambda.factor before increasing maxit.
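As an illustration of the advice above, a slow fit could be retried with a looser tolerance, a shorter path, and a larger lambda.factor. The specific values below are arbitrary choices for the sketch, not recommendations from the package authors, and the sdwd call runs only if the package is installed.

```r
# Relaxed settings compared with the defaults
# (eps = 1e-8, nlambda = 100, lambda.factor = 0.01 when nobs < nvars).
eps_relaxed    <- 1e-6   # looser convergence threshold
nlambda_short  <- 50     # fewer lambda values on the path
factor_relaxed <- 0.05   # stop the path at a larger smallest lambda

if (requireNamespace("sdwd", quietly = TRUE)) {
  library(sdwd)
  data(colon)
  fit_fast <- sdwd(colon$x, colon$y, lambda2 = 1,
                   eps = eps_relaxed, nlambda = nlambda_short,
                   lambda.factor = factor_relaxed)
  fit_fast$jerr      # 0 if no error, per the Value section
  fit_fast$npasses   # total iterations over all lambda values
}
```

Comparing npasses before and after such a change gives a rough sense of how much work the relaxed settings save.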
Value
An object with S3 class sdwd.
b0 |
A vector of length |
beta |
A matrix of dimension |
df |
The number of nonzero coefficients at each |
dim |
The dimension of coefficient matrix, i.e., |
lambda |
The |
npasses |
Total number of iterations for all lambda values. |
jerr |
Warnings and errors; 0 if no error. |
call |
The call that produced this object. |
Author(s)
Boxiang Wang and Hui Zou
Maintainer: Boxiang Wang <boxiang-wang@uiowa.edu>
References
Wang, B. and Zou, H. (2016)
"Sparse Distance Weighted Discrimination", Journal of Computational and Graphical Statistics, 25(3), 826–838.
https://www.tandfonline.com/doi/full/10.1080/10618600.2015.1049700
Friedman, J., Hastie, T., and Tibshirani, R. (2010), "Regularization paths for generalized
linear models via coordinate descent", Journal of Statistical Software, 33(1), 1–22.
https://www.jstatsoft.org/v33/i01/paper
Marron, J.S., Todd, M.J., and Ahn, J. (2007)
"Distance-Weighted Discrimination",
Journal of the American Statistical Association, 102(480), 1267–1271.
https://www.tandfonline.com/doi/abs/10.1198/016214507000001120
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon,
N., Taylor, J., and Tibshirani, Ryan (2012)
"Strong Rules for Discarding Predictors in Lasso-type Problems",
Journal of the Royal Statistical Society, Series B, 74(2), 245–266.
https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9868.2011.01004.x
Yang, Y. and Zou, H. (2013)
"An Efficient Algorithm for Computing the HHSVM and Its Generalizations",
Journal of Computational and Graphical Statistics, 22(2), 396–415.
https://www.tandfonline.com/doi/full/10.1080/10618600.2012.680324
See Also
print.sdwd, predict.sdwd, coef.sdwd, plot.sdwd, and cv.sdwd.
Examples
# load the data
data(colon)
# fit the elastic-net penalized DWD with lambda2=1
fit = sdwd(colon$x, colon$y, lambda2=1)
print(fit)
# coefficients at some lambda value
c1 = coef(fit, s=0.005)
# make predictions
predict(fit, newx=colon$x[1:10, ], s=c(0.01, 0.005))