R: cross-validation for the sparse DWD

cv.sdwd {sdwd}

R Documentation

cross-validation for the sparse DWD

Description

Conducts a k-fold cross-validation for sdwd and returns the suggested values of the L1 parameter lambda.

Usage

cv.sdwd(x, y, lambda = NULL, pred.loss = c("misclass", "loss"), nfolds = 5, foldid, ...)

Arguments

`x`	A matrix of predictors, i.e., the `x` matrix used in `sdwd`.
`y`	A vector of binary class labels, i.e., the `y` used in `sdwd`.
`lambda`	Default is `NULL`, in which case the sequence generated by `sdwd` is used. Users can also supply their own `lambda` sequence for cross-validation.
`pred.loss`	`misclass` for classification error, `loss` for DWD loss.
`nfolds`	The number of folds. Default value is 5. The allowable range is from 3 to the sample size. Larger values of `nfolds` require more computation time.
`foldid`	An optional vector with values between 1 and `nfolds`, giving the fold index of each observation. If supplied, `nfolds` can be omitted.
`...`	Other arguments that can be passed to `sdwd`.
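
As a minimal sketch of how a `foldid` vector could be built by hand, the snippet below assigns each observation to one of `nfolds` folds of nearly equal size. It uses a common base-R idiom and is not taken from the package source; the values of `n` and `nfolds` are illustrative assumptions.

```r
# Illustrative values, not taken from the package: 62 observations, 5 folds.
n <- 62
nfolds <- 5

set.seed(1)  # make the random assignment reproducible
# Repeat 1..nfolds until there are n labels, then shuffle them, so that
# fold sizes differ by at most one observation.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)  # fold sizes
```

A vector like this could then be passed as `cv.sdwd(x, y, foldid = foldid)`, which keeps the fold assignment fixed across repeated calls, for example when comparing fits that differ only in arguments passed on to `sdwd`.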

Details

This function fits the sparse DWD with sdwd, leaving out each fold in turn, and then computes the mean cross-validation error and its estimated standard error. It is adapted from the cross-validation functions in the gcdnet and glmnet packages.
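
The procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the data are simulated, and a simple nearest-centroid rule (the hypothetical helpers `fit_centroid` and `predict_centroid`) stands in for sdwd so that the block runs without any package.

```r
# Toy data (hypothetical): two Gaussian classes with labels -1 / +1.
set.seed(1)
n <- 60
x <- matrix(rnorm(n * 4), n, 4)
y <- rep(c(-1, 1), each = n / 2)
x[y == 1, ] <- x[y == 1, ] + 1   # shift the positive class

nfolds <- 5
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Stand-in classifier (NOT sdwd): assign each row to the nearer class centroid.
fit_centroid <- function(x, y) {
  list(neg = colMeans(x[y == -1, , drop = FALSE]),
       pos = colMeans(x[y ==  1, , drop = FALSE]))
}
predict_centroid <- function(fit, newx) {
  d_neg <- rowSums(sweep(newx, 2, fit$neg)^2)  # squared distance to each centroid
  d_pos <- rowSums(sweep(newx, 2, fit$pos)^2)
  ifelse(d_pos < d_neg, 1, -1)
}

# Hold out each fold in turn, fit on the rest, record per-observation errors.
err <- numeric(n)
for (k in seq_len(nfolds)) {
  test <- foldid == k
  fit <- fit_centroid(x[!test, , drop = FALSE], y[!test])
  err[test] <- predict_centroid(fit, x[test, , drop = FALSE]) != y[test]
}

cvm  <- mean(err)           # mean cross-validated misclassification error
cvsd <- sd(err) / sqrt(n)   # one common estimate of its standard error
```

In `cv.sdwd` the corresponding quantities are reported once per value of `lambda`, which is why `cvm` and `cvsd` in the returned object are vectors of length `length(lambda)`.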

Value

A cv.sdwd object is returned, which includes the cross-validation fit.

`lambda`	The `lambda` sequence used in `sdwd`.
`cvm`	A vector of length `length(lambda)` for the mean cross-validated error.
`cvsd`	A vector of length `length(lambda)` for estimates of standard error of `cvm`.
`cvupper`	The upper curve: `cvm + cvsd`.
`cvlower`	The lower curve: `cvm - cvsd`.
`nzero`	The number of non-zero coefficients at each `lambda`.
`name`	"Mis-classification error", for plotting purposes.
`sdwd.fit`	A fitted `sdwd` object using the full data.
`lambda.min`	The `lambda` giving the minimum cross-validation error `cvm`.
`lambda.1se`	The largest value of `lambda` such that the error is within one standard error of the minimum.
`cv.min`	The minimum cross-validation error.
`cv.1se`	The cross-validation error associated with `lambda.1se`.
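
To make the definitions of `lambda.min` and `lambda.1se` concrete, the following sketch applies them to made-up cross-validation results. The numbers are hypothetical, and the selection logic follows the rule as worded above rather than the package source.

```r
# Hypothetical cross-validation results for a decreasing lambda sequence.
lambda <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cvm    <- c(0.40, 0.18, 0.15, 0.17, 0.20)
cvsd   <- c(0.05, 0.04, 0.04, 0.04, 0.05)

# lambda.min: the lambda with the smallest mean CV error
# (the largest such lambda if several tie).
cv.min     <- min(cvm)
lambda.min <- max(lambda[cvm <= cv.min])

# lambda.1se: the largest lambda whose error is no more than one
# standard error above the minimum.
idmin      <- match(lambda.min, lambda)
threshold  <- cvm[idmin] + cvsd[idmin]
lambda.1se <- max(lambda[cvm <= threshold])
cv.1se     <- cvm[match(lambda.1se, lambda)]
```

With these numbers `lambda.min` is 0.25 and `lambda.1se` is 0.5: the larger penalty gives a sparser model whose error (0.18) still lies within one standard error of the minimum (0.15 + 0.04).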

Author(s)

Boxiang Wang and Hui Zou
Maintainer: Boxiang Wang <boxiang-wang@uiowa.edu>

References

Wang, B. and Zou, H. (2016) "Sparse Distance Weighted Discrimination", Journal of Computational and Graphical Statistics, 25(3), 826–838.
https://www.tandfonline.com/doi/full/10.1080/10618600.2015.1049700

Yang, Y. and Zou, H. (2013) "An Efficient Algorithm for Computing the HHSVM and Its Generalizations", Journal of Computational and Graphical Statistics, 22(2), 396–415.
https://www.tandfonline.com/doi/full/10.1080/10618600.2012.680324

Friedman, J., Hastie, T., and Tibshirani, R. (2010) "Regularization Paths for Generalized Linear Models via Coordinate Descent", Journal of Statistical Software, 33(1), 1–22.
https://www.jstatsoft.org/v33/i01/paper

Examples

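library(sdwd)
data(colon)
colon$x <- colon$x[ , 1:100] # this example only uses the first 100 columns
n <- nrow(colon$x)
set.seed(1)
id <- sample(n, trunc(n/3)) # hold out about one third of the rows as a test set
# cross-validate on the training rows; lambda2 is passed on to sdwd
cvfit <- cv.sdwd(colon$x[-id, ], colon$y[-id], lambda2=1, nfolds=5)
plot(cvfit)
# predict the held-out rows at the lambda with minimum CV error
predict(cvfit, newx=colon$x[id, ], s="lambda.min")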

[Package sdwd version 1.0.5 Index]