lasso_cv {adapt4pv} R Documentation

Wrapper function for cv.glmnet


Runs a first cross-validation of a lasso regression and returns the selected covariates. Can handle very large sparse data matrices. Intended for binary responses only (the option family = "binomial" is forced). Depends on the cv.glmnet function from the glmnet package.


lasso_cv(x, y, nfolds = 5, foldid = NULL, betaPos = TRUE, ...)



x: Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inheriting from class "sparseMatrix" as in package Matrix).


y: Binary response variable, numeric.


nfolds: Number of folds. Default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds = 3.


foldid: An optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing.


betaPos: Should the covariates selected by the procedure be positively associated with the outcome? Default is TRUE.


...: Other arguments that can be passed to cv.glmnet from package glmnet, other than nfolds, foldid, and family.
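
As an illustration of the x and foldid arguments, the following sketch builds a sparse input matrix and a fold assignment vector. It assumes the Matrix package is available; the object names (x_sparse, foldid) and the simulated data are illustrative only, and the call to lasso_cv is left as a comment.

```r
library(Matrix)  # assumed available; provides the "sparseMatrix" classes

set.seed(1)
n <- 100
p <- 20
nfolds <- 5

# Sparse binary input matrix (its class inherits from "sparseMatrix")
x_sparse <- Matrix(rbinom(n * p, 1, 0.2), nrow = n, ncol = p, sparse = TRUE)
colnames(x_sparse) <- paste0("drugs", seq_len(p))

# Balanced fold assignment: one label in 1..nfolds per observation
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# With a binary response y of length n, these could then be passed as
# lasso_cv(x = x_sparse, y = y, foldid = foldid)
```

Fixing foldid this way keeps the fold assignment identical across repeated fits, which makes runs comparable.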


An object with S3 class "log.lasso".


Numeric vector of regression coefficients in the lasso. In the lasso_cv function, the regression coefficients are PENALIZED. Length equal to nvars.


Character vector, names of the variable(s) selected with the lasso-cv approach. If betaPos = TRUE, this set contains the covariates with a positive regression coefficient in beta. Otherwise, it contains the covariates with a nonzero regression coefficient in beta. Covariates are ordered by the absolute value of their regression coefficients.
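
The selection rule described above can be sketched in base R. The helper select_covariates and the coefficient values are hypothetical and are not the package's code; decreasing order of absolute value is an assumption, since the ordering direction is not stated here.

```r
# Hypothetical penalized coefficients, named by covariate (illustrative values)
beta <- c(drugs1 = 0.8, drugs2 = 0, drugs3 = -1.2, drugs4 = 0.3, drugs5 = 0)

# Hypothetical helper mirroring the rule described above
select_covariates <- function(beta, betaPos = TRUE) {
  # Keep positive coefficients if betaPos, otherwise all nonzero coefficients
  keep <- if (betaPos) beta > 0 else beta != 0
  kept <- beta[keep]
  # Assumption: order by decreasing absolute value of the coefficient
  names(kept)[order(abs(kept), decreasing = TRUE)]
}

select_covariates(beta, betaPos = TRUE)   # "drugs1" "drugs4"
select_covariates(beta, betaPos = FALSE)  # "drugs3" "drugs1" "drugs4"
```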


Emeline Courtois
Maintainer: Emeline Courtois


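# Simulated input matrix: 100 observations, 20 binary covariates
drugs <- matrix(rbinom(100*20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs",1:ncol(drugs))
# Simulated binary response
ae <- rbinom(100, 1, 0.3)
# Lasso with 3-fold cross-validation
lcv <- lasso_cv(x = drugs, y = ae, nfolds = 3)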
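
For comparison, the kind of computation that lasso_cv wraps can be sketched with a direct call to cv.glmnet. This is a rough sketch under assumptions, not the package's implementation: in particular, extracting coefficients at s = "lambda.min" is an assumption about which penalty value is used, and beta_hat is an illustrative name.

```r
library(glmnet)

set.seed(1)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", seq_len(ncol(drugs)))
ae <- rbinom(100, 1, 0.3)

# Cross-validated lasso for a binary response (family forced to "binomial")
cv_fit <- cv.glmnet(drugs, ae, family = "binomial", nfolds = 3)

# Penalized coefficients without the intercept
# (assumption: taken at the lambda minimizing the cross-validated error)
beta_hat <- as.matrix(coef(cv_fit, s = "lambda.min"))[-1, 1]

# Covariates with a positive coefficient, as with betaPos = TRUE
names(beta_hat)[beta_hat > 0]
```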

[Package adapt4pv version 0.2-1 Index]