lasso_bic {adapt4pv} | R Documentation |
fit a lasso regression and use standard BIC for variable selection
Description
Fit a lasso regression and use the Bayesian Information Criterion (BIC)
to select a subset of covariates.
Can deal with very large sparse data matrices.
Intended for binary response only (option family = "binomial"
is forced).
Depends on the glmnet and relax.glmnet functions from the package glmnet.
Usage
lasso_bic(x, y, maxp = 50, path = TRUE, betaPos = TRUE, ...)
Arguments
x |
Input matrix, of dimension nobs x nvars. Each row is an observation
vector. Can be in sparse matrix format (inherit from class
|
y |
Binary response variable, numeric. |
maxp |
A limit on how many relaxed coefficients are allowed.
Default is 50, in |
path |
Since |
betaPos |
Should the covariates selected by the procedure be
positively associated with the outcome? Default is |
... |
Other arguments that can be passed to |
Details
For each tested penalisation parameter $\lambda$, a standard version of the BIC
is implemented:

$$\mathrm{BIC}_\lambda = -2\, l_\lambda + df(\lambda) \ln(N)$$

where $l_\lambda$ is the log-likelihood of the non-penalised multiple logistic
regression model that includes the set of covariates with a non-zero coefficient
in the penalised regression coefficient vector associated with $\lambda$,
$df(\lambda)$ is the number of covariates with a non-zero coefficient
in the penalised regression coefficient vector associated with $\lambda$,
and $N$ is the number of observations.
The optimal set of covariates according to this approach is the one associated with
the classical multiple logistic regression model which minimizes the BIC.
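The criterion described above can be sketched in base R. This is only an illustration of the formula, not the package's internal code: bic_for_subset() is a hypothetical helper, and the candidate covariate sets are made up here, whereas lasso_bic would take them from the non-zero coefficients found along the lasso path.

```r
# Hypothetical helper (not part of adapt4pv): BIC of the unpenalised
# logistic regression restricted to the covariates named in `vars`.
bic_for_subset <- function(x, y, vars) {
  n <- length(y)
  dat <- data.frame(y = y, x[, vars, drop = FALSE])
  fit <- glm(y ~ ., data = dat, family = binomial())
  # df counts covariates only (intercept excluded), as in the definition above
  -2 * as.numeric(logLik(fit)) + length(vars) * log(n)
}

# Simulated data, same layout as in the Examples section
set.seed(15)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
ae <- rbinom(100, 1, 0.3)

# Made-up candidate sets standing in for the supports along a lasso path
candidates <- list(
  c("drugs1"),
  c("drugs1", "drugs2"),
  c("drugs1", "drugs2", "drugs3")
)

# One BIC per candidate set; the retained set is the one with the smallest BIC
bics <- sapply(candidates, function(v) bic_for_subset(drugs, ae, v))
best <- candidates[[which.min(bics)]]
```

Note that stats::BIC() applied to a glm fit also counts the intercept as a parameter, so its value differs from this sketch by log(N); the ranking of candidate sets is unaffected.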
Value
An object with S3 class "log.lasso".
beta |
Numeric vector of regression coefficients in the lasso.
In |
selected_variables |
Character vector, names of variable(s) selected with the
lasso-bic approach.
If |
Author(s)
Emeline Courtois
Maintainer: Emeline Courtois
emeline.courtois@inserm.fr
Examples
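set.seed(15)
# Simulate 100 observations of 20 binary covariates (drug exposures)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
# Simulate a binary outcome
ae <- rbinom(100, 1, 0.3)
# Fit the lasso and select covariates with the BIC
lb <- lasso_bic(x = drugs, y = ae, maxp = 20)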
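The fitted object can then be inspected through the components listed under Value. The following is a minimal sketch that assumes the adapt4pv package is installed (the block does nothing otherwise) and relies only on the class name and the element names documented above.

```r
# Runs only if adapt4pv is available; otherwise the block is skipped.
if (requireNamespace("adapt4pv", quietly = TRUE)) {
  set.seed(15)
  drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
  colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
  ae <- rbinom(100, 1, 0.3)

  lb <- adapt4pv::lasso_bic(x = drugs, y = ae, maxp = 20)

  print(class(lb))              # documented S3 class: "log.lasso"
  print(lb$selected_variables)  # names of the covariates retained by the BIC
  print(lb$beta)                # regression coefficients
}
```

With purely random data such as this, the selected set may well be empty.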