adapt_bic {adapt4pv}    R Documentation

Fit an adaptive lasso with adaptive weights derived from lasso-bic

Description

Fit a first lasso regression and use the Bayesian Information Criterion (BIC) to determine adaptive weights (see the lasso_bic function for more details), then run an adaptive lasso with this penalty weighting. BIC is also used for variable selection in the adaptive lasso. Can deal with very large sparse data matrices. Intended for binary response only (option family = "binomial" is forced). Depends on the glmnet and relax.glmnet functions from the package glmnet.
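As a reminder of what the criterion measures, the sketch below computes the BIC of an ordinary (unpenalized) logistic regression by hand and compares it with stats::BIC. It only illustrates the general formula BIC = -2 * log-likelihood + k * log(n); the candidate subset is hypothetical, and nothing here is taken from the internal code of adapt_bic.

```r
# Simulated data, same shape as in the Examples section
set.seed(15)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
ae <- rbinom(100, 1, 0.3)

# Unpenalized logistic regression on a hypothetical candidate subset
candidate <- c("drugs1", "drugs2")
dat <- data.frame(ae = ae, drugs[, candidate, drop = FALSE])
fit <- glm(ae ~ ., data = dat, family = binomial)

# BIC = -2 * log-likelihood + (number of parameters) * log(n)
ll <- logLik(fit)
bic_manual <- -2 * as.numeric(ll) + attr(ll, "df") * log(nobs(fit))
bic_builtin <- BIC(fit)
```

Comparing such values across candidate subsets and keeping the smallest is the usual way a BIC-based selection is read.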

Usage

adapt_bic(x, y, gamma = 1, maxp = 50, path = TRUE, betaPos = TRUE, ...)

Arguments

x

Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inherit from class "sparseMatrix" as in package Matrix).

y

Binary response variable, numeric.

gamma

Tuning parameter used to define the penalty weights. See Details below. Default is 1.

maxp

A limit on how many relaxed coefficients are allowed. Default is 50; the default of the corresponding glmnet option is 'n-3', where 'n' is the sample size.

path

Since glmnet does not do stepsize optimization, the Newton algorithm can get stuck and not converge, especially with relaxed fits. With path=TRUE, each relaxed fit on a particular set of variables is computed pathwise using the original sequence of lambda values (with a zero attached to the end). Default is path=TRUE.

betaPos

Should the covariates selected by the procedure be positively associated with the outcome? Default is TRUE.

...

Other arguments that can be passed to glmnet from package glmnet other than penalty.factor, family, maxp and path.
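The x argument may be given in sparse format. A minimal sketch of one way to build such a matrix, assuming the Matrix package is installed; the variable names are illustrative only.

```r
library(Matrix)

# Dense binary exposure matrix, as in the Examples section
set.seed(15)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))

# Convert to a sparse representation; zeros are no longer stored explicitly
drugs_sparse <- Matrix(drugs, sparse = TRUE)

# The result inherits from the virtual class "sparseMatrix"
is_sparse <- is(drugs_sparse, "sparseMatrix")
```

The object drugs_sparse could then be supplied as x in place of the dense matrix.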

Details

The adaptive weight for a given covariate i is defined by

w_i = 1/|β^{BIC}_i|^γ

where β^{BIC}_i is the NON PENALIZED regression coefficient associated with covariate i obtained with lasso-bic, and γ is the gamma argument.
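The weight formula can be sketched in base R as follows. The coefficient vector and the helper adaptive_weights are hypothetical and serve only to illustrate the arithmetic; this is not the package's internal code. Note that a coefficient equal to zero gives an infinite weight, which is consistent with the mention of infinite weights in the Value section.

```r
# Hypothetical unpenalized coefficients from a first-stage lasso-bic fit;
# zeros stand for covariates that were not selected at that stage.
beta_bic <- c(drugs1 = 0.8, drugs2 = 0, drugs3 = -0.25, drugs4 = 2)

# w_i = 1 / |beta_i|^gamma (illustrative helper, not part of adapt4pv)
adaptive_weights <- function(beta, gamma = 1) {
  1 / abs(beta)^gamma
}

w1 <- adaptive_weights(beta_bic, gamma = 1)
w2 <- adaptive_weights(beta_bic, gamma = 2)
```

With gamma = 1 the weights are 1.25, Inf, 4 and 0.5; with gamma = 2 they become 1.5625, Inf, 16 and 0.25, so a larger gamma penalizes small first-stage coefficients more heavily and large ones less.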

Value

An object with S3 class "adaptive".

aws

Numeric vector of penalty weights derived from lasso-bic. Length equal to nvars.

criterion

Character, indicates which criterion is used with the adaptive lasso for variable selection. For adapt_bic function, criterion is "bic".

beta

Numeric vector of regression coefficients in the adaptive lasso. If criterion = "cv" the regression coefficients are PENALIZED, if criterion = "bic" the regression coefficients are UNPENALIZED. Length equal to nvars. Could be NA if adaptive weights are all equal to infinity.

selected_variables

Character vector, names of the variable(s) selected with this adaptive approach. If betaPos = TRUE, this set is the covariates with a positive regression coefficient in beta. Otherwise, this set is the covariates with a non-null regression coefficient in beta. Covariates are ordered according to the p-values (two-sided if betaPos = FALSE, one-sided if betaPos = TRUE) in the classical multiple logistic regression model that minimizes the BIC in the adaptive lasso.
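The relationship between beta, betaPos and selected_variables described above can be illustrated with a small base R sketch. The coefficient vector and the helper select_by_sign are hypothetical; the sketch does not reproduce the ordering by p-values and does not handle the case where beta is NA.

```r
# Hypothetical coefficient vector, named by covariate
beta <- c(drugs1 = 0.9, drugs2 = 0, drugs3 = -0.4, drugs4 = 0.3)

# Illustrative helper (not part of adapt4pv): keep positive coefficients
# when betaPos = TRUE, any non-null coefficient otherwise
select_by_sign <- function(beta, betaPos = TRUE) {
  keep <- if (betaPos) beta > 0 else beta != 0
  names(beta)[keep]
}

sel_pos <- select_by_sign(beta, betaPos = TRUE)
sel_any <- select_by_sign(beta, betaPos = FALSE)
```

Here sel_pos contains drugs1 and drugs4, while sel_any also includes drugs3, whose coefficient is negative.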

Author(s)

Emeline Courtois
Maintainer: Emeline Courtois emeline.courtois@inserm.fr

Examples


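# Simulate 100 observations of 20 binary covariates
set.seed(15)
drugs <- matrix(rbinom(100*20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
# Simulate a binary outcome
ae <- rbinom(100, 1, 0.3)
# Fit the adaptive lasso with weights derived from lasso-bic
ab <- adapt_bic(x = drugs, y = ae, maxp = 50)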



[Package adapt4pv version 0.2-1 Index]