cisl {adapt4pv} | R Documentation |
Class Imbalanced Subsampling Lasso
Description
Implements CISL or stability selection, depending on the subsampling options.
Usage
cisl(
x,
y,
r = 4,
nB = 100,
dfmax = 50,
nlambda = 250,
nMin = 0,
replace = TRUE,
betaPos = TRUE,
ncore = 1
)
Arguments
x |
Input matrix, of dimension nobs x nvars. Each row is an
observation vector. Can be in sparse matrix format (inherit from class
|
y |
Binary response variable, numeric. |
r |
Number of controls in the CISL sampling. Default is 4. See details below for other implementations. |
nB |
Number of sub-samples. Default is 100. |
dfmax |
Corresponds to the maximum size of the models visited with the lasso (E in the paper). Default is 50. |
nlambda |
Number of lambda values as is |
nMin |
Minimum number of events for a covariate to be considered.
Default is 0, all the covariates from |
replace |
Should sampling be with replacement? Default is TRUE. |
betaPos |
If |
ncore |
The number of computing units (cores) used for parallel computing.
This has to be set to 1 if the |
Details
CISL is a variation of the stability selection method adapted to the characteristics of pharmacovigilance databases.
Setting r = 4 and replace = TRUE implements our CISL sampling.
Alternatively, r = NULL and replace = FALSE can be used to implement the n/2 subsampling of Stability Selection.
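The two sampling settings above can be illustrated with a minimal base R sketch. It is not the package's internal code: draw_subsample is a hypothetical helper, and it assumes that CISL sampling resamples the cases and draws r controls per case, while r = NULL draws floor(n/2) observations as in Stability Selection. The exact scheme used by cisl may differ.

```r
# Hypothetical helper (not part of adapt4pv): returns the row indices of one sub-sample.
# Assumption: with a non-NULL r, cases are resampled and r controls are drawn per case.
draw_subsample <- function(y, r = 4, replace = TRUE) {
  n <- length(y)
  if (is.null(r)) {
    # Stability Selection style: half of the observations
    return(sample.int(n, size = floor(n / 2), replace = replace))
  }
  cases <- which(y == 1)
  controls <- which(y == 0)
  n_cases <- length(cases)
  n_controls <- r * n_cases
  # Without replacement we cannot draw more controls than are available
  if (!replace) n_controls <- min(n_controls, length(controls))
  c(
    cases[sample.int(length(cases), n_cases, replace = replace)],
    controls[sample.int(length(controls), n_controls, replace = replace)]
  )
}

set.seed(15)
y <- rbinom(100, 1, 0.3)

# CISL-style sub-sample: r = 4, with replacement
idx_cisl <- draw_subsample(y, r = 4, replace = TRUE)
table(y[idx_cisl])

# n/2 sub-sample: r = NULL, without replacement
idx_half <- draw_subsample(y, r = NULL, replace = FALSE)
length(idx_half)
```

Under these assumptions, the CISL-style draw contains four times as many controls as cases whatever the prevalence of the outcome in the data, whereas the n/2 draw keeps roughly the original case proportion.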
Value
An object with S3 class "cisl".
prob |
Matrix of dimension nvars x |
q05 |
5 |
q10 |
10 |
q15 |
15 |
q20 |
20 |
Author(s)
Ismail Ahmed
References
Ahmed, I., Pariente, A., & Tubert-Bitter, P. (2018). "Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions". Statistical Methods in Medical Research. 27(3), 785–797, doi:10.1177/0962280216643116
Examples
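set.seed(15)
# Simulate 100 observations of 20 binary drug exposures
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
# Simulate a binary adverse event outcome
ae <- rbinom(100, 1, 0.3)
# Run CISL with 50 sub-samples
lcisl <- cisl(x = drugs, y = ae, nB = 50)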