R: Conditional logistic regression with elastic net penalties

clogitL1 {clogitL1}

R Documentation

Conditional logistic regression with elastic net penalties

Description

Fit a sequence of conditional logistic regression models with lasso or elastic net penalties

Usage

 clogitL1 (x, y, strata, numLambda=100, 
	minLambdaRatio=0.000001, switch=0, alpha = 1)

Arguments

`x`	matrix with one row per observation. Each row holds the p regressor values for that observation.
`y`	vector of binary responses with 1 for cases and 0 for controls.
`strata`	vector with stratum membership of each observation.
`numLambda`	number of different values of the regularisation parameter `\lambda` at which to compute parameter estimates. The first fit is made at a value just below the smallest value of `\lambda` at which all parameter estimates are 0; the last fit is made at this value multiplied by `minLambdaRatio`.
`minLambdaRatio`	ratio of the smallest to the largest value of the regularisation parameter `\lambda` at which we find parameter estimates.
`switch`	index (between 0 and `numLambda`) at which the spacing between successive values of `\lambda` changes from linear to logarithmic jumps.
`alpha`	parameter controlling the trade-off between lasso and ridge penalties. At value 1, we have a pure lasso penalty; at 0, pure ridge. Intermediate values provide a mixture of the two.
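To make the roles of `numLambda` and `minLambdaRatio` concrete, the following is a minimal sketch of a purely logarithmic grid, which is what the description of `switch = 0` suggests. The helper `lambda_grid` and the starting value `lambda_max` are hypothetical and used only for illustration; this is not the package's internal code, and the package determines its own starting value from the data.

```r
# Hypothetical helper, NOT part of clogitL1: builds a log-spaced grid of
# regularisation values running from lambda_max down to
# lambda_max * minLambdaRatio.
lambda_grid <- function(lambda_max, numLambda = 100, minLambdaRatio = 1e-6) {
  stopifnot(lambda_max > 0, numLambda >= 2,
            minLambdaRatio > 0, minLambdaRatio < 1)
  # Equal steps on the log scale give a constant ratio between neighbours.
  exp(seq(log(lambda_max), log(lambda_max * minLambdaRatio),
          length.out = numLambda))
}

# lambda_max = 2.5 is an assumed value chosen for the illustration.
grid <- lambda_grid(lambda_max = 2.5, numLambda = 5, minLambdaRatio = 1e-4)
print(grid)
```

On such a grid each value is a fixed fraction of the previous one, so small values of `\lambda` are sampled as densely (in relative terms) as large ones.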

Details

The sequence of models implied by `numLambda` and `minLambdaRatio` is fit by coordinate descent with warm starts and sequential strong rules. If `alpha` = 1, we fit using a lasso penalty. Otherwise we fit with an elastic net penalty. Note that a pure ridge penalty is never obtained, because the function sets a floor for `alpha` at 0.000001. This improves the stability of the algorithm. A similar lower bound is set for `minLambdaRatio`. The sequence can be truncated at fewer than `numLambda` models if a fit along the path already explains a very large proportion of the training set deviance.
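The effect of `alpha` and of its floor can be illustrated with a small sketch of an elastic net penalty. The parameterisation below (lasso term weighted by `alpha`, ridge term by half of 1 minus `alpha`) follows the common glmnet-style convention and is an assumption here; the helper `elastic_net_penalty` is hypothetical and does not reproduce the package's internals.

```r
# Hypothetical helper, NOT part of clogitL1. Assumes the glmnet-style form
#   lambda * (alpha * sum(|beta|) + (1 - alpha) / 2 * sum(beta^2)).
elastic_net_penalty <- function(beta, lambda, alpha = 1, alpha_floor = 1e-6) {
  # Mirror the documented floor: alpha never drops below 0.000001,
  # so a pure ridge penalty is never used.
  alpha <- max(alpha, alpha_floor)
  lambda * (alpha * sum(abs(beta)) + 0.5 * (1 - alpha) * sum(beta^2))
}

beta <- c(1, -2, 0)
pen_lasso <- elastic_net_penalty(beta, lambda = 0.5, alpha = 1)    # lasso only
pen_mixed <- elastic_net_penalty(beta, lambda = 0.5, alpha = 0.5)  # mixture
pen_ridge <- elastic_net_penalty(beta, lambda = 0.5, alpha = 0)    # floored alpha
print(c(lasso = pen_lasso, mixed = pen_mixed, near_ridge = pen_ridge))
```

With `alpha = 0` the floor leaves a tiny lasso component in place, so the result is very close to, but not exactly, the pure ridge value.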

Value

An object of class clogitL1 with the following fields:

`beta`	(`numLambda` + 1)-by-p matrix of estimated coefficients. The first row is all zeros.
`lambda`	vector of length `numLambda` + 1 containing the value of the regularisation parameter at which we obtained the fits.
`nz_beta`	vector of length `numLambda` + 1 containing the number of nonzero parameter estimates for the fit at the corresponding regularisation parameter.
`ss_beta`	vector of length `numLambda` + 1 containing the number of predictors considered by the sequential strong rule at that iteration.
`dev_perc`	vector of length `numLambda` + 1 containing the percentage of null deviance explained by the fit in the corresponding row of `beta`.
`y_c`	reordered vector of responses, grouped by stratum with cases coming first within each stratum.
`X_c`	reordered matrix of predictors, with rows in the same order as `y_c`.
`strata_c`	reordered stratum vector, in the same order as `y_c`.
`nVec`	vector with one entry per unique stratum in `strata`, containing the number of observations in each stratum.
`mVec`	vector containing the number of cases in each stratum.
`alpha`	penalty trade off parameter.
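The reordered fields can be pictured with a short base R sketch. It shows one way to group observations by stratum with cases first and to count observations and cases per stratum; it is an illustration under that reading of the field descriptions, not the package's own code, and the toy data are made up for the example.

```r
# Toy data, invented for illustration only.
y      <- c(0, 1, 0, 1, 0, 0)        # 1 = case, 0 = control
strata <- c(2, 1, 1, 2, 2, 1)        # stratum membership
X      <- matrix(seq_len(12), ncol = 2)

# Sort by stratum, then put cases (y = 1) ahead of controls within a stratum.
ord      <- order(strata, -y)
y_c      <- y[ord]
X_c      <- X[ord, , drop = FALSE]
strata_c <- strata[ord]

# Per-stratum counts: observations (nVec) and cases (mVec).
nVec <- as.vector(table(strata_c))
mVec <- as.vector(tapply(y_c, strata_c, sum))

print(cbind(strata_c, y_c, X_c))
```

Keeping the rows of `X_c` aligned with `y_c` and `strata_c` is the point of applying the same ordering vector to all three.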

References

http://www.jstatsoft.org/v58/i12/

Examples


set.seed(145)
# data parameters
K = 10 # number of strata
n = 5 # number in strata
m = 2 # cases per stratum
p = 20 # predictors

# generate data
y = rep(c(rep(1, m), rep(0, n-m)), K)
X = matrix (rnorm(K*n*p, 0, 1), ncol = p) # pure noise
strata = sort(rep(1:K, n))

par(mfrow = c(1,2))
# fit the conditional logistic model
clObj = clogitL1(x = X, y = y, strata = strata)
plot(clObj, logX=TRUE)

# cross validation
clcvObj = cv.clogitL1(clObj)
plot(clcvObj)

[Package clogitL1 version 1.5 Index]