R: Internal function to fit a GLM with lasso (or elastic net),...

glmreg_fit {mpath}

R Documentation

Internal function to fit a GLM with lasso (or elastic net), snet and mnet regularization

Description

Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net penalty), snet and mnet penalty, at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.

Usage

glmreg_fit(x, y, weights, start=NULL, etastart=NULL, mustart=NULL, offset = NULL, 
           nlambda=100, lambda=NULL, lambda.min.ratio=ifelse(nobs<nvars,.05, .001), 
           alpha=1, gamma=3, rescale=TRUE, standardize=TRUE, intercept=TRUE, 
           penalty.factor = rep(1, nvars), thresh=1e-6, eps.bino=1e-5, maxit=1000, 
           eps=.Machine$double.eps, theta, 
           family=c("gaussian", "binomial", "poisson", "negbin"),
           penalty=c("enet","mnet","snet"), convex=FALSE, x.keep=FALSE, y.keep=TRUE,
           trace=FALSE)

Arguments

`x`	input matrix, of dimension nobs x nvars; each row is an observation vector.
`y`	response variable. Quantitative for `family="gaussian"`. Non-negative counts for `family="poisson"` or `family="negbin"`. For `family="binomial"` should be either a factor with two levels or a vector of proportions.
`weights`	observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	starting values for the vector of means.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.
`nlambda`	The number of `lambda` values - default is 100. The sequence may be truncated before `nlambda` is reached if a close to saturated model is fitted. See also `satu`.
`lambda`	by default, the algorithm provides a sequence of regularization values, or a user supplied `lambda` sequence. When `alpha=0`, the largest lambda value is not defined (infinity). Thus, the largest `lambda` for `alpha=0.001` is computed, and the sequence of `lambda` values is calculated afterwards.
`lambda.min.ratio`	Smallest value for `lambda`, as a fraction of `lambda.max`, the (data derived) entry value (i.e. the smallest value for which all coefficients are zero except the intercept). Note, there is no closed formula for `lambda.max` in general. If `rescale=TRUE`, `lambda.max` is the same for `penalty="mnet" or "snet"`. Otherwise, some modifications are required. For instance, for small `gamma` value, half of the square root (if `lambda.max` is too small) of the computed `lambda.max` can be used when `penalty="mnet" or "snet"`. The default of `lambda.min.ratio` depends on the sample size `nobs` relative to the number of variables `nvars`. If `nobs > nvars`, the default is `0.001`, close to zero. If `nobs < nvars`, the default is `0.05`.
`alpha`	The `L_2` penalty mixing parameter, with `0 \le alpha\le 1`. `alpha=1` is lasso (mcp, scad) penalty; and `alpha=0` the ridge penalty. However, if `alpha=0`, one must provide `lambda` values.
`gamma`	The tuning parameter of the `snet` or `mnet` penalty.
`rescale`	logical value, if TRUE, adaptive rescaling of the penalty parameter for `penalty="mnet"` or `penalty="snet"` with `family` other than "gaussian". See reference
`standardize`	logical value for x variable standardization, prior to fitting the model sequence. The coefficients are always returned on the original scale. Default is `standardize=TRUE`.
`intercept`	logical value: if TRUE (default), intercept(s) are fitted; otherwise, intercept(s) are set to zero
`penalty.factor`	This is a number that multiplies `lambda` to allow differential shrinkage of coefficients. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is same shrinkage for all variables.
`thresh`	Convergence threshold for coordinate descent. Defaults value is `1e-6`.
`eps.bino`	a lower bound of probabilities to be truncated, for computing weights and related values when `family="binomial"`. It is also used when `family="negbin"`.
`maxit`	Maximum number of coordinate descent iterations for each `lambda` value; default is 1000.
`eps`	If a coefficient is less than `eps` in magnitude, then it is reported to be 0
`convex`	Calculate index for which objective function ceases to be locally convex? Default is FALSE and only useful if `penalty="mnet" or "snet"`.
`theta`	an overdispersion scaling parameter for `family="negbin"`
`family`	Response type (see above)
`penalty`	Type of regularization
`x.keep`, `y.keep`	For glmreg: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For glmreg_fit: x is a design matrix of dimension n * p, and x is a vector of observations of length n.
`trace`	If `TRUE`, fitting progress is reported

Details

The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence. For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting the generalized linear regression paths, by maximizing the appropriate penalized log-likelihood. Note that the objective function for "gaussian" is

1/2* weights*RSS + \lambda*penalty,

if standardize=FALSE and

1/2* \frac{weights}{\sum(weights)}*RSS + \lambda*penalty,

if standardize=TRUE. For the other models it is

-\sum (weights * loglik) + \lambda*penalty

if standardize=FALSE and

-\frac{weights}{\sum(weights)} * loglik + \lambda*penalty

if standardize=TRUE.

Value

An object with S3 class "glmreg" for the various types of models.

`call`	the call that produced the model fit
`b0`	Intercept sequence of length `length(lambda)`
`beta`	A `nvars x length(lambda)` matrix of coefficients.
`lambda`	The actual sequence of `lambda` values used
`satu`	satu=1 if a saturated model (deviance/null deviance < 0.05) is fit. Otherwise satu=0. The number of `nlambda` sequence may be truncated before `nlambda` is reached if `satu=1`.
`dev`	The computed deviance (for `"gaussian"`, this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).
`nulldev`	Null deviance (per observation). This is defined to be 2*(loglike_sat -loglike(Null)); The NULL model refers to the intercept model.
`nobs`	number of observations

Author(s)

Zhu Wang <zwang145@uthsc.edu>

References

Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]