glmreg_fit {mpath}R Documentation

Internal function to fit a GLM with lasso (or elastic net), snet and mnet regularization

Description

Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net penalty), snet and mnet penalty, at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.

Usage

glmreg_fit(x, y, weights, start=NULL, etastart=NULL, mustart=NULL, offset = NULL, 
           nlambda=100, lambda=NULL, lambda.min.ratio=ifelse(nobs<nvars,.05, .001), 
           alpha=1, gamma=3, rescale=TRUE, standardize=TRUE, intercept=TRUE, 
           penalty.factor = rep(1, nvars), thresh=1e-6, eps.bino=1e-5, maxit=1000, 
           eps=.Machine$double.eps, theta, 
           family=c("gaussian", "binomial", "poisson", "negbin"),
           penalty=c("enet","mnet","snet"), convex=FALSE, x.keep=FALSE, y.keep=TRUE,
           trace=FALSE)

Arguments

x

input matrix, of dimension nobs x nvars; each row is an observation vector.

y

response variable. Quantitative for family="gaussian". Non-negative counts for family="poisson" or family="negbin". For family="binomial" should be either a factor with two levels or a vector of proportions.

weights

observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation

start

starting values for the parameters in the linear predictor.

etastart

starting values for the linear predictor.

mustart

starting values for the vector of means.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.

nlambda

The number of lambda values - default is 100. The sequence may be truncated before nlambda is reached if a close to saturated model is fitted. See also satu.

lambda

by default, the algorithm provides a sequence of regularization values, or a user supplied lambda sequence. When alpha=0, the largest lambda value is not defined (infinity). Thus, the largest lambda for alpha=0.001 is computed, and the sequence of lambda values is calculated afterwards.

lambda.min.ratio

Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value (i.e. the smallest value for which all coefficients are zero except the intercept). Note, there is no closed formula for lambda.max in general. If rescale=TRUE, lambda.max is the same for penalty="mnet" or "snet". Otherwise, some modifications are required. For instance, for small gamma value, half of the square root (if lambda.max is too small) of the computed lambda.max can be used when penalty="mnet" or "snet". The default of lambda.min.ratio depends on the sample size nobs relative to the number of variables nvars. If nobs > nvars, the default is 0.001, close to zero. If nobs < nvars, the default is 0.05.

alpha

The L_2 penalty mixing parameter, with 0 \le alpha\le 1. alpha=1 is lasso (mcp, scad) penalty; and alpha=0 the ridge penalty. However, if alpha=0, one must provide lambda values.

gamma

The tuning parameter of the snet or mnet penalty.

rescale

logical value, if TRUE, adaptive rescaling of the penalty parameter for penalty="mnet" or penalty="snet" with family other than "gaussian". See reference

standardize

logical value for x variable standardization, prior to fitting the model sequence. The coefficients are always returned on the original scale. Default is standardize=TRUE.

intercept

logical value: if TRUE (default), intercept(s) are fitted; otherwise, intercept(s) are set to zero

penalty.factor

This is a number that multiplies lambda to allow differential shrinkage of coefficients. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is same shrinkage for all variables.

thresh

Convergence threshold for coordinate descent. Defaults value is 1e-6.

eps.bino

a lower bound of probabilities to be truncated, for computing weights and related values when family="binomial". It is also used when family="negbin".

maxit

Maximum number of coordinate descent iterations for each lambda value; default is 1000.

eps

If a coefficient is less than eps in magnitude, then it is reported to be 0

convex

Calculate index for which objective function ceases to be locally convex? Default is FALSE and only useful if penalty="mnet" or "snet".

theta

an overdispersion scaling parameter for family="negbin"

family

Response type (see above)

penalty

Type of regularization

x.keep, y.keep

For glmreg: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For glmreg_fit: x is a design matrix of dimension n * p, and x is a vector of observations of length n.

trace

If TRUE, fitting progress is reported

Details

The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence. For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting the generalized linear regression paths, by maximizing the appropriate penalized log-likelihood. Note that the objective function for "gaussian" is

1/2* weights*RSS + \lambda*penalty,

if standardize=FALSE and

1/2* \frac{weights}{\sum(weights)}*RSS + \lambda*penalty,

if standardize=TRUE. For the other models it is

-\sum (weights * loglik) + \lambda*penalty

if standardize=FALSE and

-\frac{weights}{\sum(weights)} * loglik + \lambda*penalty

if standardize=TRUE.

Value

An object with S3 class "glmreg" for the various types of models.

call

the call that produced the model fit

b0

Intercept sequence of length length(lambda)

beta

A nvars x length(lambda) matrix of coefficients.

lambda

The actual sequence of lambda values used

satu

satu=1 if a saturated model (deviance/null deviance < 0.05) is fit. Otherwise satu=0. The number of nlambda sequence may be truncated before nlambda is reached if satu=1.

dev

The computed deviance (for "gaussian", this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).

nulldev

Null deviance (per observation). This is defined to be 2*(loglike_sat -loglike(Null)); The NULL model refers to the intercept model.

nobs

number of observations

Author(s)

Zhu Wang <zwang145@uthsc.edu>

References

Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]

See Also

glmreg


[Package mpath version 0.4-2.26 Index]