glmreg_fit {mpath} | R Documentation |
Internal function to fit a GLM with lasso (or elastic net), snet and mnet regularization
Description
Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net), snet and mnet penalties at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.
Usage
glmreg_fit(x, y, weights, start=NULL, etastart=NULL, mustart=NULL, offset = NULL,
nlambda=100, lambda=NULL, lambda.min.ratio=ifelse(nobs<nvars,.05, .001),
alpha=1, gamma=3, rescale=TRUE, standardize=TRUE, intercept=TRUE,
penalty.factor = rep(1, nvars), thresh=1e-6, eps.bino=1e-5, maxit=1000,
eps=.Machine$double.eps, theta,
family=c("gaussian", "binomial", "poisson", "negbin"),
penalty=c("enet","mnet","snet"), convex=FALSE, x.keep=FALSE, y.keep=TRUE,
trace=FALSE)
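A minimal sketch of a call, assuming the mpath package is installed. Because glmreg_fit is described as an internal function, the sketch looks it up in the package namespace rather than assuming it is exported. The simulated data and the names n, p, x, y and fit are illustrative and not part of the package.

```r
# Simulated Gaussian data; all object names here are illustrative.
set.seed(1)
n <- 50
p <- 5
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- drop(x %*% c(2, -1, 0, 0, 0)) + rnorm(n)

fit <- NULL
if (requireNamespace("mpath", quietly = TRUE)) {
  # Look the function up in the namespace, whether or not it is exported.
  glmreg_fit <- utils::getFromNamespace("glmreg_fit", "mpath")
  # Lasso path (penalty = "enet" with alpha = 1) on a grid of 20 lambda values.
  fit <- glmreg_fit(x, y, weights = rep(1, n),
                    family = "gaussian", penalty = "enet",
                    alpha = 1, nlambda = 20)
}
```

If the call succeeds, fit should carry the components listed under Value, such as fit$lambda, fit$b0 and fit$beta.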
Arguments
x |
input matrix, of dimension nobs x nvars; each row is an observation vector. |
y |
response variable. Quantitative for |
weights |
observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation. |
start |
starting values for the parameters in the linear predictor. |
etastart |
starting values for the linear predictor. |
mustart |
starting values for the vector of means. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula. |
nlambda |
The number of |
lambda |
by default, the algorithm provides a sequence of regularization values, or a user supplied |
lambda.min.ratio |
Smallest value for |
alpha |
The |
gamma |
The tuning parameter of the |
rescale |
logical value, if TRUE, adaptive rescaling of the penalty parameter for |
standardize |
logical value for x variable standardization, prior to
fitting the model sequence. The coefficients are always returned on
the original scale. Default is |
intercept |
logical value: if TRUE (default), intercept(s) are fitted; otherwise, intercept(s) are set to zero. |
penalty.factor |
This is a number that multiplies |
thresh |
Convergence threshold for coordinate descent. Default value is |
eps.bino |
a lower bound of probabilities to be truncated, for computing weights and related values when |
maxit |
Maximum number of coordinate descent iterations for each |
eps |
If a coefficient is less than |
convex |
Calculate index for which objective function ceases to
be locally convex? Default is FALSE and only useful if |
theta |
an overdispersion scaling parameter for |
family |
Response type (see above) |
penalty |
Type of regularization |
x.keep , y.keep |
For glmreg: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. For glmreg_fit: x is a design matrix of dimension n * p, and y is a vector of observations of length n. |
trace |
If |
Details
The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence.
For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting the generalized linear regression models, by maximizing the appropriate penalized log-likelihood.
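To illustrate what coordinate descent does in the simplest case, here is a base-R sketch for the Gaussian lasso. It is not the mpath implementation: it assumes centered y, columns of x centered and scaled to mean square 1, unit weights, no intercept and a single lambda value. The names soft_threshold and cd_lasso are hypothetical; maxit and thresh only mirror the argument names above.

```r
# Soft-thresholding operator: closed-form solution of the
# one-dimensional lasso problem.
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for (1/(2n)) * RSS + lambda * sum(abs(beta)).
# Assumes each column of x has mean 0 and mean square 1, and y has mean 0.
cd_lasso <- function(x, y, lambda, maxit = 1000, thresh = 1e-6) {
  n <- nrow(x)
  beta <- rep(0, ncol(x))
  r <- y                                  # residual at beta = 0
  for (it in seq_len(maxit)) {
    max_change <- 0
    for (j in seq_len(ncol(x))) {
      # Univariate least-squares coefficient on the partial residual
      z <- sum(x[, j] * r) / n + beta[j]
      b_new <- soft_threshold(z, lambda)
      delta <- b_new - beta[j]
      if (delta != 0) {
        r <- r - x[, j] * delta           # keep the residual in sync
        beta[j] <- b_new
        max_change <- max(max_change, abs(delta))
      }
    }
    if (max_change < thresh) break        # converged
  }
  beta
}
```

A path would be obtained by calling cd_lasso over a decreasing grid of lambda values, ideally warm-starting each fit from the previous solution; the sketch omits that for brevity.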
Note that the objective function for "gaussian" is
1/2 * weights * RSS + \lambda * penalty
if standardize=FALSE, and
1/2 * \frac{weights}{\sum(weights)} * RSS + \lambda * penalty
if standardize=TRUE. For the other models it is
-\sum (weights * loglik) + \lambda * penalty
if standardize=FALSE, and
-\frac{weights}{\sum(weights)} * loglik + \lambda * penalty
if standardize=TRUE.
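The Gaussian objective can be evaluated directly in base R, as in the sketch below. It reads "weights * RSS" as the weighted sum of squared residuals, which is an interpretation rather than something stated above. The function name gaussian_objective and the penalty_fun argument are hypothetical; the default penalty is the lasso (L1) norm, and penalty.factor is ignored.

```r
# Gaussian objective: 1/2 * weighted RSS + lambda * penalty.
# With standardize = TRUE the weights are rescaled to sum to 1.
gaussian_objective <- function(x, y, b0, beta, weights, lambda,
                               standardize = TRUE,
                               penalty_fun = function(b) sum(abs(b))) {
  res <- y - b0 - drop(x %*% beta)        # residuals
  w <- if (standardize) weights / sum(weights) else weights
  0.5 * sum(w * res^2) + lambda * penalty_fun(beta)
}

# Tiny worked example with two observations and unit weights.
x <- diag(2)
y <- c(1, 2)
gaussian_objective(x, y, b0 = 0, beta = c(0, 0), weights = c(1, 1),
                   lambda = 1, standardize = FALSE)   # 0.5 * (1 + 4) = 2.5
gaussian_objective(x, y, b0 = 0, beta = c(0, 0), weights = c(1, 1),
                   lambda = 1, standardize = TRUE)    # 0.5 * 2.5 = 1.25
```

Because rescaling the weights changes the loss term but not the penalty term, the same lambda value corresponds to a different amount of shrinkage under the two standardize settings.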
Value
An object with S3 class "glmreg"
for the various types of models.
call |
the call that produced the model fit |
b0 |
Intercept sequence of length |
beta |
A |
lambda |
The actual sequence of |
satu |
satu=1 if a saturated model (deviance/null deviance < 0.05) is fit. Otherwise satu=0. The number of |
dev |
The computed deviance (for |
nulldev |
Null deviance (per observation). This is defined to be 2*(loglike_sat - loglike(Null)). The null model refers to the intercept-only model. |
nobs |
number of observations |
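The null deviance definition above, 2*(loglike_sat - loglike(Null)), can be illustrated for the Poisson family with base R. The sketch returns the total rather than a per-observation value, and the function name poisson_null_deviance is hypothetical; how mpath scales the quantity internally is not shown here.

```r
# Null deviance for a Poisson response: twice the gap between the
# saturated log-likelihood (fitted mean = y) and the intercept-only
# log-likelihood (fitted mean = weighted mean of y).
poisson_null_deviance <- function(y, weights = rep(1, length(y))) {
  mu0 <- sum(weights * y) / sum(weights)  # intercept-only fitted mean
  loglik_sat  <- sum(weights * dpois(y, lambda = y,   log = TRUE))
  loglik_null <- sum(weights * dpois(y, lambda = mu0, log = TRUE))
  2 * (loglik_sat - loglik_null)
}

y <- c(0, 1, 2, 3, 4)
poisson_null_deviance(y)
# Cross-check against the unpenalized intercept-only fit from stats::glm
glm(y ~ 1, family = poisson)$null.deviance
```

The two values should agree up to numerical tolerance, since an intercept-only Poisson fit has the sample mean as its fitted value.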
Author(s)
Zhu Wang <zwang145@uthsc.edu>
References
Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.
Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]