glm.regu {RCAL}    R Documentation
Regularized M-estimation for fitting generalized linear models with a fixed tuning parameter
Description
This function implements regularized M-estimation for fitting generalized linear models with continuous or binary responses for a fixed choice of tuning parameters.
Usage
glm.regu(y, x, iw = NULL, loss = "cal", init = NULL, rhos, test = NULL,
offs = NULL, id = NULL, Wmat = NULL, Rmat = NULL, zzs = NULL,
xxs = NULL, n.iter = 100, eps = 1e-06, bt.lim = 3, nz.lab = NULL,
pos = 10000)
Arguments
y
The n x 1 vector of responses, which can be continuous or binary.
x
The n x p matrix of covariates, without a column of ones for the intercept.
iw
An optional n x 1 weight vector.
loss
A loss function, which can be specified as "gaus" for continuous responses, or "ml" or "cal" for binary responses.
init
A (p+1) x 1 vector of initial values for the intercept and coefficients.
rhos
A p x 1 vector of Lasso tuning parameters, usually a constant vector, associated with the p coefficients.
test
A vector giving the indices of observations between 1 and n which are included in the test set.
offs
An n x 1 vector of offsets, as in glm.
id
An argument which can be used to speed up computation.
Wmat
An argument which can be used to speed up computation.
Rmat
An argument which can be used to speed up computation.
zzs
An argument which can be used to speed up computation.
xxs
An argument which can be used to speed up computation.
n.iter
The maximum number of iterations allowed. An iteration is defined by computing a quadratic approximation and solving a least-squares Lasso problem.
eps
The tolerance at which the difference in the objective (loss plus penalty) values between successive iterations is considered close enough to 0 to declare convergence.
bt.lim
The maximum number of backtracking steps allowed.
nz.lab
A p x 1 logical vector (useful in simulations), indicating which covariates are included when calculating the restricted number of nonzero coefficients.
pos
A value which can be used to facilitate recording the numbers of nonzero coefficients with or without the restriction by nz.lab.
Details
For continuous responses, this function uses an active-set descent algorithm (Osborne et al. 2000; Yang and Tan 2018) to solve the least-squares Lasso problem. For binary responses, regularized calibrated estimation is implemented using the Fisher scoring descent algorithm in Tan (2020), whereas regularized maximum likelihood estimation is implemented in a similar manner based on quadratic approximation as in the R package glmnet.
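For reference, the two binary-response losses can be written out directly. The sketch below is a minimal illustration, assuming the average likelihood loss and the average calibration loss of Tan (2020) take the forms shown; it is not the package's internal implementation, and the exact scaling used for obj.train may differ.
# Assumed forms of the two binary-response losses (sketch only):
# y is a 0/1 response vector and eta a vector of linear predictors.
loss.ml <- function(y, eta) mean(log(1 + exp(eta)) - y * eta)
loss.cal <- function(y, eta) mean(y * exp(-eta) + (1 - y) * eta)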
Value
iter
The number of iterations performed up to n.iter.
conv
1 if convergence is obtained, 0 if the maximum number of iterations is exceeded, or -1 if the maximum number of backtracking steps is exceeded.
nz
A value defined as (nz0 * pos + nz1), recording the numbers of nonzero coefficients without (nz0) and with (nz1) the restriction by nz.lab.
inter
The estimated intercept.
bet
The p x 1 vector of estimated coefficients, excluding the intercept.
fit
The vector of fitted values in the training set.
eta
The vector of linear predictors in the training set.
tau
The p x 1 vector of generalized signs, which should be -1 or 1 for a nonzero coefficient estimate and between -1 and 1 for a zero estimate.
obj.train
The average loss in the training set.
pen
The Lasso penalty of the estimates.
obj
The average loss plus the Lasso penalty.
fit.test
The vector of fitted values in the test set.
eta.test
The vector of linear predictors in the test set.
obj.test
The average loss in the test set.
id
This can be re-used to speed up computation.
Wmat
This can be re-used to speed up computation.
Rmat
This can be re-used to speed up computation.
zzs
This can be re-used to speed up computation.
xxs
This can be re-used to speed up computation.
References
Osborne, M., Presnell, B., and Turlach, B. (2000) A new approach to variable selection in least squares problems, IMA Journal of Numerical Analysis, 20, 389-404.
Yang, T. and Tan, Z. (2018) Backfitting algorithms for total-variation and empirical-norm penalized additive modeling with high-dimensional data, Stat, 7, e198.
Tibshirani, R. (1996) Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society, Ser. B, 58, 267-288.
Tan, Z. (2020) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137-158.
Examples
data(simu.data)
n <- dim(simu.data)[1]
p <- dim(simu.data)[2]-2
y <- simu.data[,1]
tr <- simu.data[,2]
x <- simu.data[,2+1:p]
x <- scale(x)
### Example 1: linear regression
# rhos should be a vector of length p, even if all its entries are equal
out.rgaus <- glm.regu(y[tr==1], x[tr==1,], rhos=rep(.05,p), loss="gaus")
# the intercept
out.rgaus$inter
# the estimated coefficients and generalized signs; the first 10 are shown
cbind(out.rgaus$bet, out.rgaus$tau)[1:10,]
# the number of nonzero coefficients
out.rgaus$nz
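# A hedged add-on to Example 1 (not part of the original examples): use the
# test argument to evaluate a fit on a held-out subset; the split below is
# illustrative only.
y1 <- y[tr==1]
x1 <- x[tr==1,]
n1 <- length(y1)
te <- seq(floor(n1/2)+1, n1)   # second half of the observations as test set
out.rgaus.te <- glm.regu(y1, x1, rhos=rep(.05,p), loss="gaus", test=te)
# average loss in the training set and in the test set
out.rgaus.te$obj.train
out.rgaus.te$obj.test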
### Example 2: logistic regression using likelihood loss
out.rml <- glm.regu(tr, x, rhos=rep(.01,p), loss="ml")
out.rml$inter
cbind(out.rml$bet, out.rml$tau)[1:10,]
out.rml$nz
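# A hedged sanity check (not part of the original examples): count the nonzero
# coefficients directly; this relates to out.rml$nz through the
# (nz0 * pos + nz1) encoding described under Value.
sum(out.rml$bet != 0)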
### Example 3: logistic regression using calibration loss
out.rcal <- glm.regu(tr, x, rhos=rep(.05,p), loss="cal")
out.rcal$inter
cbind(out.rcal$bet, out.rcal$tau)[1:10,]
out.rcal$nz
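# A hedged consistency check (not part of the original examples), assuming the
# linear predictors equal the intercept plus x %*% bet and the fitted values
# are their inverse logit; both differences should be numerically near zero.
range(out.rcal$eta - (out.rcal$inter + drop(x %*% out.rcal$bet)))
range(out.rcal$fit - plogis(out.rcal$eta))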