glm.regu.cv {RCAL} | R Documentation
Regularized M-estimation for fitting generalized linear models based on cross validation
Description
This function implements regularized M-estimation for fitting generalized linear models with binary or continuous responses, with tuning parameters selected by cross validation.
Usage
glm.regu.cv(fold, nrho = NULL, rho.seq = NULL, y, x, iw = NULL,
loss = "cal", n.iter = 100, eps = 1e-06, tune.fac = 0.5,
tune.cut = TRUE, ann.init = TRUE, nz.lab = NULL, permut = NULL)
Arguments
fold |
A fold number used for cross validation. |
nrho |
The number of tuning parameters searched in cross validation. |
rho.seq |
A vector of tuning parameters searched in cross validation. If both nrho and rho.seq are specified, then rho.seq overrides nrho. |
y |
An n x 1 vector of response values. |
x |
An n x p matrix of covariates. |
iw |
An n x 1 weight vector. |
loss |
A loss function, which can be specified as "gaus" for continuous responses, or "ml" or "cal" for binary responses. |
n.iter |
The maximum number of iterations allowed, as in glm.regu. |
eps |
The tolerance used to declare convergence, as in glm.regu. |
tune.fac |
The multiplier (factor) used to define rho.seq when only nrho is specified; see Details. |
tune.cut |
Logical; if TRUE, all smaller tuning parameters are skipped once non-convergence is detected at a tuning parameter. |
ann.init |
Logical; if TRUE, the estimates obtained for the previous tuning parameter are used as initial values when fitting with the current tuning parameter. |
nz.lab |
A p x 1 logical vector (useful in simulations), indicating which covariates are counted when the numbers of nonzero coefficients are computed; if NULL, all covariates are counted. |
permut |
An n x 1 vector giving a random permutation of the integers 1 to n, used to define the cross-validation folds; if NULL, a random permutation is generated internally and returned in the output. |
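For reproducible cross-validation splits, the permutation can be supplied explicitly instead of being generated internally. Below is a minimal sketch (the sample size and seed are placeholders for illustration only):
## Sketch: a fixed permutation of 1:n for reproducible fold assignment
n <- 200                   # placeholder sample size, for illustration only
set.seed(123)              # arbitrary seed
my.permut <- sample(n)     # random permutation of 1:n, to be passed as permut=my.permut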
Details
Cross validation is performed as described in Tan (2020a, 2020b). If not specified by the user, the sequence of tuning parameters searched is defined as a geometric series of length nrho, starting from the value that yields a zero solution and then decreasing successively by the factor tune.fac.
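For illustration only (a sketch, not the package's internal computation), such a grid can be constructed as follows, where rho.max stands in for the internally computed value that yields a zero solution:
## Sketch: the default geometric grid of tuning parameters
nrho <- 11                                  # length of the grid
tune.fac <- 0.5                             # successive decrease factor
rho.max <- 1                                # placeholder; computed internally in practice
rho.seq <- rho.max * tune.fac^(0:(nrho-1))  # rho.max, rho.max/2, ..., rho.max/2^10
rho.seq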
After cross validation, two tuning parameters are selected. The first and default choice is the value yielding the smallest average test loss. The second choice is the largest value whose average test loss is within one standard error of that of the first choice, following the one-standard-error rule of Hastie, Tibshirani, and Friedman (2016).
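As a hedged illustration of the two rules (again a sketch, not the package's internal code), the selections can be recovered from the rho, err.ave and err.sd components documented below, assuming rho is ordered from largest to smallest:
## Sketch: the two selection rules applied to placeholder cross-validation results
rho <- 0.5^(0:10)                            # example grid, largest value first
err.ave <- (seq_along(rho) - 6)^2/10 + 1     # placeholder average test losses
err.sd <- rep(0.3, length(rho))              # placeholder standard errors
i.min <- which.min(err.ave)                  # first choice: smallest average test loss
cutoff <- err.ave[i.min] + err.sd[i.min]     # one-standard-error threshold
i.1se <- min(which(err.ave <= cutoff))       # second choice: largest rho within the threshold
c(rho[i.min], rho[i.1se])                    # the two selected tuning parameters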
Value
permut |
An n x 1 vector giving the random permutation used in cross validation. |
rho |
The vector of tuning parameters, searched in cross validation. |
non.conv |
A vector indicating the non-convergence status found or imputed if tune.cut=TRUE. |
err.ave |
A vector giving the averages of the test losses in cross validation. |
err.sd |
A vector giving the standard deviations of the test losses in cross validation. |
sel.rho |
A vector of two selected tuning parameters by cross validation; see Details. |
sel.nz |
A vector of numbers of nonzero coefficients estimated for the selected tuning parameters. |
sel.bet |
The (p+1) x 2 matrix giving the estimated intercept and coefficients for each of the two selected tuning parameters. |
sel.fit |
The n x 2 matrix of fitted values for the two selected tuning parameters. |
References
Hastie, T., Tibshirani, R., and Friedman, J. (2016) The Elements of Statistical Learning (second edition), Springer: New York.
Tan, Z. (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137–158.
Tan, Z. (2020b) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811–837.
Examples
data(simu.data)                 # simulated dataset included in the package
n <- dim(simu.data)[1]          # sample size
p <- dim(simu.data)[2]-2        # number of covariates
y <- simu.data[,1]              # outcome
tr <- simu.data[,2]             # treatment indicator
x <- simu.data[,2+1:p]          # covariate matrix
x <- scale(x)                   # standardize covariates
### Example 1: Regularized maximum likelihood estimation of propensity scores
ps.cv.rml <- glm.regu.cv(fold=5, nrho=1+10, y=tr, x=x, loss="ml")
ps.cv.rml$rho
ps.cv.rml$err.ave
ps.cv.rml$err.sd
ps.cv.rml$sel.rho
ps.cv.rml$sel.nz
fp.cv.rml <- ps.cv.rml$sel.fit[,1]         # fitted propensity scores, first (default) selected tuning parameter
check.cv.rml <- mn.ipw(x, tr, fp.cv.rml)   # inverse probability weighted check using the fitted propensity scores
check.cv.rml$est
### Example 2: Regularized calibrated estimation of propensity scores
ps.cv.rcal <- glm.regu.cv(fold=5, nrho=1+10, y=tr, x=x, loss="cal")
ps.cv.rcal$rho
ps.cv.rcal$err.ave
ps.cv.rcal$err.sd
ps.cv.rcal$sel.rho
ps.cv.rcal$sel.nz
fp.cv.rcal <- ps.cv.rcal$sel.fit[,1]         # fitted propensity scores, first (default) selected tuning parameter
check.cv.rcal <- mn.ipw(x, tr, fp.cv.rcal)   # inverse probability weighted check using the fitted propensity scores
check.cv.rcal$est
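As a follow-up sketch (not part of the original example), the same check can be run for the one-standard-error choice, assuming the second column of sel.fit corresponds to the second entry of sel.rho:
### Follow-up sketch: repeat the check with the one-standard-error choice
fp.cv.rcal.1se <- ps.cv.rcal$sel.fit[,2]           # fitted values, one-standard-error choice
check.cv.rcal.1se <- mn.ipw(x, tr, fp.cv.rcal.1se)
check.cv.rcal.1se$est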