R: Fits frequentist penalized nonparametric varying coefficient...

NVC_frequentist {NVCSSL}

R Documentation

Fits frequentist penalized nonparametric varying coefficient (NVC) models

Description

This function implements frequentist penalized nonparametric varying coefficient (NVC) models. It supports the following penalty functions: the group lasso penalty of Yuan and Lin (2006), the group minimax concave penalty (MCP) of Breheny and Huang (2015), and the group smoothly clipped absolute deviation (SCAD) penalty of Breheny and Huang (2015). This function solves a penalized regression problem of the form,

argmax_{\gamma} \frac{1}{N} \ell(\gamma) + pen_{\lambda}(\gamma),

where N is the total number of observations, \ell(\gamma) is the loss function, and pen_{\lambda}(\cdot) is a penalty function with regularization parameter \lambda > 0. Since the objective function is rescaled by 1/N, the penalty \lambda is typically smaller than the spike hyperparameter \lambda_0 used by the NVC_SSL function. The BIC criterion is used to select the optimal tuning parameter \lambda.

Usage

NVC_frequentist(y, t, X, n_basis=8, penalty=c("gLASSO","gSCAD","gMCP"),
                lambda=NULL, include_intercept=TRUE)

Arguments

`y`	`N \times 1` vector of response observations `y_{11}, ..., y_{1m_1}, ..., y_{n1}, ..., y_{nm_n}`
`t`	`N \times 1` vector of observation times `t_{11}, ..., t_{1m_1}, ..., t_{n1}, ..., t_{nm_n}`
`X`	`N \times p` design matrix with columns `[X_1, ..., X_p]`, where the `k`th column contains the entries `x_{ik}(t_{ij})`'s
`n_basis`	number of basis functions to use. Default is `n_basis=8`.
`penalty`	string specifying which penalty function to use. Specify `"gLASSO"` for group lasso, `"gSCAD"` for group SCAD, or `"gMCP"` for group MCP.
`lambda`	grid of tuning parameters. If `lambda` is not specified (i.e. `lambda=NULL`), then the program automatically chooses a grid for `lambda`. Note that since the objective function is scaled by `1/N`, the automatically chosen grid for `lambda` typically consists of smaller values than the default grid for `lambda0` used by the function `NVC_SSL`.
`include_intercept`	Boolean variable for whether or not to include an intercept function `\beta_0(t)` in the estimation. Default is `include_intercept=TRUE`.

Value

The function returns a list containing the following components:

`t_ordered`	all `N` time points ordered from smallest to largest. Needed for plotting.
`classifications`	`p \times 1` vector of indicator variables, where `"1"` indicates that the covariate is selected and `"0"` indicates that it is not selected. These classifications are determined by the optimal `lambda` chosen from BIC. Note that this vector does not include an intercept function.
`beta_hat`	`N \times p` matrix of the estimates for varying coefficient functions `\beta_k(t), k = 1, ..., p`, using the optimal `lambda` chosen from BIC. The `k`th column in the matrix is the `k`th estimated function at the observation times in `t_ordered`.
`beta0_hat`	estmate of the intercept function `\beta_0(t)` at the observation times in `t_ordered` for the optimal `lambda` chosen from BIC. This is not returned if `include_intercept = FALSE`.
`gamma_hat`	estimated basis coefficients (needed for prediction) for the optimal `lambda`.
`lambda_min`	the individual `lambda` which minimizes the BIC. If only one value was originally passed for `lambda`, then this just returns that `lambda`.
`lambda0_all`	grid of all `L` regularization parameters in `lambda`. Note that since the objective function is scaled by `1/N` for the penalized frequentist methods in the `NVC_frequentist` function, the `lambda_all` grid that is chosen automatically by `NVC_frequentist` typically consists of smaller values than the default values in the `lambda0_all` grid for `NVC_SSL`.
`BIC_all`	`L \times 1` vector of BIC values corresponding to all `L` entries in `lambda_all`. The `l`th entry corresponds to the `l`th entry in `lambda_all`.
`beta_est_all_lambda`	list of length `L` of the estimated varying coefficients `\beta_k(t), k = 1, ..., p`, corresponding to all `L` lambdas in `lambda_all`. The `l`th entry corresponds to the `l`th entry in `lambda_all`.
`beta0_est_all_lambda`	`N \times L` matrix of estimated intercept function `\beta_0(t)` corresponding to all `L` entries in `lambda_all`. The `l`th column corresponds to the `l`th entry in `lambda_all`. This is not returned if `include_intercept=FALSE`.
`gamma_est_all_lambda`	`dp \times L` matrix of estimated basis coefficients corresponding to all entries in `lambda_all`. The `l`th column corresponds to the `l`th entry in `lambda_all`.
`classifications_all_lambda`	`p \times L` matrix of classifications corresponding to all the entries in `lambda_all`. The `l`th column corresponds to the `l`th entry in `lambda_all`.
`iters_to_converge`	number of iterations it took for the group ascent algorithm to converge for each entry in `lambda_all`. The `l`th entry corresponds to the `l`th entry in `lambda_all`.

References

Bai, R., Boland, M. R., and Chen, Y. (2023). "Scalable high-dimensional Bayesian varying coefficient models with unknown within-subject covariance." arXiv pre-print arXiv:arXiv:1907.06477.

Breheny, P. and Huang, J. (2015). "Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors." Statistics and Computing, 25:173-187.

Wei, F., Huang, J., and Li, H. (2011). "Variable selection and estimation in high-dimensional varying coefficient models." Statistica Sinica, 21:1515-1540.

Yuan, M. and Lin, Y. (2006). "Model selection and estimation in regression with grouped variables." Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49-67.

Examples

## Load data
data(SimulatedData)
attach(SimulatedData)
y = SimulatedData$y
t = SimulatedData$t
id = SimulatedData$id
X = SimulatedData[,4:103]

## Fit frequentist penalized NVC model with the SCAD penalty. 
## Can set penalty as "gLASSO", "gSCAD", or "gMCP". 
## No need to specify an 'id' argument when using NVC_frequentist() function

NVC_gSCAD_mod = NVC_frequentist(y, t, X, penalty="gSCAD")

## Classifications. First varying coefficients are selected as nonzero
NVC_gSCAD_mod$classifications

## Optimal lambda chosen from BIC
NVC_gSCAD_mod$lambda_min

## Plot first estimated varying coefficient function
t_ordered = NVC_gSCAD_mod$t_ordered
beta_hat= NVC_gSCAD_mod$beta_hat
plot(t_ordered, beta_hat[,1], lwd=3, type='l', col='blue',
     xlab="Time", ylim = c(-12,12), ylab=expression(beta[1]))

## Plot third estimated varying coefficient function
plot(t_ordered, beta_hat[,3], lwd=3, type='l', col='blue',
     xlab="Time", ylim = c(-4,2), ylab=expression(beta[3]))

## Plot fifth estimated varying coefficient function
plot(t_ordered, beta_hat[,5], lwd=3, type='l', col='blue',
     xlab="Time", ylim = c(0,15), ylab=expression(beta[5]))

[Package NVCSSL version 2.0 Index]