xrnet {xrnet} | R Documentation |
Fit hierarchical regularized regression model
Description
Fits a hierarchical regularized regression model that incorporates external data for the predictor variables. Both the predictor variables and the external data can be regularized with the most common penalties (lasso, ridge, elastic net). Solutions are computed across a two-dimensional grid of penalties: a separate penalty path is computed for the predictors and for the external variables. Regularized linear and logistic regression are currently supported; extensions to other outcomes (e.g. Cox regression) are planned for the next major update.
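As a rough illustration of the two-dimensional penalty grid, the base R sketch below builds two penalty paths and pairs every value of one with every value of the other. The path values and their log spacing are assumptions made for the example; they are not the values xrnet computes.

```r
# Hypothetical penalty paths (values and log spacing chosen for illustration only)
penalty_main_path <- exp(seq(log(1), log(0.01), length.out = 20))
penalty_ext_path  <- exp(seq(log(1), log(0.01), length.out = 20))

# Every pairing of a first-level (predictor) and a second-level (external) penalty
penalty_grid <- expand.grid(
  penalty = penalty_main_path,
  penalty_ext = penalty_ext_path
)

# One row per grid point: 20 * 20 combinations
nrow(penalty_grid)
```

With 20 values on each path, a fit over such a grid would yield one solution for each of the 400 penalty pairs.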
Usage
xrnet(
  x,
  y,
  external = NULL,
  unpen = NULL,
  family = c("gaussian", "binomial"),
  penalty_main = define_penalty(),
  penalty_external = define_penalty(),
  weights = NULL,
  standardize = c(TRUE, TRUE),
  intercept = c(TRUE, FALSE),
  control = list()
)
Arguments
x |
predictor design matrix of dimension
|
y |
outcome vector of length |
external |
(optional) external data design matrix of dimension
|
unpen |
(optional) unpenalized predictor design matrix, matrix options include:
|
family |
error distribution for outcome variable, options include: "gaussian" or "binomial"
|
penalty_main |
specifies regularization object for x. See define_penalty.
|
penalty_external |
specifies regularization object for external. See define_penalty.
|
weights |
optional vector of observation-specific weights. Default is 1 for all observations. |
standardize |
indicates whether x and/or external should be standardized. Default is c(TRUE, TRUE). |
intercept |
indicates whether an intercept term is included for x and/or external. Default is c(TRUE, FALSE). |
control |
specifies xrnet control object. See
|
Details
This function extends the coordinate descent algorithm of the R package glmnet to allow the type of regularization (e.g. ridge, lasso) to be feature-specific. This extension enables fitting hierarchical regularized regression models, in which external information for the predictors is supplied through the external argument. In addition, elements of the R package biglasso are used to support standard R matrices, memory-mapped matrices from the bigmemory package, and sparse matrices from the Matrix package.
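One way to see why feature-specific penalties are enough to fit a hierarchical model is the sketch below for the linear (gaussian) case. The notation is introduced here for illustration and is not taken from this page: X is the predictor matrix, Z the external data matrix, β the first-level coefficients, α the second-level coefficients, and P_1, P_2 the penalty functions chosen for each level. Intercepts and unpenalized predictors are omitted.

```latex
% Hierarchical objective: beta is shrunk toward Z alpha, alpha toward zero
\min_{\beta,\,\alpha}\;
  \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2
  + \lambda_1\, P_1(\beta - Z\alpha)
  + \lambda_2\, P_2(\alpha)

% Substituting \gamma = \beta - Z\alpha gives an equivalent problem
\min_{\gamma,\,\alpha}\;
  \tfrac{1}{2}\,\lVert y - X\gamma - XZ\alpha \rVert_2^2
  + \lambda_1\, P_1(\gamma)
  + \lambda_2\, P_2(\alpha)
```

Under these assumptions the second form is an ordinary regularized regression on the combined design [X, XZ], with one penalty applied to the columns of X and a different penalty applied to the columns of XZ. The first-level coefficients are recovered as β = γ + Zα.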
Value
A list of class xrnet with components:
beta0 |
matrix of first-level intercepts indexed by penalty values |
betas |
3-dimensional array of first-level penalized coefficients indexed by penalty values |
gammas |
3-dimensional array of first-level non-penalized coefficients indexed by penalty values |
alpha0 |
matrix of second-level intercepts indexed by penalty values |
alphas |
3-dimensional array of second-level external data coefficients indexed by penalty values |
penalty |
vector of first-level penalty values |
penalty_ext |
vector of second-level penalty values |
family |
error distribution for outcome variable |
num_passes |
total number of passes over the data in the coordinate descent algorithm |
status |
error status for xrnet fitting: 0 = OK, 1 = Error/Warning |
error_msg |
description of error |
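The coefficient components are indexed by a pair of penalty values, one from each path. The base R sketch below uses stand-in arrays filled with random numbers to show how one solution might be pulled out. The sizes and the dimension order (coefficient, first-level penalty, second-level penalty) are assumptions made for the example, so check dim() on a real fit before relying on them.

```r
# Hypothetical sizes: 5 predictors, 20 first-level and 20 second-level penalties
p <- 5
n_pen <- 20
n_pen_ext <- 20

# Stand-ins for the `betas` and `beta0` components (random values, not a real fit);
# the dimension order used here is an assumption
betas <- array(rnorm(p * n_pen * n_pen_ext), dim = c(p, n_pen, n_pen_ext))
beta0 <- matrix(rnorm(n_pen * n_pen_ext), nrow = n_pen, ncol = n_pen_ext)

# Intercept and coefficients at the 3rd first-level and 7th second-level penalty
i <- 3
j <- 7
coef_ij <- c(intercept = beta0[i, j], betas[, i, j])

# One intercept plus p coefficients
length(coef_ij)
```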
References
Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/.
Zeng, Y., and Breheny, P. (2017). The biglasso Package: A Memory- and Computation-Efficient Solver for Lasso Model Fitting with Big Data in R. arXiv preprint arXiv:1701.05936. URL https://arxiv.org/abs/1701.05936.
Kane, M. J., Emerson, J., and Weston, S. (2013). Scalable Strategies for Computing with Massive Data. Journal of Statistical Software, 55(14), 1-19. URL http://www.jstatsoft.org/v55/i14/.
Examples
### hierarchical regularized linear regression ###
data(GaussianExample)
## define penalty for predictors and external variables
## default is ridge for predictors and lasso for external
## see define_penalty() function for more details
penMain <- define_penalty(0, num_penalty = 20)
penExt <- define_penalty(1, num_penalty = 20)
## fit model with defined regularization
fit_xrnet <- xrnet(
  x = x_linear,
  y = y_linear,
  external = ext_linear,
  family = "gaussian",
  penalty_main = penMain,
  penalty_external = penExt
)