cv.SplitReg {SplitReg}

R Documentation

Split Regularized Regression algorithm with a sparsity and diversity penalty.

Description

Computes a split regularized regression estimator. The sparsity and diversity penalty parameters are chosen automatically by cross-validation.

Usage

cv.SplitReg(
  x,
  y,
  num_lambdas_sparsity = 100,
  num_lambdas_diversity = 100,
  alpha = 1,
  num_models = 10,
  tolerance = 1e-08,
  max_iter = 1e+05,
  num_folds = 10,
  num_threads = 1
)

Arguments

`x`	Design matrix.
`y`	Response vector.
`num_lambdas_sparsity`	Length of the grid of sparsity penalties.
`num_lambdas_diversity`	Length of the grid of diversity penalties.
`alpha`	Elastic Net tuning constant: the value must be between 0 and 1. Default is 1 (Lasso).
`num_models`	Number of models to build.
`tolerance`	Convergence tolerance used to stop the iterations while cycling over the models.
`max_iter`	Maximum number of iterations allowed while cycling over the models.
`num_folds`	Number of folds for cross-validating.
`num_threads`	Number of threads used for parallel computation over the folds.

Details

Computes a split regularized regression estimator with num_models (G) models, defined as the linear models \boldsymbol{\beta}^{1},\dots, \boldsymbol{\beta}^{G} that minimize

\sum\limits_{g=1}^{G}\left( \frac{1}{2n}\Vert \mathbf{y}-\mathbf{X} \boldsymbol{\beta}^{g}\Vert^{2} +\lambda_{S}\left( \frac{(1-\alpha)}{2}\Vert \boldsymbol{\beta}^{g}\Vert_{2}^{2}+\alpha \Vert \boldsymbol{\beta}^{g}\Vert_{1}\right)+\frac{\lambda_{D}}{2}\sum\limits_{h\neq g}\sum_{j=1}^{p}\vert \beta_{j}^{h}\beta_{j}^{g}\vert \right),

over grids for the penalty parameters \lambda_{S} and \lambda_{D} that are built automatically. Larger values of \lambda_{S} encourage more sparsity within the models, and larger values of \lambda_{D} encourage more diversity among them. If \lambda_{D}=0, then all of the models are equal to the Elastic Net regularized least squares estimator with penalty parameter \lambda_{S}. The optimal penalty parameters are chosen by cross-validation with num_folds folds, where the prediction of the ensemble is formed by simple averaging over the models. The predictors and the response are standardized to zero mean and unit variance before any computations are performed. The final output is on the original scale.
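To make the objective above concrete, the following base R sketch evaluates it for a given coefficient matrix. The helper `split_objective` and its argument names are hypothetical and not part of SplitReg; it assumes the coefficients are supplied as a p x G matrix with one column per model, and it ignores intercepts and standardization.

```r
# Hypothetical helper (not part of SplitReg): evaluates the objective
# from the Details section for a p x G coefficient matrix `betas`.
split_objective <- function(x, y, betas, lambda_s, lambda_d, alpha = 1) {
  n <- nrow(x)
  total <- 0
  for (g in seq_len(ncol(betas))) {
    b_g <- betas[, g]
    # Squared-error loss: (1 / 2n) * ||y - X beta^g||^2
    loss <- sum((y - x %*% b_g)^2) / (2 * n)
    # Elastic Net penalty: lambda_S * ((1 - alpha)/2 * ||b||_2^2 + alpha * ||b||_1)
    sparsity <- lambda_s * ((1 - alpha) / 2 * sum(b_g^2) + alpha * sum(abs(b_g)))
    # Diversity penalty: (lambda_D / 2) * sum_{h != g} sum_j |beta_j^h * beta_j^g|
    others <- betas[, -g, drop = FALSE]
    diversity <- lambda_d / 2 * sum(abs(b_g) * rowSums(abs(others)))
    total <- total + loss + sparsity + diversity
  }
  total
}

# Toy data: two observations, two predictors
x <- diag(2)
y <- c(1, 1)
disjoint <- cbind(c(1, 0), c(0, 1))  # the two models use different predictors
shared   <- cbind(c(1, 0), c(1, 0))  # both models use predictor 1

obj_disjoint <- split_objective(x, y, disjoint, lambda_s = 0, lambda_d = 1)
obj_shared   <- split_objective(x, y, shared,   lambda_s = 0, lambda_d = 1)
```

In this toy case both coefficient matrices give the same squared-error loss, but `obj_shared` is larger than `obj_disjoint` because the diversity term only penalizes predictors that are active in more than one model.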

Value

An object of class cv.SplitReg, a list with entries

`betas`	Coefficients computed over the path of penalties for sparsity; the penalty for diversity is fixed at the optimal value.
`intercepts`	Intercepts for each of the models along the path of penalties for sparsity.
`index_opt`	Index of the optimal penalty parameter for sparsity.
`lambda_sparsity_opt`	Optimal penalty parameter for sparsity.
`lambda_diversity_opt`	Optimal penalty parameter for diversity.
`lambdas_sparsity`	Grid of sparsity parameters.
`lambdas_diversity`	Grid of diversity parameters.
`cv_mse_opt`	Cross-validated mean squared error (CV MSE) at the optimal penalty parameters.
`call`	The matched call.
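The ensemble prediction by simple averaging, as described in Details, can be sketched in base R from per-model coefficients and intercepts. The helper `ensemble_predict` is hypothetical and not part of SplitReg; it assumes a p x G coefficient matrix and a length-G intercept vector for a single pair of penalty parameters, which may differ from the actual layout of `betas` and `intercepts` in the returned object.

```r
# Hypothetical helper (not part of SplitReg): averages the predictions
# of G linear models given as columns of `betas` with matching intercepts.
ensemble_predict <- function(newx, betas, intercepts) {
  # One column of predictions per model: intercept_g + newx %*% beta^g
  per_model <- sweep(newx %*% betas, 2, intercepts, `+`)
  # Simple average over the models
  rowMeans(per_model)
}

betas <- cbind(c(2, 0), c(0, 4))   # two models, two predictors
intercepts <- c(1, 3)
newx <- rbind(c(1, 1), c(0, 2))    # two new observations

preds <- ensemble_predict(newx, betas, intercepts)
```

Because each model is linear, averaging the per-model predictions gives the same result as predicting once with the averaged coefficients and the averaged intercept.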

Examples

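library(MASS)      # for mvrnorm()
library(SplitReg)
set.seed(1)
# True coefficients: 5 active predictors out of 50
beta <- c(rep(5, 5), rep(0, 45))
# Equicorrelated predictors with pairwise correlation 0.5
Sigma <- matrix(0.5, 50, 50)
diag(Sigma) <- 1
x <- mvrnorm(50, mu = rep(0, 50), Sigma = Sigma)
y <- x %*% beta + rnorm(50)
# Fit an ensemble of two models; the penalties are chosen by cross-validation
fit <- cv.SplitReg(x, y, num_models = 2)
# Extract the fitted coefficients
coefs <- predict(fit, type = "coefficients")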

[Package SplitReg version 1.0.2 Index]