rarefit {rare} | R Documentation |
Fit the rare feature selection model
Description
Fit the rare feature selection model proposed in Yan and Bien (2018)
using an alternating direction method of multipliers (ADMM) algorithm
described in Algorithm 1 of the same paper.
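For orientation, the optimization problem can be sketched as follows. This form is reconstructed from the paper rather than from this page, so treat the scaling of the loss term in particular as an assumption. Here 1_n denotes the length-n vector of ones, gamma_{-root} is gamma without its root-node entry, and A is the binary tree matrix listed under Value.

```latex
\min_{\beta_0,\,\beta,\,\gamma}\;
  \frac{1}{2n}\,\lVert y - X\beta - \beta_0 \mathbf{1}_n \rVert_2^2
  + \lambda \left( \alpha\,\lVert \gamma_{-\mathrm{root}} \rVert_1
  + (1-\alpha)\,\lVert \beta \rVert_1 \right)
\quad \text{subject to} \quad \beta = A\gamma
```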
The regularization path is computed over a two-dimensional grid of
regularization parameters: lambda and alpha. Of the two, lambda controls
the overall amount of regularization, and alpha controls the tradeoff
between sparsity and fusion of beta (larger alpha induces more fusion in beta).
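To make the two-dimensional grid concrete, here is a minimal sketch of how such a grid could be laid out from nlam, lam.min.ratio and nalpha. The helper make_grid, the log spacing of lambda, the [0, 1] range of alpha, and the lam_max input are all assumptions made for illustration; the package may construct its defaults differently.

```r
# Hypothetical helper (not part of the rare package): builds a lambda/alpha grid.
# Log-spaced, decreasing lambda and alpha spanning [0, 1] are assumptions.
make_grid <- function(lam_max, nlam = 50, lam.min.ratio = 1e-4, nalpha = 10) {
  lambda <- exp(seq(log(lam_max), log(lam_max * lam.min.ratio),
                    length.out = nlam))
  alpha <- seq(0, 1, length.out = nalpha)
  list(lambda = lambda, alpha = alpha)
}

grid <- make_grid(lam_max = 1)
# Number of (lambda, alpha) pairs at which a model would be fit
n_pairs <- length(grid$lambda) * length(grid$alpha)
```

With the default sizes this gives 50 * 10 = 500 pairs, which is why the per-pair iteration cap (maxite) and tolerances matter for total run time.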
Usage
rarefit(y, X, A = NULL, Q = NULL, hc, intercept = T, lambda = NULL,
alpha = NULL, nlam = 50, lam.min.ratio = 1e-04, nalpha = 10,
rho = 0.01, eps1 = 1e-06, eps2 = 1e-05, maxite = 1e+06)
Arguments
y |
Length- |
X |
|
A |
|
Q |
|
hc |
An |
intercept |
Whether an intercept should be fitted (default = TRUE) or set to zero (FALSE). |
lambda |
A user-supplied |
alpha |
A user-supplied |
nlam |
Number of lambda values (default = 50). |
lam.min.ratio |
Smallest value for lambda, as a fraction of the largest lambda value (default = 1e-04). |
nalpha |
Number of alpha values (default = 10). |
rho |
Penalty parameter for the quadratic penalty in the ADMM algorithm.
The default value is 0.01. |
eps1 |
Convergence threshold in terms of the absolute tolerance level
for the ADMM algorithm. The default value is 1e-06. |
eps2 |
Convergence threshold in terms of the relative tolerance level
for the ADMM algorithm. The default value is 1e-05. |
maxite |
Maximum number of passes over the data for every pair of
(lambda, alpha) values. The default value is 1e+06. |
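The roles of rho, eps1 and eps2 can be illustrated with the textbook ADMM stopping rule for a constraint of the form x - z = 0 (Boyd et al.'s primal/dual residual test). This is a sketch of that generic rule only; the function admm_converged and its arguments are hypothetical, and the exact criterion used inside rarefit is not stated on this page.

```r
# Hypothetical helper: generic ADMM stopping test for the constraint x - z = 0,
# with scaled dual variable u. Not taken from the rare package.
admm_converged <- function(x, z, z_old, u, rho, eps_abs = 1e-6, eps_rel = 1e-5) {
  l2 <- function(v) sqrt(sum(v^2))
  r <- x - z                 # primal residual: constraint violation
  s <- rho * (z - z_old)     # dual residual: change in z between iterations
  # Tolerances combine an absolute part and a part relative to iterate size
  eps_pri  <- sqrt(length(x)) * eps_abs + eps_rel * max(l2(x), l2(z))
  eps_dual <- sqrt(length(x)) * eps_abs + eps_rel * rho * l2(u)
  l2(r) <= eps_pri && l2(s) <= eps_dual
}

# Constraint satisfied and z unchanged: both residuals are zero
done <- admm_converged(x = c(1, 2), z = c(1, 2), z_old = c(1, 2),
                       u = c(0, 0), rho = 0.01)
# Large constraint violation: the primal residual exceeds its tolerance
not_done <- admm_converged(x = c(1, 2), z = c(0, 0), z_old = c(0, 0),
                           u = c(0, 0), rho = 0.01)
```

Under a rule of this shape, loosening eps1 or eps2 trades accuracy for fewer iterations, while maxite caps the work when the tolerances are not reached.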
Details
The function splits the model-fitting path by alpha. At each alpha value,
the model is fit on the entire sequence of lambda with warm start. We recommend
including an intercept (by setting intercept = TRUE) unless the input data have been
centered.
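The path structure described above (outer loop over alpha, inner loop over lambda with warm start) can be sketched as follows. To keep the example runnable, the inner solver is swapped for a toy proximal-gradient lasso solver with effective penalty lambda * (1 - alpha); it is not the ADMM algorithm of the package. The names soft, toy_solver and fit_path are hypothetical, and the decreasing order of lambda is an assumption.

```r
# Soft-thresholding operator (proximal map of the l1 norm)
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Hypothetical stand-in solver: proximal gradient for
# 0.5 * ||y - X b||^2 + pen * ||b||_1, started from `init`
toy_solver <- function(X, y, pen, init, maxit = 500, tol = 1e-8) {
  step <- 1 / max(eigen(crossprod(X), symmetric = TRUE,
                        only.values = TRUE)$values)
  b <- init
  for (it in seq_len(maxit)) {
    grad <- drop(crossprod(X, X %*% b - y))
    b_new <- soft(b - step * grad, step * pen)
    if (max(abs(b_new - b)) < tol) { b <- b_new; break }
    b <- b_new
  }
  b
}

# Outer loop over alpha; inner loop over lambda reuses the previous solution
fit_path <- function(X, y, lambda, alpha) {
  lapply(alpha, function(a) {
    coefs <- matrix(0, ncol(X), length(lambda))
    b <- rep(0, ncol(X))                 # cold start only at the first lambda
    for (j in seq_along(lambda)) {
      b <- toy_solver(X, y, pen = lambda[j] * (1 - a), init = b)  # warm start
      coefs[, j] <- b
    }
    coefs
  })
}

set.seed(1)
X <- matrix(rnorm(50 * 5), 50, 5)
y <- drop(X %*% c(2, 0, 0, -1, 0)) + rnorm(50, sd = 0.1)
path <- fit_path(X, y, lambda = c(10, 1, 0.1), alpha = c(0, 0.5))
```

The result mirrors the layout described under Value: one list item per alpha value, each holding a matrix with one column per lambda value.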
Value
Returns regression coefficients for beta and gamma and the intercept beta0.
We use a matrix-nested-within-list structure to store the coefficients: each list
item corresponds to an alpha value; the matrix (or vector) in that list item stores
coefficients at various lambda values by columns (or entries).
beta0 |
Length- |
beta |
Length- |
gamma |
Length- |
lambda |
Sequence of |
alpha |
Sequence of |
A |
Binary matrix encoding ancestor-descendant relationship between leaves and nodes in the tree. |
Q |
Matrix with columns forming an orthonormal basis for the null space of |
intercept |
Whether an intercept is included in model fit. |
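As an illustration of the matrix-nested-within-list layout, the sketch below builds a mock object with the same shape and extracts the coefficients for one (alpha, lambda) pair. The object mock_fit, its sizes, and the placeholder values are invented for the example; a real fit returned by rarefit would be indexed the same way under the layout described above. The final line assumes the usual linear-model prediction, intercept plus X times beta.

```r
# Mock object mimicking the described layout (placeholder values, not real fits)
nalpha <- 2; nlam <- 3; p <- 4
mock_fit <- list(
  beta0 = replicate(nalpha, seq_len(nlam) / 10, simplify = FALSE),
  beta  = replicate(nalpha, matrix(1, p, nlam), simplify = FALSE)
)

i <- 2  # index into the alpha sequence
j <- 3  # index into the lambda sequence
b0_ij <- mock_fit$beta0[[i]][j]    # intercept: j-th entry of the i-th vector
b_ij  <- mock_fit$beta[[i]][, j]   # coefficients: j-th column of the i-th matrix

# Linear prediction for new rows (assumed form: intercept + X %*% beta)
X_new <- matrix(1, 2, p)
yhat <- b0_ij + drop(X_new %*% b_ij)
```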
References
Yan, X. and Bien, J. (2018) Rare Feature Selection in High Dimensions, https://arxiv.org/abs/1803.06675.
See Also
Examples
## Not run:
# See vignette for more details.
set.seed(100)
ts <- sample(seq_along(data.rating), 400) # Train set indices
# Fit the model on train set
ourfit <- rarefit(y = data.rating[ts], X = data.dtm[ts, ], hc = data.hc,
                  lam.min.ratio = 1e-6, nlam = 20, nalpha = 10, rho = 0.01,
                  eps1 = 1e-5, eps2 = 1e-5, maxite = 1e4)
## End(Not run)