bsrr {bestridge}  R Documentation 
Best subset ridge regression for generalized linear models and Cox's proportional hazards model.
bsrr(
  x,
  y,
  family = c("gaussian", "binomial", "poisson", "cox"),
  method = c("pgsection", "sequential", "psequential"),
  tune = c("gic", "ebic", "bic", "aic", "cv"),
  s.list,
  lambda.list = 0,
  s.min,
  s.max,
  lambda.min = 0.001,
  lambda.max = 100,
  nlambda = 100,
  always.include = NULL,
  screening.num = NULL,
  normalize = NULL,
  weight = NULL,
  max.iter = 20,
  warm.start = TRUE,
  nfolds = 5,
  group.index = NULL,
  seed = NULL
)
x 
Input matrix, of dimension n \times p; each row is an observation vector and each column is a predictor/feature/variable. 
y 
The response variable, of n observations. Its type should match family: quantitative for family = "gaussian", a two-level factor or 0/1 vector for family = "binomial", a non-negative count for family = "poisson", and a two-column matrix with columns named "time" and "status" for family = "cox". 
family 
One of the following models: "gaussian", "binomial", "poisson", or "cox". 
method 
The method to be used to select the optimal model size and L_2 shrinkage. For method = "sequential", the problem is solved for each model size in s.list and each lambda in lambda.list. For method = "psequential" and method = "pgsection", the Powell method with a sequential line search or a golden-section search, respectively, is used to determine the two tuning parameters. 
tune 
The criterion for choosing the model size and L_2 shrinkage
parameters. Available options are "gic", "ebic", "bic", "aic" and "cv". Default is "gic". 
s.list 
An increasing list of sequential values representing the model
sizes. Only used for method = "sequential". 
lambda.list 
A lambda sequence for method = "sequential". Default is 0. 
s.min 
The minimum value of model sizes. Only used for the Powell path methods, method = "psequential" and method = "pgsection". 
s.max 
The maximum value of model sizes. Only used for the Powell path methods, method = "psequential" and method = "pgsection". 
lambda.min 
The minimum value of lambda. Only used for the Powell path methods, method = "psequential" and method = "pgsection". Default is 0.001. 
lambda.max 
The maximum value of lambda. Only used for the Powell path methods, method = "psequential" and method = "pgsection". Default is 100. 
nlambda 
The number of λs for the Powell path with sequential line search method.
Only valid for method = "psequential". Default is 100. 
always.include 
An integer vector containing the indices of variables that should always be included in the model. 
screening.num 
Users can pre-exclude some irrelevant variables according to maximum marginal likelihood estimators before fitting a
model by passing an integer to screening.num, the number of variables to retain after screening. 
normalize 
Options for normalization. 
weight 
Observation weights. Default is NULL. 
max.iter 
The maximum number of iterations in the PDAS algorithm. Default is 20. 
warm.start 
Whether to use the last solution as a warm start. Default
is TRUE. 
nfolds 
The number of folds in cross-validation. Default is 5. 
group.index 
A vector of integers indicating which group each variable belongs to.
Variables in the same group should be located in adjacent columns of x. 
seed 
Seed to be used to divide the sample into K cross-validation folds. Default is NULL. 
The best ridge regression problem with model size s and shrinkage parameter λ is
\min_β -2 \log L(β) + λ\Vert β \Vert_2^2 \;\;{\rm s.t.}\;\; \Vert β \Vert_0 ≤ s.
In the GLM case, \log L(β) is the log-likelihood function; in the Cox model, \log L(β) is the log partial likelihood function.
The best subset selection problem is a special case of the best ridge regression problem with the shrinkage λ=0.
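A minimal sketch of this special case (assuming the bestridge package and its gen.data helper, as used in the Examples below): fixing the shrinkage at zero via lambda.list = 0 makes bsrr perform plain best subset selection.

```r
# Sketch: with lambda fixed at 0, bsrr reduces to best subset selection.
library(bestridge)                 # assumes the bestridge package is installed
Data <- gen.data(n = 200, p = 20, k = 5, rho = 0.4,
                 family = "gaussian", seed = 1)
fit.bss <- bsrr(Data$x, Data$y, method = "sequential",
                s.list = 1:10, lambda.list = 0)
coef(fit.bss)   # the nonzero coefficients form the selected subset
```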
For each candidate model size and λ, the best subset ridge regression
problem is solved by the L_2 penalized primal-dual active set (PDAS) algorithm; see Wen et
al (2020) for details. This algorithm
utilizes an active set updating strategy via primal and dual variables and
fits the submodels by exploiting the fact that their support sets are
non-overlapping and complementary. For the case of method = "sequential",
if warm.start = TRUE, we run the PDAS algorithm for a list of
sequential model sizes and use the estimate from the last iteration as a
warm start. For the case of method = "psequential"
and method = "pgsection", the Powell method
using a sequential line search method or a golden section search technique, respectively, is
used for parameter determination.
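The two search strategies described above can be sketched side by side (a hedged illustration, assuming the bestridge package; the arguments follow the Usage section):

```r
library(bestridge)                 # assumes the bestridge package is installed
Data <- gen.data(n = 200, p = 20, k = 5, rho = 0.4,
                 family = "gaussian", seed = 1)
# Sequential search: solve the problem on explicit grids of s and lambda.
fit.seq <- bsrr(Data$x, Data$y, method = "sequential", tune = "gic",
                s.list = 1:10,
                lambda.list = exp(seq(log(5), log(0.1), length.out = 10)))
# Powell path with golden-section line search over [lambda.min, lambda.max].
fit.pg <- bsrr(Data$x, Data$y, method = "pgsection", tune = "gic",
               s.min = 1, s.max = 10, lambda.min = 0.001, lambda.max = 100)
```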
A list with class attribute 'bsrr' and named components:
beta 
The best fitting coefficients. 
coef0 
The best fitting intercept. 
loss 
The training loss of the best fitting model. 
ic 
The information criterion of the best fitting model when model selection is based on a certain information criterion. 
cvm 
The mean crossvalidated error for the best fitting model when model selection is based on the crossvalidation. 
lambda 
The lambda chosen for the best fitting model. 
beta.all 
The fitted coefficients of every candidate model visited during the search, one set per candidate model size and lambda. 
coef0.all 
The fitted intercept of every candidate model visited during the search. 
loss.all 
The training loss of every candidate model visited during the search. 
ic.all 
The information criterion value of every candidate model, when model selection is based on a certain information criterion. 
cvm.all 
The mean cross-validated error of every candidate model, when model selection is based on cross-validation. 
lambda.all 
The lambda used at each step of the search. 
family 
Type of the model. 
s.list 
The input model size sequence s.list. 
nsample 
The sample size. 
type 
Either "bss" (best subset selection, fitted with λ = 0) or "bsrr" (best subset ridge regression). 
method 
Method used for tuning parameter selection. 
ic.type 
The criterion of model selection. 
Liyuan Hu, Kangkang Jiang, Yanhang Zhang, Jin Zhu, Canhong Wen and Xueqin Wang.
plot.bsrr, summary.bsrr, coef.bsrr, predict.bsrr.
# linear model
# Generate simulated data
n <- 200
p <- 20
k <- 5
rho <- 0.4
seed <- 10
Tbeta <- rep(0, p)
Tbeta[1:k * floor(p/k)] <- rep(1, k)
Data <- gen.data(n, p, k, rho, family = "gaussian", beta = Tbeta, seed = seed)
x <- Data$x[1:140, ]
y <- Data$y[1:140]
x_new <- Data$x[141:200, ]
y_new <- Data$y[141:200]
lambda.list <- exp(seq(log(5), log(0.1), length.out = 10))
lm.bsrr <- bsrr(x, y, method = "pgsection")
coef(lm.bsrr)
print(lm.bsrr)
summary(lm.bsrr)
pred.bsrr <- predict(lm.bsrr, newx = x_new)
# generate plots
plot(lm.bsrr)

# logistic model
# Generate simulated data
Data <- gen.data(n, p, k, rho, family = "binomial", beta = Tbeta, seed = seed)
x <- Data$x[1:140, ]
y <- Data$y[1:140]
x_new <- Data$x[141:200, ]
y_new <- Data$y[141:200]
lambda.list <- exp(seq(log(5), log(0.1), length.out = 10))
logi.bsrr <- bsrr(x, y, family = "binomial", lambda.list = lambda.list)
coef(logi.bsrr)
print(logi.bsrr)
summary(logi.bsrr)
pred.bsrr <- predict(logi.bsrr, newx = x_new)
# generate plots
plot(logi.bsrr)

# poisson model
Data <- gen.data(n, p, k, rho = 0.3, family = "poisson", beta = Tbeta, seed = seed)
x <- Data$x[1:140, ]
y <- Data$y[1:140]
x_new <- Data$x[141:200, ]
y_new <- Data$y[141:200]
lambda.list <- exp(seq(log(5), log(0.1), length.out = 10))
poi.bsrr <- bsrr(x, y, family = "poisson", lambda.list = lambda.list)
coef(poi.bsrr)
print(poi.bsrr)
summary(poi.bsrr)
pred.bsrr <- predict(poi.bsrr, newx = x_new)
# generate plots
plot(poi.bsrr)

# Cox model
# Generate simulated data
Data <- gen.data(n, p, k, rho, family = "cox", scal = 10, beta = Tbeta)
x <- Data$x[1:140, ]
y <- Data$y[1:140, ]
x_new <- Data$x[141:200, ]
y_new <- Data$y[141:200, ]
lambda.list <- exp(seq(log(5), log(0.1), length.out = 10))
cox.bsrr <- bsrr(x, y, family = "cox", lambda.list = lambda.list)
coef(cox.bsrr)
print(cox.bsrr)
summary(cox.bsrr)
pred.bsrr <- predict(cox.bsrr, newx = x_new)
# generate plots
plot(cox.bsrr)

# High dimensional linear models
## Not run: 
data <- gen.data(n, p = 1000, k, family = "gaussian", seed = seed)
# Best subset selection with SIS screening
lm.high <- bsrr(data$x, data$y, screening.num = 100)

## End(Not run)

# group selection
beta <- rep(c(rep(1, 2), rep(0, 3)), 4)
Data <- gen.data(200, 20, 5, rho = 0.4, beta = beta, seed = 10)
x <- Data$x
y <- Data$y
group.index <- c(rep(1, 2), rep(2, 3), rep(3, 2), rep(4, 3),
                 rep(5, 2), rep(6, 3), rep(7, 2), rep(8, 3))
lm.groupbsrr <- bsrr(x, y, s.min = 1, s.max = 8, group.index = group.index)
coef(lm.groupbsrr)
print(lm.groupbsrr)
summary(lm.groupbsrr)
pred.groupl0l2 <- predict(lm.groupbsrr, newx = x_new)

# include specified variables
Data <- gen.data(n, p, k, rho, family = "gaussian", beta = Tbeta, seed = seed)
lm.bsrr <- bsrr(Data$x, Data$y, always.include = 2)