sgp.cv {SGPR} | R Documentation
Cross-validation for sparse group penalties
Description
Performs k-fold cross-validation for sparse group penalized regression models over a sequence of lambda values.
Usage
sgp.cv(
X,
y,
group = 1:ncol(X),
Z = NULL,
...,
nfolds = 10,
seed,
fold,
type,
returnY = FALSE,
print.trace = FALSE
)
Arguments
- X
The design matrix, without an intercept column, containing the variables subject to selection.
- y
The response vector.
- group
A vector indicating the group membership of each column of X. The default, 1:ncol(X), places each variable in its own group.
- Z
An optional design matrix of variables to be included in the model without penalization.
- ...
Additional arguments passed on to the underlying fitting functions (for example penalty, as used in the examples).
- nfolds
The number of folds for cross-validation (default 10).
- seed
A seed provided by the user for the random number generator, so that results can be reproduced.
- fold
A user-specified vector of fold assignments, one entry per observation. By default, observations are assigned to folds at random.
- type
A string indicating the type of regression model: "linear" or, for a binomial response, "logit" (the values used in the examples).
- returnY
A Boolean value indicating whether the fitted values should be returned.
- print.trace
A Boolean value indicating whether a message should be printed at the start of each fold.
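The fold argument takes one fold label per observation. As a minimal sketch in base R, assuming the labels are the integers 1 to nfolds, a balanced random assignment could be built as shown here. The helper make_folds is hypothetical and not part of SGPR, and the package's own default assignment may differ.

```r
# Hypothetical helper (not part of SGPR): assign n observations to
# nfolds folds of nearly equal size, in random order.
make_folds <- function(n, nfolds = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # make the assignment reproducible
  # rep() cycles through the labels 1..nfolds; sample() shuffles them
  sample(rep(seq_len(nfolds), length.out = n))
}

fold <- make_folds(n = 100, nfolds = 10, seed = 1)
table(fold)  # number of observations in each fold

# Assumed usage, with X, y_lin and g defined as in the Examples section
# and SGPR installed:
# cv_fit <- sgp.cv(X, y_lin, g, type = "linear", penalty = "sgl", fold = fold)
```

Passing the same fold vector to several calls lets different penalties be compared on identical data splits.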
Value
A list containing:
- cve
The average cross-validation error for each value of lambda.
- cvse
The estimated standard error for each value of cve.
- lambdas
The sequence of lambda values.
- fit
The sparse group penalty model fitted to the entire data.
- fold
The fold assignments for each observation for the cross-validation procedure.
- min
The index of lambda corresponding to the minimum cross-validation error.
- lambda.min
The value of lambda with the minimum cross-validation error.
- null.dev
The deviance for the empty model.
- pe
The cross-validation prediction error for each value of lambda (for binomial only).
- pred
The fitted values from the cross-validation folds.
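To illustrate how cve, cvse, min and lambda.min relate to one another, here is a sketch in base R on simulated losses. It assumes that cve is the column mean of a matrix of out-of-fold losses (one row per observation, one column per lambda) and that cvse is the column standard deviation divided by sqrt(n). This is a common convention, and sgp.cv may compute these quantities differently. The names loss, min_index and lambda_min are illustrative.

```r
set.seed(1)
n <- 50
lambdas <- c(1, 0.5, 0.25, 0.1)  # illustrative lambda sequence

# Simulated out-of-fold losses: one row per observation, one column
# per lambda. The third column is constructed to have the lowest mean.
loss <- sapply(c(4, 2, 1, 3), function(m) m + runif(n))

cve <- colMeans(loss)                 # average CV error per lambda
cvse <- apply(loss, 2, sd) / sqrt(n)  # standard error of each cve
min_index <- which.min(cve)           # index of the smallest cve
lambda_min <- lambdas[min_index]      # lambda value at that index
```

On a fitted object the corresponding components would be read directly, for example lin_fit$lambda.min.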
Examples
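# Generate data: n observations, p variables in groups of size nr
n <- 100
p <- 200
nr <- 10
g <- ceiling(1:p / nr)  # group labels 1, ..., p / nr
X <- matrix(rnorm(n * p), n, p)
b <- c(-3:3)  # coefficients of the first 7 variables (one of them is 0)
y_lin <- X[, 1:length(b)] %*% b + 5 * rnorm(n)
y_log <- rbinom(n, 1, plogis(y_lin))  # plogis(x) = exp(x) / (1 + exp(x))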
# Linear regression
lin_fit <- sgp.cv(X, y_lin, g, type = "linear", penalty = "sgl")
plot(lin_fit)
predict(lin_fit, extract = "vars")
lin_fit <- sgp.cv(X, y_lin, g, type = "linear", penalty = "sgs")
plot(lin_fit)
predict(lin_fit, extract = "vars")
lin_fit <- sgp.cv(X, y_lin, g, type = "linear", penalty = "sgm")
plot(lin_fit)
predict(lin_fit, extract = "vars")
lin_fit <- sgp.cv(X, y_lin, g, type = "linear", penalty = "sge")
plot(lin_fit)
predict(lin_fit, extract = "vars")
# Logistic regression
log_fit <- sgp.cv(X, y_log, g, type = "logit", penalty = "sgl")
plot(log_fit)
predict(log_fit, extract = "vars")
log_fit <- sgp.cv(X, y_log, g, type = "logit", penalty = "sgs")
plot(log_fit)
predict(log_fit, extract = "vars")
log_fit <- sgp.cv(X, y_log, g, type = "logit", penalty = "sgm")
plot(log_fit)
predict(log_fit, extract = "vars")
log_fit <- sgp.cv(X, y_log, g, type = "logit", penalty = "sge")
plot(log_fit)
predict(log_fit, extract = "vars")