cv.sprinter {sprintr}    R Documentation

Running sprinter with cross-validation

Description

The main cross-validation function, which selects the best sprinter fit along a path of tuning parameters.

Usage

cv.sprinter(x, y, num_keep = NULL, square = FALSE, lambda = NULL,
  nlam = 100, lam_min_ratio = ifelse(nrow(x) < ncol(x), 0.01, 1e-04),
  nfold = 5, foldid = NULL)

Arguments

x

An n by p design matrix of main effects. Each row is an observation of p main effects.

y

A response vector of size n.

num_keep

Number of candidate interactions to keep in Step 2. If num_keep is not specified (the default), it is set to [n / log n].
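
As a rough illustration of the default, the sketch below evaluates n / log n for n = 100 in base R. It assumes the brackets denote rounding to an integer; the exact rounding rule is not stated on this page, so the common candidates are shown side by side.

```r
# Sketch of the default num_keep = [n / log n].
# Assumption: the brackets mean "round to an integer"; which rounding rule
# sprintr applies is not documented here, so three candidates are computed.
n <- 100
raw <- n / log(n)   # log() is the natural logarithm in R
candidates <- c(floor = floor(raw), round = round(raw), ceiling = ceiling(raw))
print(raw)
print(candidates)
```

Whichever rule applies, the default keeps roughly 22 candidate interactions when n = 100.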

square

Indicator of whether squared effects should be fitted in Step 1. Defaults to FALSE.

lambda

A user-specified sequence of tuning parameter values. Defaults to NULL, in which case the program computes its own lambda path based on nlam and lam_min_ratio.

nlam

The number of lambda values. Default value is 100.

lam_min_ratio

The ratio of the smallest to the largest value in lambda. The largest value in lambda is usually the smallest value for which all coefficients are set to zero. Defaults to 1e-2 when n < p and to 1e-4 otherwise.
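
To make the roles of nlam and lam_min_ratio concrete, here is a minimal base-R sketch of how such a path could be built. It assumes a log-spaced, decreasing grid (a common convention for lasso-type paths) and uses a made-up lam_max; neither is confirmed by this page as what sprintr does internally.

```r
# Sketch of a lambda path, under stated assumptions.
# lam_max is a hypothetical value standing in for the smallest lambda at
# which all coefficients are zero; in practice it depends on the data.
n <- 100
p <- 200
lam_max <- 2.5
nlam <- 100
lam_min_ratio <- ifelse(n < p, 0.01, 1e-04)   # same rule as in Usage

# Assumption: values are evenly spaced on the log scale, decreasing from
# lam_max down to lam_max * lam_min_ratio.
lambda <- exp(seq(log(lam_max), log(lam_max * lam_min_ratio),
                  length.out = nlam))
print(range(lambda))
```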

nfold

Number of folds in cross-validation. Default value is 5. If each fold gets too few observations, a warning is thrown and the minimal nfold = 3 is used.

foldid

A vector of length n indicating which fold each observation belongs to. Defaults to NULL, in which case the program generates a random assignment.
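
A fold assignment of the required shape can be built by hand, for instance to reuse the same folds across several fits. The sketch below is one simple way to do it in base R; it is not necessarily how the package generates its own assignment.

```r
# Sketch: a balanced random fold assignment of length n.
set.seed(1)   # fixed seed so the assignment is reproducible
n <- 100
nfold <- 5

# Repeat the labels 1..nfold until the vector has length n, then shuffle.
foldid <- sample(rep(seq_len(nfold), length.out = n))
print(table(foldid))
```

The resulting vector can then be supplied through the foldid argument.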

Value

An object of S3 class "sprinter".

n

The sample size.

p

The number of main effects.

a0

Estimate of the intercept corresponding to the CV-selected model.

compact

A compact representation of the selected variables. compact has three columns: the first two give the indices of a selected variable (for main effects, the first index is 0), and the last gives the corresponding coefficient estimate.
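
The sketch below shows how a matrix in this layout might be split into main effects and interactions. The matrix is hand-made for illustration: its values and the column names are assumptions, not output from sprintr.

```r
# Hypothetical compact matrix; all numbers are made up for illustration.
compact <- rbind(
  c(0, 1,  0.9),   # main effect of variable 1
  c(0, 2, -1.8),   # main effect of variable 2
  c(1, 3,  2.7),   # interaction between variables 1 and 3
  c(4, 5, -3.6)    # interaction between variables 4 and 5
)
# Column names are assumptions added here for readability.
colnames(compact) <- c("index_1", "index_2", "coefficient")

# Main effects are the rows whose first index is 0.
is_main <- compact[, "index_1"] == 0
main_effects <- compact[is_main, , drop = FALSE]
interactions <- compact[!is_main, , drop = FALSE]
print(main_effects)
print(interactions)
```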

fit

The whole glmnet fit object in Step 3.

fitted

Fitted values of the response corresponding to the CV-selected model.

lambda

The sequence of lambda values used.

cvm

The averaged estimated prediction error on the test sets over K folds.

cvsd

The standard error of the estimated prediction error on the test sets over K folds.

foldid

Fold assignment. A vector of length n.

ibest

The index in lambda that is chosen by CV.
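
The relationship between cvm, cvsd and ibest can be sketched with simulated per-fold errors. The error matrix here is random and hypothetical, and the sketch assumes that CV selects the lambda minimizing cvm, which this page does not state explicitly.

```r
# Sketch: summarizing per-fold test errors across a lambda path.
set.seed(1)
nlam <- 10
nfold <- 5

# Hypothetical test errors: one row per lambda value, one column per fold.
fold_err <- matrix(rexp(nlam * nfold), nrow = nlam, ncol = nfold)

cvm <- rowMeans(fold_err)                       # mean error over the folds
cvsd <- apply(fold_err, 1, sd) / sqrt(nfold)    # standard error of that mean

# Assumption: the selected index is the one with the smallest mean error.
ibest <- which.min(cvm)
print(ibest)
```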

call

Function call.

See Also

predict.cv.sprinter

Examples

n <- 100
p <- 200
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 2 * x[, 2] + 3 * x[, 1] * x[, 3] - 4 * x[, 4] * x[, 5] + rnorm(n)
mod <- cv.sprinter(x = x, y = y)


[Package sprintr version 0.9.0 Index]