crossval {quadrupen} | R Documentation |
Cross-validation function for quadrupen fitting methods.
Description
Function that computes the K-fold (double) cross-validated error of a quadrupen fit. If no lambda2 is provided, simple cross-validation on the lambda1 parameter is performed. If a vector lambda2 is passed as an argument, double cross-validation is performed.
Usage
crossval(
x,
y,
penalty = c("elastic.net", "bounded.reg"),
K = 10,
folds = split(sample(1:nrow(x)), rep(1:K, length = nrow(x))),
lambda2 = 0.01,
verbose = TRUE,
mc.cores = 2,
...
)
Arguments
x |
matrix of features, possibly sparsely encoded (experimental). Do NOT include intercept. |
y |
response vector. |
penalty |
a string for the fitting procedure used for
cross-validation. Either "elastic.net" or "bounded.reg". |
K |
integer indicating the number of folds. Default is 10. |
folds |
list of K vectors of row indices of x, one vector per fold, as built by the default shown in Usage. |
lambda2 |
tunes the |
verbose |
logical; indicates whether the progression (the current
lambda2) should be displayed. Default is TRUE. |
mc.cores |
the number of cores to use. The default uses 2 cores. |
... |
additional parameters to overwrite the defaults of the
fitting procedure identified by the penalty argument. |
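As an illustration of the folds argument, the default expression shown in Usage can be evaluated by hand to obtain reproducible folds. This is a minimal sketch in base R: the seed and the sizes n and K are arbitrary choices made for the example, and the final call is commented out because it needs the quadrupen package and the data x and y.

```r
## Build K folds by hand, mirroring the default of the 'folds' argument.
set.seed(42)   # arbitrary seed (an assumption, only for reproducibility)
n <- 100       # number of observations, i.e. nrow(x)
K <- 10        # number of folds

## Shuffle the row indices, then deal them into K groups
folds <- split(sample(1:n), rep(1:K, length = n))

## Every row index appears in exactly one fold
stopifnot(length(folds) == K,
          identical(sort(unname(unlist(folds))), 1:n))

## The same folds can then be reused across several calls, e.g.
## cv <- crossval(x, y, folds = folds, lambda2 = 0.2)
```

Reusing one list of folds across calls makes cross-validated errors comparable between runs, since every run then sees the same train/test splits.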
Value
An object of class "cvpen" for which a plot method is available.
Note
If the user runs the fitting method with option 'bulletproof' set to FALSE, the algorithm may stop at an early stage of the path. Early stops are handled internally, in order to provide results on the same grid of penalty levels tuned by lambda1. This is done by means of NA values, so that the mean and standard error are evaluated consistently. If, while cross-validating, the procedure experiences too many early stops, a warning is sent to the user, in which case you should reconsider the grid of lambda1 used for the cross-validation. If bulletproof is TRUE (the default), there is nothing to worry about, except a possible slowdown whenever a switch to the proximal algorithm is required.
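The idea of padding early stops with NA values can be sketched in base R. The error matrix below is made up for illustration, and the formulas are one plausible way of averaging over the folds that did reach each grid point; they are an assumption, not the package's internal code.

```r
## Hypothetical per-fold errors: rows = folds, columns = lambda1 grid.
## NA marks grid points that a fold never reached because of an early stop.
err <- rbind(c(1.0, 0.8, 0.9),
             c(1.2, 0.6, NA),
             c(0.8, 1.0, NA))

## Number of folds that actually contribute at each grid point
n.ok <- colSums(!is.na(err))

## Mean and standard error computed over the available folds only
cv.mean <- colMeans(err, na.rm = TRUE)
cv.se   <- apply(err, 2, sd, na.rm = TRUE) / sqrt(n.ok)

## A grid point reached by a single fold has no standard error (NA)
print(rbind(n.ok, cv.mean, cv.se))
```

Grid points that few folds reach give unreliable estimates, which is consistent with the advice above to reconsider the lambda1 grid when many early stops occur.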
See Also
quadrupen, plot,cvpen-method and cvpen.
Examples
library(quadrupen)
library(Matrix) ## provides bdiag()
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## block correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo) + 0.1
diag(Sigma) <- 1
n <- 100
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Use fewer lambda1 values by overwriting the default parameters
## and cross-validate over the sequences lambda1 and lambda2
cv.double <- crossval(x,y, lambda2=10^seq(2,-2,len=50), nlambda1=50)
## Rerun simple cross-validation with the appropriate lambda2
cv.10K <- crossval(x,y, lambda2=0.2)
## Try leave one out also
cv.loo <- crossval(x,y, K=n, lambda2=0.2)
plot(cv.double)
plot(cv.10K)
plot(cv.loo)
## Performance for selection purpose: count the coefficients whose
## estimated sign differs from the true one
beta.min.10K <- slot(cv.10K, "beta.min")
beta.min.loo <- slot(cv.loo, "beta.min")
cat("\nSign errors with the minimal 10-CV choice: ", sum(sign(beta) != sign(beta.min.10K)))
cat("\nSign errors with the minimal LOO-CV choice: ", sum(sign(beta) != sign(beta.min.loo)))
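The count sum(sign(beta) != sign(beta.min)) used above lumps together several kinds of selection error. A minimal base R sketch of how they could be separated is given below; the two small vectors are made up for illustration and are not the output of any fit.

```r
## Toy vectors (hypothetical values, not produced by crossval)
beta.true <- c(0, 1, 0, -1, 0)     # true coefficients
beta.hat  <- c(0.2, 0.8, 0, 0, 0)  # estimated coefficients

## False positives: truly null coefficients that were selected
false.pos <- sum(beta.true == 0 & beta.hat != 0)
## False negatives: truly active coefficients that were missed
false.neg <- sum(beta.true != 0 & beta.hat == 0)
## Any sign disagreement (false positives, false negatives, flipped signs)
sign.err  <- sum(sign(beta.true) != sign(beta.hat))

cat("False positives:", false.pos,
    "\nFalse negatives:", false.neg,
    "\nSign errors:    ", sign.err, "\n")
```

With real results, beta.hat would be replaced by slot(cv.10K, "beta.min") or slot(cv.loo, "beta.min"), possibly wrapped in as.numeric() if the slot holds a sparse vector.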