repeatcv {nestedcv} | R Documentation |
Repeated nested CV
Description
Performs repeated calls to a nestedcv
model to determine performance across
repeated runs of nested CV.
Usage
repeatcv(
  expr,
  n = 5,
  repeat_folds = NULL,
  keep = TRUE,
  extra = FALSE,
  progress = TRUE,
  rep.cores = 1L
)
Arguments
expr | An expression containing a call to |
n | Number of repeats |
repeat_folds | Optional list containing fold indices to be applied to the outer CV folds. |
keep | Logical whether to save repeated outer CV predictions for ROC curves etc. |
extra | Logical whether additional performance metrics are gathered for binary classification models. See |
progress | Logical whether to show progress. |
rep.cores | Integer specifying number of cores/threads to invoke. |
Details
We recommend using this with the R pipe |> (see examples).
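As a rough sketch of why the pipe fits here: the native pipe (R 4.1 or later) is rewritten by the parser into an ordinary nested call, so the piped model call ends up as the first argument, expr. The snippet below only quotes the two forms without evaluating them, so it does not need nestedcv loaded; the argument values are illustrative.

```r
# Quote both forms without evaluating them (|> requires R >= 4.1)
piped <- quote(nestcv.glmnet(y, x, family = "multinomial") |> repeatcv(3))
nested <- quote(repeatcv(nestcv.glmnet(y, x, family = "multinomial"), 3))

# Compare the two parsed calls; the pipe is pure syntax, so they should match
same_call <- identical(piped, nested)
print(same_call)
```

The piped form keeps the model call readable on its own line, with the repeat settings following it.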
When comparing models, we recommend fixing the sets of outer CV folds used
in each repeat, so that every model is evaluated on the same splits. The
function repeatfolds() can be used to create a fixed set of outer CV folds
for each repeat.
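A minimal sketch of such a comparison, assuming the nestedcv package is installed: two glmnet models that differ only in alphaSet are run over the same fold set. The object names (res_lasso, res_enet) and the choice of alphaSet = 0.5 for the second model are illustrative assumptions, and rep.cores is left at 1 so the sketch also runs on Windows.

```r
library(nestedcv)

data("iris")
y <- iris$Species
x <- iris[, 1:4]

# One fixed set of outer CV folds per repeat, shared by both models
set.seed(123, "L'Ecuyer-CMRG")
folds <- repeatfolds(y, repeats = 3, n_outer_folds = 4)

# Model 1: lasso (alphaSet = 1), as in the Examples section
res_lasso <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                           n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds, progress = FALSE)

# Model 2: elastic net (alphaSet = 0.5), evaluated on the same folds
res_enet <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 0.5,
                          n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds, progress = FALSE)

summary(res_lasso)
summary(res_enet)
```

Because both runs reuse folds, differences between the two summaries are not caused by different outer splits.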
Parallelisation over repeats is performed using parallel::mclapply (not
available on Windows). Beware that cv.cores can still be set within calls
to nestedcv models (= nested parallelisation). This means that
rep.cores x cv.cores processes/forks will be spawned, so be careful not to
overload your CPU. In general, parallelisation of repeats using rep.cores
is faster than parallelisation using cv.cores.
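The process count above is simple arithmetic, and one way to stay within a core budget can be sketched as follows. The helper plan_cores() is hypothetical and not part of nestedcv; it favours rep.cores, following the note that parallelising repeats is generally faster, and gives any remaining capacity to cv.cores.

```r
# Hypothetical helper, not part of nestedcv: split a core budget between
# repeats (rep.cores) and outer CV folds (cv.cores).
plan_cores <- function(n_repeats, available = parallel::detectCores()) {
  # detectCores() can return NA; fall back to a single core
  if (is.na(available) || available < 1) available <- 1
  # Favour rep.cores, but never use more cores than there are repeats
  rep_cores <- as.integer(min(n_repeats, available))
  # Give remaining capacity to cv.cores, keeping the product within budget
  cv_cores <- as.integer(max(1, available %/% rep_cores))
  list(rep.cores = rep_cores, cv.cores = cv_cores,
       total = rep_cores * cv_cores)
}

# Example: 3 repeats on a machine assumed to have 8 cores
plan <- plan_cores(n_repeats = 3, available = 8)
str(plan)
```

The total element is the number of processes that rep.cores x cv.cores would spawn. On Windows, where mclapply is not available, rep.cores would be left at 1 regardless of what this helper suggests.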
Value
List of S3 class 'repeatcv' containing:
call | the model call |
result | matrix of performance metrics |
output | (if |
roc | (binary classification models only) a ROC curve object based on predictions across all repeats as returned in |
Examples
data("iris")
dat <- iris
y <- dat$Species
x <- dat[, 1:4]
res <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                     n_outer_folds = 4) |>
  repeatcv(3, rep.cores = 2)
res
summary(res)
## set up fixed fold indices
set.seed(123, "L'Ecuyer-CMRG")
folds <- repeatfolds(y, repeats = 3, n_outer_folds = 4)
res <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                     n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds, rep.cores = 2)
res