| cv_partition {sparsediscrim} | R Documentation |
Randomly partitions data for cross-validation.
Description
For a vector of training labels, we return a list of cross-validation folds,
where each fold has the indices of the observations to leave out in the fold.
In terms of classification error rate estimation, one can think of a fold as
the observations to hold out as a test sample set. Either the hold_out
size or the number of folds, num_folds, can be specified. The number
of folds defaults to 10, but if the hold_out size is specified, then
num_folds is ignored.
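As a rough illustration of how the two arguments relate, the sketch below derives a fold count from a hold-out size. The helper `folds_from_hold_out` is hypothetical, and rounding up with `ceiling` is an assumption rather than something taken from the package source.

```r
# Hypothetical helper, not part of sparsediscrim: derive the number of folds
# implied by a hold-out size. Rounding up with ceiling() is an assumption.
folds_from_hold_out <- function(n, hold_out) {
  stopifnot(hold_out >= 1, hold_out <= n)
  ceiling(n / hold_out)
}

n <- length(iris$Species)     # 150 observations
folds_from_hold_out(n, 15)    # 10 folds of 15 observations each
```

Under this assumption, a hold-out size of 15 on the 150 iris observations corresponds to the default of 10 folds.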
Usage
cv_partition(y, num_folds = 10, hold_out = NULL, seed = NULL)
Arguments
y |
a vector of class labels |
num_folds |
the number of cross-validation folds. Ignored if
hold_out is specified. |
hold_out |
the hold-out size for cross-validation. See Details. |
seed |
optional random number seed for splitting the data for cross-validation |
Details
We partition the vector y based on its length, which we treat as the
sample size, 'n'. If an object other than a vector is used in y, its
length can yield unexpected results. For example, the output of
length(diag(3)) is 9.
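The pitfall can be seen directly in base R. The snippet below compares `length` on a matrix and on a label vector, and shows one possible way to check for a dimensioned object before partitioning; the `dim` check is only a suggestion and is not necessarily performed by `cv_partition` itself.

```r
m <- diag(3)
length(m)          # 9: length() counts every element of the 3 x 3 matrix
nrow(m)            # 3: the number of rows, which is usually the intended n

y <- iris$Species
length(y)          # 150: one label per observation

# A plain vector or factor has no dim attribute; a matrix does.
is.null(dim(y))    # TRUE
is.null(dim(m))    # FALSE
```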
Value
A list containing the indices of the training and test observations for each fold.
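To make the shape of such a result concrete, here is a minimal base R sketch of a random partition into folds. The function `sketch_cv_partition` and the element names `training` and `test` are assumptions chosen for illustration; this is not the package's implementation, and its folds will not necessarily match those returned by `cv_partition`.

```r
# Hypothetical sketch, not the sparsediscrim implementation.
sketch_cv_partition <- function(y, num_folds = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- length(y)
  shuffled <- sample(seq_len(n))              # random order of the indices
  fold_id <- rep_len(seq_len(num_folds), n)   # assign fold labels in turn
  test_sets <- split(shuffled, fold_id)       # one test index set per fold
  lapply(test_sets, function(test) {
    # Element names are assumed for illustration.
    list(training = setdiff(seq_len(n), test), test = test)
  })
}

folds <- sketch_cv_partition(iris$Species, num_folds = 10, seed = 42)
length(folds)                                   # 10
sapply(folds, function(f) length(f$test))       # 15 test observations per fold
```

Each observation index lands in exactly one test set, and a fold's training indices are simply the complement of its test indices.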
Examples
# The following three calls to `cv_partition` yield the same partitions.
set.seed(42)
cv_partition(iris$Species)
cv_partition(iris$Species, num_folds = 10, seed = 42)
cv_partition(iris$Species, hold_out = 15, seed = 42)