segmkf {rchemo} | R Documentation |
Segments for cross-validation
Description
Build segments of observations for K-Fold or "test-set" cross-validation (CV).
The CV can optionally be randomly repeated. For each repetition:
- K-fold CV - Function segmkf returns the K segments.
- Test-set CV - Function segmts returns a segment (of a given length) randomly sampled from the dataset.
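As a rough illustration of the two schemes, the base-R sketch below builds K random segments and one random test segment. It does not call rchemo: the helpers kfold_sketch and testset_sketch are hypothetical names introduced here, and the package's internal algorithm may differ.

```r
# Hypothetical helpers (not part of rchemo): a base-R sketch of the two schemes.

# K-fold: shuffle the indexes 1..n, then deal them into K segments.
kfold_sketch <- function(n, K) {
  idx <- sample(n)                         # random permutation of 1..n
  fold <- rep(seq_len(K), length.out = n)  # fold label for each shuffled index
  segm <- split(idx, fold)                 # list of K integer vectors
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

# Test-set: draw one segment of m indexes without replacement.
testset_sketch <- function(n, m) {
  list(segm1 = sample(n, m))
}

set.seed(1)
kf <- kfold_sketch(n = 10, K = 3)    # three segments that together cover 1..10
ts <- testset_sketch(n = 10, m = 3)  # one segment of 3 distinct indexes
```

In the K-fold sketch every observation lands in exactly one segment, whereas the test-set sketch leaves the remaining n - m observations for training.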
CV of blocks
Argument y allows sampling blocks of observations instead of individual observations. This can be required when there are repetitions in the data. In such a situation, CV should account for the repetition level (otherwise, the error rates are in general strongly underestimated). To implement such a CV, y must be a vector (of length n) defining the blocks, in the same order as in the data.
In all cases (y = NULL or not), the functions return a list of vector(s). Each vector contains the indexes of the observations defining the segment.
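One way to picture block sampling is to draw among the distinct values of y and then map each drawn block back to the rows that carry it. The base-R sketch below follows that idea under this assumption; kfold_blocks_sketch is a hypothetical helper, not the rchemo code.

```r
# Hypothetical helper (not part of rchemo): K-fold CV over blocks defined by y.
kfold_blocks_sketch <- function(y, K) {
  blocks <- unique(y)                                  # distinct block labels
  shuffled <- blocks[sample(length(blocks))]           # random order of blocks
  fold <- rep(seq_len(K), length.out = length(blocks))
  block_folds <- split(shuffled, fold)                 # blocks assigned to each fold
  # Map the blocks of each fold back to observation indexes.
  segm <- lapply(block_folds, function(b) which(y %in% b))
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

set.seed(1)
y <- rep(LETTERS[1:5], 2)         # 5 blocks, each observed twice
segm <- kfold_blocks_sketch(y, K = 3)
lapply(segm, function(i) y[i])    # both rows of a block fall in the same segment
```

Because whole blocks are assigned to a segment, repeated measurements of the same block never end up split between the training and validation sides.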
Usage
segmkf(n, y = NULL, K = 5,
type = c("random", "consecutive", "interleaved"), nrep = 1)
segmts(n, y = NULL, m, nrep)
Arguments
n |
The total number of row observations in the dataset. If |
y |
A vector ( |
K |
For |
type |
For |
m |
For |
nrep |
The number of replications of the repeated CV. Default to |
Value
The segments (lists of indexes).
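To show how such a list of indexes might be consumed, the sketch below loops over a hand-built list that mimics the rep1/segm1 naming used in the Examples and computes a cross-validated RMSE for a toy linear model. The list, the toy data, and the use of lm are illustrative assumptions, not output or functionality of rchemo.

```r
# Hand-built list mimicking the shape used in the Examples (z$rep1$segm1, ...).
# That real output matches this shape in every detail is an assumption.
segm <- list(
  rep1 = list(segm1 = c(1, 4, 7, 10), segm2 = c(2, 5, 8), segm3 = c(3, 6, 9))
)

# Toy data: y is roughly linear in x.
set.seed(1)
dat <- data.frame(x = 1:10)
dat$y <- 2 * dat$x + rnorm(10, sd = 0.1)

# For each repetition and segment: fit on the other rows, predict the held-out rows.
rmsecv <- sapply(segm, function(rep_i) {
  sq_err <- unlist(lapply(rep_i, function(test) {
    fit <- lm(y ~ x, data = dat[-test, ])
    (dat$y[test] - predict(fit, newdata = dat[test, ]))^2
  }))
  sqrt(mean(sq_err))
})
rmsecv  # one cross-validated RMSE per repetition
```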
Examples
Kfold <- segmkf(n = 10, K = 3)
interleavedKfold <- segmkf(n = 10, K = 3, type = "interleaved")
LeaveOneOut <- segmkf(n = 10, K = 10)
RepeatedKfold <- segmkf(n = 10, K = 3, nrep = 2)
repeatedTestSet <- segmts(n = 10, m = 3, nrep = 5)
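n <- 10
# Five blocks (A to E), each observed twice
y <- rep(LETTERS[1:5], 2)
y
# K-fold CV over blocks rather than individual observations
Kfold_withBlocks <- segmkf(n = n, y = y, K = 3, nrep = 1)
z <- Kfold_withBlocks
z
# Block labels of the observations in each segment
y[z$rep1$segm1]
y[z$rep1$segm2]
y[z$rep1$segm3]
# Test-set CV over blocks
TestSet_withBlocks <- segmts(n = n, y = y, m = 3, nrep = 1)
z <- TestSet_withBlocks
z
y[z$rep1$segm1]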