segmkf {rchemo}    R Documentation

Segments for cross-validation

Description

Build segments of observations for K-fold or "test-set" cross-validation (CV).

The CV can optionally be repeated, with a new random sampling at each repetition. For each repetition:

- K-fold CV - Function segmkf returns the K segments.

- Test-set CV - Function segmts returns a segment (of a given length) randomly sampled in the dataset.
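To make the two schemes concrete, here is a minimal base R sketch of what a single repetition could look like. The helpers kfold_sketch and testset_sketch are hypothetical names introduced for illustration only; they are not the rchemo implementation, and details such as how a remainder is spread over the folds are assumptions.

```r
# Hypothetical helpers in base R, for illustration only
# (not the rchemo implementation).
kfold_sketch <- function(n, K) {
  idx <- sample(n)                          # random permutation of 1:n
  fold <- rep(seq_len(K), length.out = n)   # fold label for each shuffled index
  segm <- split(idx, fold)                  # K disjoint vectors of indexes
  names(segm) <- paste0("segm", seq_len(K))
  lapply(segm, sort)
}

testset_sketch <- function(n, m) {
  list(segm1 = sort(sample(n, m)))          # one segment of m indexes
}

set.seed(1)
kf <- kfold_sketch(n = 10, K = 3)   # 3 segments that together cover 1:10
ts <- testset_sketch(n = 10, m = 3) # 1 segment of 3 indexes
kf
ts
```

In K-fold CV every observation falls in exactly one segment, whereas in test-set CV only the m sampled observations are held out.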

CV of blocks

Argument y allows sampling blocks of observations instead of individual observations. This can be required when the data contain repetitions (e.g. replicate measurements of the same sample). In such a situation, CV should account for the repetition level: if it does not, observations of the same block can end up on both sides of the split, and the error rates are in general strongly underestimated. To implement such a CV, y must be a vector of length n defining the blocks, in the same order as the observations in the data.

In all cases (y = NULL or not), the functions return a list of vector(s). Each vector contains the indexes of the observations defining a segment.
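The block-wise sampling described above can be sketched in base R as follows. The function block_kfold_sketch is a hypothetical helper written for illustration, not the rchemo implementation: it assigns block labels (rather than observations) to folds, then maps each fold back to observation indexes.

```r
# Hypothetical base R sketch of block-wise K-fold sampling
# (not the rchemo implementation).
block_kfold_sketch <- function(y, K) {
  blocks <- unique(y)                                   # distinct block labels
  fold <- rep(seq_len(K), length.out = length(blocks))  # one fold label per block
  fold <- sample(fold)                                  # random assignment of blocks to folds
  # For each fold, return the indexes of all observations in its blocks
  segm <- lapply(seq_len(K), function(k) which(y %in% blocks[fold == k]))
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

set.seed(1)
y <- rep(LETTERS[1:5], 2)     # 5 blocks, each observed twice
z <- block_kfold_sketch(y, K = 3)
z
lapply(z, function(i) y[i])   # block labels found in each segment
```

Because whole blocks are assigned to folds, the two observations of a given block always land in the same segment.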

Usage


segmkf(n, y = NULL, K = 5, 
    type = c("random", "consecutive", "interleaved"), nrep = 1) 

segmts(n, y = NULL, m, nrep) 

Arguments

n

The total number of observations (rows) in the dataset. If y = NULL, the CV is implemented on 1:n. If y is not NULL, blocks of observations (defined in y) are sampled instead of observations (but indexes of observations are returned).

y

A vector of length n defining the blocks. Defaults to NULL.

K

For segmkf. The number of folds (i.e. segments) in the K-fold CV.

type

For segmkf. The type of K-fold CV. Possible values are "random" (default), "consecutive" and "interleaved".

m

For segmts. If y = NULL, the number of observations in the segment. If not, the number of blocks in the segment.

nrep

The number of replications of the repeated CV. Defaults to nrep = 1 in segmkf; the usage above shows no default for segmts.
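The non-random values of argument type can be pictured with a short base R sketch. It assumes the usual meaning of the two terms ("consecutive" as contiguous runs of indexes, "interleaved" as indexes dealt out in turn); it is not the rchemo implementation, and the way a remainder is spread over the segments is an assumption.

```r
# Hypothetical base R sketch, assuming the usual meaning of the two terms
# (not the rchemo implementation).
n <- 10
K <- 3

# "consecutive": each segment is a contiguous run of indexes
consecutive <- split(seq_len(n), sort(rep(seq_len(K), length.out = n)))

# "interleaved": indexes are dealt out in turn to segments 1, 2, ..., K
interleaved <- split(seq_len(n), rep(seq_len(K), length.out = n))

consecutive   # segments 1:4, 5:7, 8:10
interleaved   # segments c(1, 4, 7, 10), c(2, 5, 8), c(3, 6, 9)
```

With these two types the segments are fully determined by n and K, so repeating the CV would not change them; only type = "random" involves sampling.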

Value

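The segments, as a list with one element per replication (rep1, rep2, ...). Each element is itself a list of segments (segm1, segm2, ...), and each segment is a vector of observation indexes.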

Examples


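# K-fold CV with random segments (default type)
Kfold <- segmkf(n = 10, K = 3)

# K-fold CV with interleaved segments
interleavedKfold <- segmkf(n = 10, K = 3, type = "interleaved")

# Leave-one-out CV: as many folds as observations
LeaveOneOut <- segmkf(n = 10, K = 10)

# K-fold CV repeated twice
RepeatedKfold <- segmkf(n = 10, K = 3, nrep = 2)

# Test-set CV: a segment of 3 observations, repeated 5 times
repeatedTestSet <- segmts(n = 10, m = 3, nrep = 5)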

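# Blocks: 5 blocks (A to E), each containing 2 observations
n <- 10
y <- rep(LETTERS[1:5], 2)
y

# K-fold CV on blocks
Kfold_withBlocks <- segmkf(n = n, y = y, K = 3, nrep = 1)
z <- Kfold_withBlocks
z
# Block labels of the observations in each segment
y[z$rep1$segm1]
y[z$rep1$segm2]
y[z$rep1$segm3]

# Test-set CV on blocks: the segment contains m = 3 blocks
TestSet_withBlocks <- segmts(n = n, y = y, m = 3, nrep = 1)
z <- TestSet_withBlocks
z
y[z$rep1$segm1]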


[Package rchemo version 0.1-1 Index]