R: Unstratified cross validation

unstratified.cv.data {HEMDAG}

R Documentation

Unstratified cross validation

Description

This function splits a dataset into k folds in an unstratified way, i.e. the folds are not guaranteed to contain the same proportion of positive and negative examples. It is used to perform k-fold cross-validation experiments in a hierarchical correction context, where splitting the dataset in a stratified way is not needed.

Usage

unstratified.cv.data(S, kk = 5, seed = NULL)

Arguments

`S`	matrix of the flat scores. It must be a named matrix, where rows are examples (e.g. genes) and columns are classes/terms (e.g. GO terms).
`kk`	number of folds into which the dataset is split (def. `kk=5`).
`seed`	seed for the random generator. If `NULL` (def.), no initialization is performed.

Value

A list with k=kk components (folds). Each component of the list is a character vector containing the indices of the examples, i.e. the indices of the rows of the matrix S.

Examples

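# Load the example dataset, which provides the flat scores matrix S used below
data(scores)
# Split the rows of S into 5 unstratified folds; a fixed seed makes the split reproducible
foldIndex <- unstratified.cv.data(S, kk=5, seed=23)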
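
The following base R sketch illustrates the idea of an unstratified split and how the returned folds might be used to build training and test sets. It is not the HEMDAG implementation: `toy_unstratified_folds`, `S_toy` and the `gene`/`term` names are hypothetical, introduced only for illustration, and the sketch assumes that each fold holds row names of the scores matrix.

```r
# Hypothetical helper, NOT part of HEMDAG: a minimal unstratified k-fold split.
toy_unstratified_folds <- function(S, kk = 5, seed = NULL) {
    # Initialize the random generator only when a seed is supplied
    if (!is.null(seed)) set.seed(seed)
    # Shuffle the examples without looking at their labels (hence "unstratified")
    shuffled <- sample(rownames(S))
    # Assign the shuffled examples to kk folds in round-robin order
    fold_id <- factor(rep(seq_len(kk), length.out = length(shuffled)),
                      levels = seq_len(kk))
    folds <- split(shuffled, fold_id)
    names(folds) <- paste0("fold", seq_len(kk))
    folds
}

# Toy named score matrix: 10 examples (rows) by 3 classes (columns)
S_toy <- matrix(runif(10 * 3), nrow = 10,
                dimnames = list(paste0("gene", 1:10), paste0("term", 1:3)))
folds <- toy_unstratified_folds(S_toy, kk = 5, seed = 23)

# Use the first fold as the test set and all remaining rows as the training set
test_rows  <- folds[[1]]
train_rows <- setdiff(rownames(S_toy), test_rows)
S_test  <- S_toy[test_rows, , drop = FALSE]
S_train <- S_toy[train_rows, , drop = FALSE]
```

Repeating the last four lines for each fold in turn gives a complete k-fold cross-validation loop in which every example is used for testing exactly once.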

[Package HEMDAG version 2.7.4 Index]