stratified.cross.validation {HEMDAG} R Documentation

Stratified cross validation

Description

Generate data for the stratified cross-validation.

Usage

stratified.cv.data.single.class(examples, positives, kk = 5, seed = NULL)

stratified.cv.data.over.classes(labels, examples, kk = 5, seed = NULL)

Arguments

examples

indices or names of the examples. Can be either a vector of integers or a vector of names.

positives

indices or names of the 'positive' examples. Can be either a vector of integers or a vector of names.

kk

number of folds (def. kk=5).

seed

seed of the random generator (def. seed=NULL). If set to NULL, no initialization is performed.

labels

labels matrix. Rows are genes and columns are classes. Let M denote the labels matrix: M[i,j]=1 means that gene i is annotated with class j; otherwise M[i,j]=0.
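As a toy illustration of this layout, the sketch below builds a small labels matrix in base R. The gene names (g1 to g4) and class names (A, B, C) are hypothetical and are not part of the HEMDAG example data.

```r
# Toy labels matrix: 4 hypothetical genes (rows) by 3 hypothetical classes (columns).
M <- matrix(c(1, 0, 0,
              1, 1, 0,
              0, 0, 1,
              0, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("g", 1:4), c("A", "B", "C")))

M["g2", "B"]                 # 1: gene g2 is annotated with class B
names(which(M[, "B"] == 1))  # "g2" "g4": the positive examples for class B
```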

Details

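Folds are stratified, i.e. the positive examples and the negative examples are each split as evenly as possible across the kk folds, so every fold contains approximately the same number of positives and approximately the same number of negatives as the others.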
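The idea of stratification can be sketched in base R as follows. This is a minimal illustration, not the HEMDAG implementation: the helper name stratified.folds.sketch, the round-robin assignment, and the names of the returned components are assumptions made for this example only.

```r
# Hypothetical helper (not part of HEMDAG): positives and negatives are
# shuffled and split separately, so each fold gets a near-equal share of both.
stratified.folds.sketch <- function(examples, positives, kk = 5, seed = NULL) {
    # initialize the random generator only when a seed is supplied
    if (!is.null(seed)) set.seed(seed)
    negatives <- setdiff(examples, positives)
    # shuffle each group by permuting its positions
    positives <- positives[sample.int(length(positives))]
    negatives <- negatives[sample.int(length(negatives))]
    # deal the shuffled items round-robin into kk folds
    deal <- function(v) {
        fold.id <- factor(rep_len(seq_len(kk), length(v)), levels = seq_len(kk))
        unname(split(v, fold.id))
    }
    list(positives = deal(positives), negatives = deal(negatives))
}

examples <- 1:100
positives <- 1:10
folds <- stratified.folds.sketch(examples, positives, kk = 5, seed = 23)
sapply(folds$positives, length)  # 2 positives in each of the 5 folds
sapply(folds$negatives, length)  # 18 negatives in each of the 5 folds
```

When the number of positives (or negatives) is not a multiple of kk, the fold sizes in this sketch differ by at most one.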

Value

stratified.cv.data.single.class returns a list with two components:

stratified.cv.data.over.classes returns a list with n components, where n is the number of classes (columns) of the labels matrix. Each component is in turn a list with kk elements, one per fold. Within each class, the folds contain approximately equal numbers of positive examples and approximately equal numbers of negative examples.

Examples

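# load the example labels matrix L
data(labels);
examples.index <- 1:nrow(L);
examples.name <- rownames(L);
# positive examples for the third class (column) of L
positives <- which(L[,3]==1);
# stratified folds for a single class, by index and by name
x <- stratified.cv.data.single.class(examples.index, positives, kk=5, seed=23);
y <- stratified.cv.data.single.class(examples.name, positives, kk=5, seed=23);
# stratified folds for every class of L, by index and by name
z <- stratified.cv.data.over.classes(L, examples.index, kk=5, seed=23);
k <- stratified.cv.data.over.classes(L, examples.name, kk=5, seed=23);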

[Package HEMDAG version 2.7.4 Index]