R: Samples along the class labels

Class.sample {shipunov}

R Documentation

Samples along the class labels

Description

Stratified sampling: sample separately within each class

Usage

Class.sample(lbls, nsam=NULL, prop=NULL, uniform=FALSE)

Arguments

`lbls`	Vector of labels convertable into factor
`nsam`	Number of samples to take from each class
`prop`	Proportion of samples to take from each class
`uniform`	Uniform instead of random?

Details

'Class.sample()' splits labels into groups in accordance with classes, and samples each of them separately. If 'prop' is specified, then number of samples in each class calculated separately from this value. Of both 'nsam' and 'prop' specified, preference is given to 'prop'.

Uniform method samples each n-th member of the class to reach the desired sample size.

If sample size is bigger then class size, the whole class will be sampled.

Class.sample() uses the ave() internally, and can be easily extended, for example, to make k-fold sampling, like:

ave(seq_along(lbls), lbls, FUN=function(.x) cut(sample(length(.x)), breaks=k, labels=FALSE))

Value

Logical vector of length equal to 'vector'

Author(s)

Alexey Shipunov

Examples


(sam <- Class.sample(iris$Species, nsam=5))
iris.trn <- iris[sam, ]
iris.tst <- iris[!sam, ]

(sample1 <- Class.sample(iris$Species, nsam=10))
table(iris$Species, sample1)
(sample2 <- Class.sample(iris$Species, prop=0.2))
table(iris$Species, sample2)
(sample3 <- Class.sample(iris$Species, nsam=10, uniform=TRUE))
table(iris$Species, sample3)
(sample4 <- Class.sample(iris$Species, prop=0.2, uniform=TRUE))
table(iris$Species, sample4)

[Package shipunov version 1.17.1 Index]