R: Randomly split dataset in multiple parts

split_random {RSSL}

R Documentation

Randomly split dataset in multiple parts

Description

Randomly splits a data.frame into multiple parts. The first column of the data.frame should contain the labels; otherwise, supply a formula that indicates the outputs.

Usage

split_random(df, formula = NULL, splits = c(0.5, 0.5), min_class = 0)

Arguments

`df`	data.frame; Data frame of interest
`formula`	formula; Formula to indicate the outputs
`splits`	numeric; Probability of assigning an object to each part; automatically normalized. The vector should contain more than one element
`min_class`	integer; Minimum number of objects per class in each part

Value

list of data.frames
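
To illustrate what "automatically normalized" means for `splits` and what a list of data.frames looks like, here is a minimal base-R sketch. It is not the RSSL implementation, which may differ in detail: the toy data frame and the variable names `toy`, `probs`, `part` and `parts` are hypothetical, and the `min_class` constraint is not handled.

```r
# Illustrative base-R sketch; NOT the RSSL implementation.
set.seed(1)

# Hypothetical toy data with the labels in the first column
toy <- data.frame(Class = factor(rep(c("a", "b"), each = 50)),
                  x = rnorm(100))

# Unnormalized weights are rescaled so that they sum to one
splits <- c(5, 3, 2)
probs <- splits / sum(splits)

# Draw a part index for every row using the normalized probabilities
part <- sample(seq_along(probs), nrow(toy), replace = TRUE, prob = probs)

# split() returns a list of data.frames, one per part
parts <- split(toy, factor(part, levels = seq_along(probs)))

# Number of rows that ended up in each part
sapply(parts, nrow)
```

Because each row is assigned independently, the part sizes only approximate the requested proportions.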

Examples

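library(RSSL)
library(dplyr)

# Generate a two-class Gaussian dataset (n = 200, d = 2)
df <- generate2ClassGaussian(200,d=2)
# Split into three parts, keeping at least one object per class in each part
dfs <- df %>% split_random(Class~.,splits=c(0.5,0.3,0.2),min_class=1)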
names(dfs) <- c("Train","Validation","Test")
lapply(dfs,summary)

[Package RSSL version 0.9.7 Index]
