splitDataset {gecko} | R Documentation
Split a dataset for model training
Description
Split a dataset into training and test sets while keeping each class proportionally represented.
Usage
splitDataset(data, proportion)
Arguments
data |
data frame. Contains classification data; the last column must contain the class labels. |
proportion |
numeric. A value between 0 and 1 determining how the dataset is divided between training and testing. |
Value
list. The first element is the training data, the second element is the test data.
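To illustrate what a class-preserving split looks like, here is a minimal base R sketch. It is not the gecko implementation: the helper name `stratified_split`, the use of `round()` for per-class counts, and the element names `train` and `test` are all assumptions made for illustration. It samples the same fraction of rows from every class, taking the labels from the last column.

```r
# Hypothetical helper (NOT gecko's code): stratified split in base R.
stratified_split <- function(data, proportion) {
  stopifnot(is.data.frame(data), proportion > 0, proportion < 1)
  labels <- data[[ncol(data)]]  # assumption: last column holds the labels

  # Row indices grouped by class label
  idx_by_class <- split(seq_len(nrow(data)), labels)

  # From each class, draw round(proportion * class size) rows for training.
  # Indexing via sample.int() avoids sample()'s surprise on length-1 input.
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = round(proportion * length(idx)))]
  }), use.names = FALSE)

  # Every row not drawn for training goes to the test set
  test_idx <- setdiff(seq_len(nrow(data)), train_idx)

  list(train = data[train_idx, , drop = FALSE],
       test  = data[test_idx, , drop = FALSE])
}

set.seed(1)  # make the random draw reproducible
my_data <- data.frame(X = runif(60), Y = runif(60), Z = runif(60),
                      Label = c(rep("A", 20), rep("B", 30), rep("C", 10)))
parts <- stratified_split(my_data, 0.8)
table(parts$train$Label)  # A: 16, B: 24, C: 8
table(parts$test$Label)   # A: 4,  B: 6,  C: 2
```

Under these assumptions each class contributes 80% of its rows to the training set, so the 20/30/10 class balance of the input carries over to both parts. The rounding rule used by splitDataset itself may differ.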
Examples
# Binary label case
my_data <- data.frame(X = runif(20), Y = runif(20), Z = runif(20),
                      Label = c(rep("presence", 10), rep("outlier", 10)))
splitDataset(my_data, 0.8)
# Multi label case
my_data <- data.frame(X = runif(60), Y = runif(60), Z = runif(60),
                      Label = c(rep("A", 20), rep("B", 30), rep("C", 10)))
splitDataset(my_data, 0.8)
[Package gecko version 1.0.0 Index]