R: Create a train/test split of a dataset

gensvm.train.test.split {gensvm}

R Documentation

Create a train/test split of a dataset

Description

It is often desirable to split a dataset into a training sample and a test sample. GenSVM includes this function to make that easy. The function is inspired by a similar function in Scikit-Learn.

Usage

gensvm.train.test.split(
  x,
  y = NULL,
  train.size = NULL,
  test.size = NULL,
  shuffle = TRUE,
  random.state = NULL,
  return.idx = FALSE
)

Arguments

`x`	array to split
`y`	another array to split (typically this is a vector)
`train.size`	size of the training dataset, given as a float or an int. A float must be between 0.0 and 1.0 and is the fraction of the dataset to place in the training dataset. An int is the exact number of samples in the training dataset. If NULL, the complement of `test.size` is used.
`test.size`	size of the test dataset. As with `train.size`, either a float or an int can be supplied. If NULL, the complement of `train.size` is used. If both `train.size` and `test.size` are NULL, a default `test.size` of 0.25 is used.
`shuffle`	whether to shuffle the rows before splitting
`random.state`	seed for the random number generator (int)
`return.idx`	whether to include the indices of the training and test samples in the output
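
The interaction between `train.size` and `test.size` can be illustrated with a small base-R sketch. The helper `resolve_sizes` is hypothetical and is not part of gensvm. The rounding of fractional sizes (`ceiling`) and the treatment of values below 1 as fractions are assumptions made for illustration, so the package itself may round differently.

```r
# Hypothetical helper, not part of gensvm: turns train.size / test.size into
# sample counts for a dataset with n rows, following the rules described above.
resolve_sizes <- function(n, train.size = NULL, test.size = NULL) {
  # Assumption: values strictly between 0 and 1 are fractions, anything else
  # is an exact count. Fractions are rounded up (also an assumption).
  to_count <- function(size) {
    if (size > 0 && size < 1) ceiling(size * n) else size
  }

  # Neither size given: fall back to the documented default test.size of 0.25.
  if (is.null(train.size) && is.null(test.size)) test.size <- 0.25

  if (is.null(train.size)) {
    # Only test.size known: the training set is its complement.
    n.test <- to_count(test.size)
    n.train <- n - n.test
  } else if (is.null(test.size)) {
    # Only train.size known: the test set is its complement.
    n.train <- to_count(train.size)
    n.test <- n - n.train
  } else {
    # Both given: each is resolved on its own.
    n.train <- to_count(train.size)
    n.test <- to_count(test.size)
  }
  list(n.train = n.train, n.test = n.test)
}

# Defaults on a 150-row dataset such as iris
sizes.default <- resolve_sizes(150)

# An exact training count; the test set takes the remaining rows
sizes.count <- resolve_sizes(150, train.size = 100)
```

Under these assumptions, the default call assigns 38 of the 150 rows to the test set and the remaining 112 to the training set.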

Value

A list with the `x.train` and `x.test` splits of the `x` array provided. If `y` is provided, the list also contains `y.train` and `y.test`. If `return.idx` is TRUE, it also contains `idx.train` and `idx.test`.
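
To make the shape of the returned list concrete, the following base-R sketch builds a list with the same element names. The function `sketch_split` and its `n.train` and `seed` parameters are hypothetical and are not the gensvm implementation. The assumption that the indices are row positions in `x` is also made only for illustration.

```r
# Hypothetical base-R sketch, not gensvm's code: returns a list with the
# element names described above.
sketch_split <- function(x, y = NULL, n.train, shuffle = TRUE, seed = NULL,
                         return.idx = FALSE) {
  if (!is.null(seed)) set.seed(seed)  # make the shuffle reproducible
  n <- nrow(x)

  # Row order: a random permutation when shuffling, the original order otherwise.
  idx <- if (shuffle) sample.int(n) else seq_len(n)
  idx.train <- idx[seq_len(n.train)]
  idx.test <- idx[n.train + seq_len(n - n.train)]

  out <- list(
    x.train = x[idx.train, , drop = FALSE],
    x.test = x[idx.test, , drop = FALSE]
  )
  # y is split with the same indices so rows and labels stay aligned.
  if (!is.null(y)) {
    out$y.train <- y[idx.train]
    out$y.test <- y[idx.test]
  }
  if (return.idx) {
    out$idx.train <- idx.train
    out$idx.test <- idx.test
  }
  out
}

res <- sketch_split(iris[, -5], iris[, 5], n.train = 112, seed = 1,
                    return.idx = TRUE)
names(res)
```

Returning the indices makes it possible to apply the same split to other objects aligned with `x`, such as a vector of observation weights.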

Author(s)

Gerrit J.J. van den Burg, Patrick J.F. Groenen
Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com>

References

Van den Burg, G.J.J. and Groenen, P.J.F. (2016). GenSVM: A Generalized Multiclass Support Vector Machine, Journal of Machine Learning Research, 17(225):1–42. URL https://jmlr.org/papers/v17/14-526.html.

Examples

x <- iris[, -5]
y <- iris[, 5]

# using the default values
split <- gensvm.train.test.split(x, y)

# using the split in a GenSVM model
fit <- gensvm(split$x.train, split$y.train)
gensvm.accuracy(split$y.test, predict(fit, split$x.test))

# attach() puts the elements of the returned list on the search path,
# so x.train, y.train, x.test and y.test can be used by name
attach(gensvm.train.test.split(x, y))
fit <- gensvm(x.train, y.train)
gensvm.accuracy(y.test, predict(fit, x.test))

[Package gensvm version 0.1.7 Index]