gensvm.train.test.split {gensvm} | R Documentation |
Create a train/test split of a dataset
Description
It is often desirable to split a dataset into a training sample and a test sample. This function is included in GenSVM to make that easy. It is inspired by a similar function in scikit-learn.
Usage
gensvm.train.test.split(
x,
y = NULL,
train.size = NULL,
test.size = NULL,
shuffle = TRUE,
random.state = NULL,
return.idx = FALSE
)
Arguments
x |
array to split |
y |
another array to split (typically this is a vector) |
train.size |
size of the training dataset. This can be provided as
float or as int. If it's a float, it should be between 0.0 and 1.0 and
represents the fraction of the dataset that should be placed in the training
dataset. If it's an int, it represents the exact number of samples in the
training dataset. If it is NULL, the complement of test.size will be used. |
test.size |
size of the test dataset. As with train.size, either a float or an int can be supplied. If it is NULL, the complement of train.size will be used. If both train.size and test.size are NULL, a default test.size of 0.25 will be used. |
shuffle |
whether or not to shuffle the rows |
random.state |
seed for the random number generator (int) |
return.idx |
whether or not to return the indices in the output |
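The interplay between train.size and test.size described above can be sketched in base R. This is an illustration only and not the GenSVM implementation: the helper name resolve.split.sizes is hypothetical, and both the use of round() for fractional sizes and the treatment of values strictly between 0 and 1 as fractions are assumptions.

```r
# Hypothetical helper, not part of GenSVM: turn train.size / test.size
# into row counts for a dataset with n rows.
resolve.split.sizes <- function(n, train.size = NULL, test.size = NULL) {
  # A value strictly between 0 and 1 is read as a fraction of n,
  # anything else as an exact count. Rounding is an assumption.
  to.count <- function(size) {
    if (size > 0 && size < 1) round(size * n) else as.integer(size)
  }
  # Documented default: test.size = 0.25 when both sizes are NULL.
  if (is.null(train.size) && is.null(test.size)) test.size <- 0.25
  n.train <- if (is.null(train.size)) NULL else to.count(train.size)
  n.test <- if (is.null(test.size)) NULL else to.count(test.size)
  # A size left as NULL becomes the complement of the other one.
  if (is.null(n.train)) n.train <- n - n.test
  if (is.null(n.test)) n.test <- n - n.train
  stopifnot(n.train >= 0, n.test >= 0, n.train + n.test <= n)
  list(n.train = as.integer(n.train), n.test = as.integer(n.test))
}

resolve.split.sizes(100)                    # default test.size of 0.25
resolve.split.sizes(100, train.size = 0.8)  # fraction for the training set
resolve.split.sizes(100, test.size = 30L)   # exact number of test rows
```

Supplying only one of the two sizes is usually enough, since the other is derived as its complement.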
Value
A list with x.train and x.test splits of the x array provided. If y is provided, also y.train and y.test. If return.idx is TRUE, also idx.train and idx.test.
Author(s)
Gerrit J.J. van den Burg, Patrick J.F. Groenen
Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com>
References
Van den Burg, G.J.J. and Groenen, P.J.F. (2016). GenSVM: A Generalized Multiclass Support Vector Machine, Journal of Machine Learning Research, 17(225):1–42. URL https://jmlr.org/papers/v17/14-526.html.
Examples
x <- iris[, -5]
y <- iris[, 5]
# using the default values
split <- gensvm.train.test.split(x, y)
# using the split in a GenSVM model
fit <- gensvm(split$x.train, split$y.train)
gensvm.accuracy(split$y.test, predict(fit, split$x.test))
# using attach makes the results directly available
attach(gensvm.train.test.split(x, y))
fit <- gensvm(x.train, y.train)
gensvm.accuracy(y.test, predict(fit, x.test))
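The remaining arguments can be combined in the same way. The following sketch uses only the arguments documented above; the variable names s and s2 and the seed value 123 are arbitrary choices, and the comments describe what is expected from the documentation rather than verified output. The call is guarded so that it only runs when the gensvm package is installed.

```r
# Only runs when the gensvm package is installed.
if (requireNamespace("gensvm", quietly = TRUE)) {
  x <- iris[, -5]
  y <- iris[, 5]

  # 80% of the rows for training, a fixed seed, and the row indices returned
  s <- gensvm::gensvm.train.test.split(x, y, train.size = 0.8,
                                       random.state = 123,
                                       return.idx = TRUE)

  # elements of the returned list (x.train, x.test, y.train, y.test,
  # idx.train and idx.test are expected here)
  names(s)

  # number of rows in each part of the split
  nrow(s$x.train)
  nrow(s$x.test)

  # repeating the call with the same random.state is expected to
  # reproduce the same split
  s2 <- gensvm::gensvm.train.test.split(x, y, train.size = 0.8,
                                        random.state = 123,
                                        return.idx = TRUE)
  identical(s$idx.train, s2$idx.train)
}
```

Keeping the result in a list, rather than using attach, avoids masking existing objects that happen to be named x.train or y.train.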