| split-methods {rebmix} | R Documentation | 
Splits Dataset into Train and Test Datasets
Description
Returns (invisibly) the object containing train and test observations \bm{y}_{1}, \ldots, \bm{y}_{n} as well as true class membership \bm{\Omega}_{g} for the test dataset.
Usage
## S4 method for signature 'numeric'
split(p = 0.75, Dataset = data.frame(), class = numeric(), ...)
## S4 method for signature 'list'
split(p = list(), Dataset = data.frame(), class = numeric(), ...)
## ... and for other signatures
Arguments
| p | see Methods section below. | 
| Dataset | a data frame containing the dataset to be split. | 
| class | a column number in Dataset containing the class membership information. | 
| ... | further arguments to  | 
Value
Returns an object of class RCLS.chunk.
Methods
- signature(p = "numeric")
- a number specifying the fraction of observations for training, 0.0 \leq p \leq 1.0. The default value is 0.75.
- signature(p = "list")
- a list composed of column number p$type in Dataset containing the type membership information followed by the corresponding train p$train and test p$test values. The default value is list().
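The two forms of p described above can be illustrated with a minimal base R sketch. This is not the rebmix implementation and does not build an RCLS.chunk object; the helper names split_by_fraction and split_by_type are hypothetical, and rounding the training size down with floor() is an assumption made only for this illustration.

```r
# Hypothetical helper (not part of rebmix): random split by training fraction.
split_by_fraction <- function(p, Dataset) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 0, p <= 1)
  n <- nrow(Dataset)
  # Assumption: the training size is rounded down.
  idx <- sample.int(n, size = floor(p * n))
  # setdiff() keeps the test rows correct even when idx is empty (p = 0).
  list(train = Dataset[idx, , drop = FALSE],
       test = Dataset[setdiff(seq_len(n), idx), , drop = FALSE])
}

# Hypothetical helper (not part of rebmix): split by a type column.
# p is a list with a column number p$type and the labels p$train and p$test.
split_by_type <- function(p, Dataset) {
  type <- as.character(Dataset[[p$type]])
  list(train = Dataset[type %in% p$train, , drop = FALSE],
       test = Dataset[type %in% p$test, , drop = FALSE])
}

set.seed(5)
parts <- split_by_fraction(0.75, iris)

# Small made-up data frame with a type column in position 2.
toy <- data.frame(y = 1:6,
                  type = c("train", "test", "train", "train", "test", "train"))
typed <- split_by_type(list(type = 2, train = "train", test = "test"), toy)
```

With a numeric p the rows are assigned at random, so the result depends on the seed. With a list p the assignment is fully determined by the labels already stored in the type column.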
Author(s)
Marko Nagode
Examples
## Not run: 
data(iris)
# Split dataset into train (75%) and test (25%) subsets.
set.seed(5)
Iris <- split(p = 0.75, Dataset = iris, class = 5)
Iris
# Generate simulated dataset.
N <- 1000
class <- c(rep("A", 0.4 * N), rep("B", 0.2 * N),
  rep("C", 0.1 * N), rep("D", 0.05 * N), rep("E", 0.25 * N))
type <- c(rep("train", 0.75 * N), rep("test", 0.25 * N))
n <- 300
Dataset <- data.frame(1:n, sample(class, n))
colnames(Dataset) <- c("y", "class")
# Split dataset into train (60%) and test (40%) subsets.
simulated <- split(p = 0.6, Dataset = Dataset, class = 2)
simulated
# Generate simulated dataset with an additional type column.
Dataset <- data.frame(1:n, sample(class, n), sample(type, n))
colnames(Dataset) <- c("y", "class", "type")
# Split dataset into train and test subsets.
simulated <- split(p = list(type = 3, train = "train",
  test = "test"), Dataset = Dataset, class = 2)
simulated
## End(Not run)