data_disc {sbfc} | R Documentation
Data set discretization and formatting
Description
Removes rows containing missing data, and discretizes the data set using Minimum Description Length Partitioning (MDLP).
Usage
data_disc(data, n_train = NULL, missing = "?")
Arguments
data
Data frame, where the last column must be the class variable.
n_train
Number of data frame rows to use as the training set; the remaining rows are used as the test set. If NULL (the default), all rows are used for training and there is no test set.
missing
Label that denotes missing values in your data frame (default: "?").
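To illustrate what the missing label means, the following base R sketch drops every row in which at least one cell equals the label. It is only an approximation of the row-removal step described above, not the package's own code; the data frame toy and the names missing_label, has_missing and toy_complete are made up for this illustration. Whether data_disc also drops rows containing NA is not stated on this page.

```r
# Hypothetical toy data: the last column is the class variable,
# and "?" marks a missing value.
toy <- data.frame(
  x1 = c("1.2", "?", "3.4", "0.7"),
  x2 = c("a", "b", "?", "a"),
  class = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

missing_label <- "?"

# Comparing the data frame with the label gives a logical matrix;
# a row is flagged if any of its cells equals the label.
has_missing <- apply(toy == missing_label, 1, any)

# Keep only the rows without a missing value.
toy_complete <- toy[!has_missing, , drop = FALSE]
print(toy_complete)
```

If your data marks missing values differently (for example with "NA" as a string), pass that label through the missing argument instead.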
Value
A discretized data set:
TrainX
Matrix containing the training data.
TrainY
Vector containing the class labels for the training data.
TestX
Matrix containing the test data (optional).
TestY
Vector containing the class labels for the test data (optional).
Examples
data(iris)
iris_disc = data_disc(iris)
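A further sketch shows the n_train argument. It assumes that the sbfc package is installed and that the returned object is a list whose elements carry the names given in the Value section; neither is verified here. The rows are shuffled first because iris is sorted by species and this page does not say which rows are chosen for training, so an unshuffled split might leave a class out of the training set. The names iris_shuffled and iris_split are illustrative.

```r
# Shuffle the rows so that a split by row count does not follow
# the species ordering of the built-in iris data.
set.seed(1)
iris_shuffled <- iris[sample(nrow(iris)), ]

# Run only if sbfc is available (assumption: it is installed).
if (requireNamespace("sbfc", quietly = TRUE)) {
  # Use 100 rows for training; the remaining rows form the test set.
  iris_split <- sbfc::data_disc(iris_shuffled, n_train = 100)

  # Inspect the components named in the Value section.
  str(iris_split$TrainX)
  table(iris_split$TrainY)
  str(iris_split$TestX)
  table(iris_split$TestY)
}
```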
[Package sbfc version 1.0.3]