| data_disc {sbfc} | R Documentation |
Data set discretization and formatting
Description
Removes rows containing missing data, and discretizes the data set using Minimum Description Length Partitioning (MDLP).
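The row-removal step can be pictured with a small base R sketch. It is only an illustration of the behaviour described above, not the package's own code. The toy data frame and the variable names (toy, missing_label, has_missing, toy_complete) are made up for the example, and the MDLP discretization step is not shown.

```r
# Toy data frame; "?" marks a missing entry, and the last column is the class.
toy <- data.frame(
  x1    = c("1.2", "?", "3.4", "0.7"),
  x2    = c("a", "b", "?", "a"),
  class = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

missing_label <- "?"

# Flag every row in which at least one cell equals the missing label.
has_missing <- apply(toy == missing_label, 1, any)

# Keep only the complete rows (rows 2 and 3 are dropped here).
toy_complete <- toy[!has_missing, , drop = FALSE]
print(toy_complete)
```

Rows are dropped whole, so a single missing cell removes the entire observation.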
Usage
data_disc(data, n_train = NULL, missing = "?")
Arguments
| data | Data frame whose last column is the class variable. |
| n_train | Number of data frame rows to use as the training set; the remaining rows form the test set. If NULL, all rows are used for training and there is no test set (default=NULL). |
| missing | Label that denotes missing values in the data frame (default='?'). |
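The conventions behind data and n_train can be sketched in base R as follows. The helper split_rows is hypothetical and is not part of sbfc. It assumes that the training rows are the first n_train rows in their existing order, which this page does not state, and it performs no discretization.

```r
# Hypothetical helper (not part of sbfc): illustrates the argument conventions only.
# Assumption: the training set is the first n_train rows, in their existing order.
split_rows <- function(data, n_train = NULL) {
  n_col <- ncol(data)                           # the last column holds the class
  if (is.null(n_train)) n_train <- nrow(data)   # NULL means "train on everything"
  stopifnot(n_train >= 1, n_train <= nrow(data))

  train_idx <- seq_len(n_train)
  out <- list(
    train_x = data[train_idx, -n_col, drop = FALSE],
    train_y = data[train_idx, n_col]
  )
  # A test set exists only when some rows are left over.
  if (n_train < nrow(data)) {
    test_idx <- (n_train + 1):nrow(data)
    out$test_x <- data[test_idx, -n_col, drop = FALSE]
    out$test_y <- data[test_idx, n_col]
  }
  out
}

data(iris)
set.seed(42)
iris_shuffled <- iris[sample(nrow(iris)), ]  # iris is sorted by species

parts <- split_rows(iris_shuffled, n_train = 100)
dim(parts$train_x)  # 100 rows, 4 feature columns
dim(parts$test_x)   # 50 rows, 4 feature columns
```

If the split does follow row order, a data set sorted by class (as iris is) would be worth shuffling before a value of n_train is chosen, which is why the sketch shuffles first.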
Value
A discretized data set:
| TrainX | Matrix containing the training data. |
| TrainY | Vector containing the class labels for the training data. |
| TestX | Matrix containing the test data (optional). |
| TestY | Vector containing the class labels for the test data (optional). |
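A possible way to inspect these components is sketched below. It assumes that the returned object is a list whose elements can be reached with $, which this page does not confirm. The call is guarded so that the snippet does nothing when sbfc is not installed. The names iris_shuffled and iris_split are illustrative.

```r
iris_split <- NULL

if (requireNamespace("sbfc", quietly = TRUE)) {
  data(iris)
  set.seed(1)
  iris_shuffled <- iris[sample(nrow(iris)), ]  # mix the classes across rows

  # 100 rows for training; the remaining 50 rows form the test set.
  iris_split <- sbfc::data_disc(iris_shuffled, n_train = 100)

  # Assumption: the result is a list with the components named above.
  str(iris_split, max.level = 1)
  print(dim(iris_split$TrainX))
  print(table(iris_split$TrainY))
}
```

When n_train is left at NULL, the test components are described as optional, so code that reads TestX or TestY may want to check for their presence first.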
Examples
data(iris)
iris_disc = data_disc(iris)
Package sbfc version 1.0.3