dataMining {MoTBFs} | R Documentation |
Data pre-processing utilities
Description
Collection of functions for discretizing and standardizing datasets, converting factors to characters, and other useful pre-processing tasks.
Usage
whichDiscrete(dataset, discreteVariables)
discreteVariables_as.character(dataset, discreteVariables)
standardizeDataset(dataset)
discretizeVariablesEWdis(dataset, numIntervals, factor = FALSE, binary = FALSE)
discreteVariablesStates(namevariables, discreteData)
nstates(DiscreteVariablesStates)
quantileIntervals(X, numIntervals)
scaleData(dataset, scale)
Arguments
dataset |
A dataset of class |
discreteVariables |
A |
numIntervals |
Number of bins used to discretize the continuous variables. |
factor |
A boolean value indicating if the variables should be considered as
|
binary |
By default it is set to |
namevariables |
An array with the names of the variables. |
discreteData |
A discretized dataset of class |
DiscreteVariablesStates |
The output of the function |
X |
A |
scale |
A |
Details
whichDiscrete()
selects the position of the discrete variables.
discreteVariables_as.character()
transforms the values of the discrete variables into character values.
standardizeDataset()
standardizes all the variables in a dataset.
discretizeVariablesEWdis()
discretizes the continuous variables in a dataset using
equal width binning.
discreteVariablesStates()
extracts the states of the qualitative variables.
nstates()
computes the number of different values of the discrete variables.
quantileIntervals()
gets the quantiles of a variable, taking into account the number of intervals
into which its domain is split.
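To illustrate what equal width binning means in general, here is a minimal base-R sketch. It does not call discretizeVariablesEWdis() and is not claimed to reproduce its internals or its output format; the variable names x, breaks and bins are chosen only for this illustration.

```r
# Base-R illustration of equal width binning (not the MoTBFs implementation).
set.seed(1)
x <- rnorm(100)        # a continuous variable
numIntervals <- 3      # number of bins

# numIntervals + 1 equally spaced breakpoints spanning the range of x
breaks <- seq(min(x), max(x), length.out = numIntervals + 1)

# Assign each value to a bin; include.lowest = TRUE keeps min(x) in the first bin
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

table(bins)    # number of observations per bin (generally unequal)
diff(breaks)   # bin widths (all the same)
```

With equal width binning the intervals have the same length, so the number of observations per bin depends on the shape of the distribution.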
Examples
library(MoTBFs)
## Dataset: 2 continuous variables, 1 discrete variable.
data <- data.frame(X = rnorm(100), Y = rexp(100, 1/2), Z = as.factor(rep(c("s", "a"), 50)))
disVar <- "Z" ## Discrete variable
class(data[,disVar]) ## factor
data <- discreteVariables_as.character(dataset = data, discreteVariables = disVar)
class(data[,disVar]) ## character
whichDiscrete(dataset = data, discreteVariables = "Z")
standData <- standardizeDataset(dataset = data)
disData <- discretizeVariablesEWdis(dataset = data, numIntervals = 3)
l <- discreteVariablesStates(namevariables = names(data), discreteData = disData)
nstates(DiscreteVariablesStates = l)
## Continuous variables
quantileIntervals(X = data[,1], numIntervals = 4)
quantileIntervals(X = data[,2], numIntervals = 10)
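For comparison with the quantileIntervals() calls above, the following base-R sketch shows one common way to obtain quantile-based cut points for a given number of intervals. It relies only on stats::quantile() and is an assumption about the general idea, not a description of how quantileIntervals() is implemented or of what it returns; the names probs, cutPoints and counts are illustrative.

```r
# Base-R illustration of quantile-based cut points (not the MoTBFs implementation).
set.seed(1)
x <- rexp(100, rate = 1/2)   # a continuous variable
numIntervals <- 4            # number of intervals

# Probabilities 0, 1/4, 2/4, 3/4, 1 yield numIntervals + 1 cut points
probs <- seq(0, 1, length.out = numIntervals + 1)
cutPoints <- quantile(x, probs = probs, names = FALSE)

# Count the observations falling into each quantile-based interval
counts <- table(cut(x, breaks = cutPoints, include.lowest = TRUE))
counts
```

Unlike equal width bins, quantile-based intervals vary in length but hold roughly the same number of observations each.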