discretizeDF.supervised {arulesCBA}R Documentation

Supervised Methods to Convert Continuous Variables into Categorical Variables

Description

This function implements several supervised methods to convert continuous variables into a categorical variables (factor) suitable for association rule mining and building associative classifiers. A whole data.frame is discretized (i.e., all numeric columns are discretized).

Usage

discretizeDF.supervised(formula, data, method = "mdlp", dig.lab = 3, ...)

Arguments

formula

a formula object to specify the class variable for supervised discretization and the predictors to be discretized in the form class ~ . or class ~ predictor1 + predictor2.

data

a data.frame containing continuous variables to be discretized

method

discretization method. Available are: “"mdlp"⁠, ⁠"caim"', '"cacc"', '"ameva"', '"chi2"', '"chimerge"', '"extendedchi2"', and '"modchi2"'.

dig.lab

integer; number of digits used to create labels.

...

Additional parameters are passed on to the implementation of the chosen discretization method.

Details

discretizeDF.supervised() only implements supervised discretization. See arules::discretizeDF() in package arules for unsupervised discretization.

Value

discretizeDF() returns a discretized data.frame. Discretized columns have an attribute "discretized:breaks" indicating the used breaks or and "discretized:method" giving the used method.

Author(s)

Michael Hahsler

See Also

Unsupervised discretization from arules: arules::discretize(), arules::discretizeDF().

Details about the available supervised discretization methods from discretization: discretization::mdlp, discretization::caim, discretization::cacc, discretization::ameva, discretization::chi2, discretization::chiM, discretization::extendChi2, discretization::modChi2.

Other preparation: CBA_ruleset(), mineCARs(), prepareTransactions(), transactions2DF()

Examples

data("iris")
summary(iris)

# supervised discretization using Species
iris.disc <- discretizeDF.supervised(Species ~ ., iris)
summary(iris.disc)

attributes(iris.disc$Sepal.Length)

# discretize the first few instances of iris using the same breaks as iris.disc
discretizeDF(head(iris), methods = iris.disc)

# only discretize predictors Sepal.Length and Petal.Length
iris.disc2 <- discretizeDF.supervised(Species ~ Sepal.Length + Petal.Length, iris)
head(iris.disc2)

[Package arulesCBA version 1.2.7 Index]