discrNumeric {arc}    R Documentation

Discretize Numeric Columns in a Data Frame

Description

Discretizes the numeric predictor columns of a data frame using the supervised MDLP algorithm (Fayyad & Irani, 1993). Optionally, it also discretizes a numeric class attribute using an unsupervised algorithm (k-means). This R file contains fragments of code from the GPL-licensed R discretization package by HyunJi Kim.
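To give a feel for the supervised step, the following base R sketch performs a single MDLP split: it picks the cutpoint that minimizes the weighted class entropy and then checks the Fayyad and Irani acceptance criterion. The helper names entropy, best_cut and mdlp_accepts are hypothetical and are not part of arc. The full algorithm applies this step recursively to each resulting interval, and the package's own implementation may differ in detail.

```r
# Class entropy (in bits) of a vector of labels.
entropy <- function(y) {
  p <- table(y) / length(y)
  p <- p[p > 0]                       # drop empty classes to avoid log2(0)
  -sum(p * log2(p))
}

# Weighted class entropy of the two intervals induced by a cutpoint.
split_entropy <- function(x, y, cut) {
  left <- y[x <= cut]
  right <- y[x > cut]
  n <- length(y)
  length(left) / n * entropy(left) + length(right) / n * entropy(right)
}

# Candidate cutpoints are midpoints between consecutive distinct values;
# return the one with the lowest weighted class entropy.
best_cut <- function(x, y) {
  vals <- sort(unique(x))
  if (length(vals) < 2) return(NULL)
  candidates <- (head(vals, -1) + tail(vals, -1)) / 2
  ent <- vapply(candidates, function(cp) split_entropy(x, y, cp), numeric(1))
  i <- which.min(ent)
  list(cut = candidates[i], entropy = ent[i])
}

# MDLP criterion: accept the cut only if the information gain exceeds
# (log2(n - 1) + delta) / n.
mdlp_accepts <- function(x, y, cut) {
  n <- length(y)
  left <- y[x <= cut]
  right <- y[x > cut]
  k <- length(unique(y))
  k1 <- length(unique(left))
  k2 <- length(unique(right))
  gain <- entropy(y) - split_entropy(x, y, cut)
  delta <- log2(3^k - 2) -
    (k * entropy(y) - k1 * entropy(left) - k2 * entropy(right))
  gain > (log2(n - 1) + delta) / n
}

iris_df <- datasets::iris
bc <- best_cut(iris_df$Petal.Length, iris_df$Species)
accepted <- mdlp_accepts(iris_df$Petal.Length, iris_df$Species, bc$cut)
```

On iris, the first cut for Petal.Length falls in the gap between the setosa flowers and the other two species, and the criterion accepts it.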

Usage

discrNumeric(
  df,
  classatt,
  min_distinct_values = 3,
  unsupervised_bins = 3,
  discretize_class = FALSE
)

Arguments

df

a data frame containing the data to discretize.

classatt

name of the class attribute in df.

min_distinct_values

the minimum number of unique values a column needs to have to be subject to supervised discretization.
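As an illustration of this threshold, the base R sketch below counts the distinct values in each numeric column of a toy data frame and keeps the columns that reach the minimum. The data frame and the variable names are made up for the example, and treating the threshold as inclusive (>=) is an assumption based on the wording above, not a statement about the package internals.

```r
# Toy data: column a has 2 distinct values, column b has 5.
df <- data.frame(
  a = c(0, 1, 0, 1, 1),
  b = c(1.5, 2.5, 3.5, 4.5, 5.5),
  cls = factor(c("x", "y", "x", "y", "y"))
)
min_distinct_values <- 3

# Restrict the check to numeric columns; the factor column cls is skipped.
numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]

# Count distinct values per numeric column.
n_distinct <- vapply(df[numeric_cols],
                     function(col) length(unique(col)),
                     integer(1))

# Assumption: a column qualifies when it has at least min_distinct_values.
eligible <- names(n_distinct)[n_distinct >= min_distinct_values]
```

Here only b would be a candidate for supervised discretization; a behaves like a binary indicator and is left alone.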

unsupervised_bins

number of target bins for discretizing the class attribute. Ignored when the class attribute is not numeric or when discretize_class is set to FALSE.

discretize_class

logical value indicating whether the class attribute should be discretized. Ignored when the class attribute is not numeric.
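The unsupervised step for a numeric class attribute can be sketched with stats::kmeans from base R. This is a minimal illustration rather than the package's code: the sample vector is invented, bins stands in for unsupervised_bins, and placing each cutpoint halfway between neighbouring cluster centers is an assumption about how cluster centers are turned into intervals.

```r
set.seed(42)                 # k-means starts from random centers
y <- c(1.0, 1.2, 0.9, 5.1, 4.8, 5.3, 9.7, 10.2, 10.0)
bins <- 3                    # plays the role of unsupervised_bins

# Cluster the one-dimensional values; several restarts guard against
# a poor local optimum.
km <- stats::kmeans(y, centers = bins, nstart = 25)
centers <- sort(as.numeric(km$centers))

# Assumption: cutpoints lie halfway between neighbouring cluster centers.
cutpoints <- (head(centers, -1) + tail(centers, -1)) / 2

# Map each value to its interval.
y_binned <- cut(y, breaks = c(-Inf, cutpoints, Inf))
bin_counts <- table(y_binned)
```

With three well-separated groups of values, each of the three bins receives one group.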

Value

A list with two slots: $cutp, containing the cutpoints, and $Disc.data, containing the discretization results.
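A short sketch of how the two slots might be inspected, assuming the arc package is installed. The call is guarded so the snippet is skipped when arc is unavailable, and str() is used because the exact structure of each slot is not described here.

```r
if (requireNamespace("arc", quietly = TRUE)) {
  res <- arc::discrNumeric(datasets::iris, "Species")

  # Cutpoints found for the numeric columns.
  str(res$cutp)

  # Data after discretization.
  str(res$Disc.data)
}
```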

References

Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI), 1022–1027.

Examples

  discrNumeric(datasets::iris, "Species")


[Package arc version 1.3 Index]