R: Class "SDTaskConfig" - A Set of Configuration Settings

SDTaskConfig-class {rsubgroup}

R Documentation

Class “SDTaskConfig” — A Set of Configuration Settings

Description

A Set of Configuration Settings for the Subgroup and Pattern Mining Algorithms

Objects from the Class

Objects are created by calls of the form SDTaskConfig(...).
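
As an illustration of the call form above, the following is a minimal sketch rather than an official example. The slot names come from the Slots section of this page; the attribute names are hypothetical column names, and the integer literals (10L, 3L) are an assumption about how the integer-valued slots are stored. The object is only built if the rsubgroup package can be loaded.

```r
# Illustrative settings; the attribute names are hypothetical column names.
settings <- list(
  attributes = c("checking_status", "employment"),
  method     = "sdmap",  # documented default mining method
  qf         = "wracc",  # Weighted Relative Accuracy
  k          = 10L,      # keep the ten best patterns
  maxlen     = 3L        # maximal description length of three
)

# Build the object only when rsubgroup (and its Java backend) can be loaded.
if (requireNamespace("rsubgroup", quietly = TRUE)) {
  library(rsubgroup)
  config <- do.call(SDTaskConfig, settings)
  print(config@qf)
}
```

Slots that are not passed keep the defaults listed under Slots.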

Slots

attributes:: The list of attributes to consider for mining. Either a vector of attribute names, or NULL (the default), which includes all attributes.
discretize:: Boolean, indicating whether to (automatically) discretize numeric attributes (default discretize = TRUE). Depends on the parameter nbins: a numeric attribute either keeps its distinct values, if their number in the dataset is <= nbins, or is discretized by equal-frequency binning.
method:: A mining method; one of: Beam-Search ("beam"), BSD ("bsd"), SD-Map ("sdmap"), SD-Map enabling internal disjunctions ("sdmap-dis"). The default is method = "sdmap".
nbins:: Specifies the number of bins to be used when discretizing numeric attributes (see discretize above).
qf:: A quality function; one of: Adjusted Residuals ("ares"), Binomial Test ("bin"), Chi-Square Test ("chi2"), Gain ("gain"), Lift ("lift"), Piatetsky-Shapiro ("ps"), Relative Gain ("relgain"), Weighted Relative Accuracy ("wracc"). The default is qf = "ps".
k:: The maximum number (top-k) of patterns to discover, i.e., the best k rules according to the selected quality function. The default is k = 20.
minqual:: The minimal quality (default minqual = 0).
minsize:: The minimal size of a subgroup, an integer, i.e., the minimal number of database records it covers (default minsize = 0).
mintp:: The minimal true positive (tp) threshold, an integer, i.e., the minimal absolute number of true positives in a subgroup; relevant for binary target concepts only (default mintp = 0).
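
The rule described for discretize and nbins can be sketched in base R as follows. This is an illustration of the described behavior, not the package's implementation; the function name discretize_sketch is hypothetical, and the exact bin boundaries chosen by the package may differ.

```r
# Sketch of the discretization rule described for `discretize` and `nbins`.
# Hypothetical helper, not part of rsubgroup.
discretize_sketch <- function(x, nbins) {
  distinct <- sort(unique(x))  # sort() drops NA values
  if (length(distinct) <= nbins) {
    # Few distinct values: keep each value as its own level.
    return(factor(x, levels = distinct))
  }
  # Otherwise cut at quantiles so that bins hold roughly equal counts.
  probs  <- seq(0, 1, length.out = nbins + 1)
  breaks <- unique(quantile(x, probs = probs, names = FALSE, na.rm = TRUE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

few  <- discretize_sketch(c(1, 2, 2, 3, 1), nbins = 3)  # three distinct values kept
many <- discretize_sketch(1:100, nbins = 4)             # four equal-frequency bins
table(many)
```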

maxlen:: The maximal length of a description of a pattern, i.e., the maximal number of conjunctions. This impacts both understandability and efficiency. Simpler rules are easier to understand, and a small maxlen will restrict the search space (default maxlen = 7).
nodefaults:: Ignore default values, i.e., do not include the respective first value (with index 0) of each attribute (default nodefaults = FALSE, i.e., include all values).
relfilter:: Controls whether irrelevant patterns are filtered during pattern mining; enabling it negatively impacts performance (default relfilter = FALSE).
postfilter:: Controls whether a post-processing filter is applied; one (or a vector) of: Minimum Improvement (Global) ("min-improve-global"), which checks the patterns against all possible generalizations; Minimum Improvement (Pattern Set) ("min-improve-set"), which checks the patterns against all their generalizations in the result set; Relevancy Filter ("relevancy"), which removes patterns that are strictly irrelevant; Significant Improvement (Global) ("sig-improve-global"), which removes patterns that do not significantly improve (default 0.01 level) w.r.t. all their possible generalizations; Significant Improvement (Set) ("sig-improve-set"), which removes patterns that do not significantly improve (default 0.01 level) w.r.t. all generalizations in the result set; Weighted Covering ("weighted-covering"), which performs weighted covering on the data in order to select a covering set of subgroups while reducing the overlap on the data. By default, no postfilter is set, i.e., postfilter = "".
parfilter:: Provides the minimal improvement value for the postfilter (for min-improve-* filters), or the significance level (P) for sig-improve-* filters.
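
Since postfilter accepts one identifier or a vector of them, it can help to check the values before building a configuration. The following base R sketch does this against the identifiers listed for postfilter; the helper check_postfilter is hypothetical and not part of rsubgroup.

```r
# Identifiers as listed in the description of the postfilter slot.
known_postfilters <- c(
  "min-improve-global", "min-improve-set", "relevancy",
  "sig-improve-global", "sig-improve-set", "weighted-covering"
)

# Hypothetical helper: returns the filters to apply, or stops on unknown names.
check_postfilter <- function(postfilter) {
  if (identical(postfilter, "")) {
    return(character(0))  # documented default: no postfilter
  }
  unknown <- setdiff(postfilter, known_postfilters)
  if (length(unknown) > 0) {
    stop("Unknown postfilter: ", paste(unknown, collapse = ", "))
  }
  postfilter
}

check_postfilter("")                                 # no filter selected
check_postfilter(c("relevancy", "sig-improve-set"))  # two filters selected
```

For the min-improve-* and sig-improve-* filters, the threshold is then supplied through parfilter as described above.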
