| SDTaskConfig-class {rsubgroup} | R Documentation |
Class “SDTaskConfig” — A Set of Configuration Settings
Description
A Set of Configuration Settings for the Subgroup and Pattern Mining Algorithms
Objects from the Class
Objects are created by calls of the form SDTaskConfig(...).
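As a minimal sketch of such a call, the following sets a handful of the slots documented under Slots and leaves the rest at their defaults. It assumes the rsubgroup package (and the Java runtime it relies on) is installed; the attribute names are hypothetical placeholders for columns in your own data.

```r
library(rsubgroup)  # assumes rsubgroup and its Java dependency are installed

# Only slots documented on this page are set here;
# any slot not mentioned keeps its default value.
config <- SDTaskConfig(
  attributes = c("checking_status", "employment"),  # hypothetical column names
  method     = "beam",    # Beam-Search instead of the default "sdmap"
  qf         = "wracc",   # Weighted Relative Accuracy instead of the default "ps"
  k          = 10,        # keep only the 10 best patterns
  minsize    = 20,        # subgroups must cover at least 20 records
  maxlen     = 3,         # at most 3 conjunctions per description
  nbins      = 4          # bins used when discretizing numeric attributes
)

# Slots of an S4 object can be inspected with the @ operator
config@qf
config@maxlen
```

A small maxlen together with a non-zero minsize tends to keep the search space manageable, which is the trade-off described for those two slots.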
Slots
attributes: The list of attributes to consider for mining. Either a vector of attribute names, or NULL (the default), which includes all attributes.
discretize: Boolean, indicating whether to (automatically) discretize numeric attributes (default discretize = TRUE). Depends on parameter nbins. Either creates distinct values, if their number in the dataset is <= nbins, or applies equal-frequency discretization for the respective numeric attribute.
method: A mining method; one of: Beam-Search (beam), BSD (bsd), SD-Map (sdmap), SD-Map enabling internal disjunctions (sdmap-dis). The default is method = "sdmap".
nbins: Specifies the number of bins to be used when discretizing numeric attributes (see discretize above).
qf: A quality function; one of: Adjusted Residuals (ares), Binomial Test (bin), Chi-Square Test (chi2), Gain (gain), Lift (lift), Piatetsky-Shapiro (ps), Relative Gain (relgain), Weighted Relative Accuracy (wracc). The default is qf = "ps".
k: The maximum number (top-k) of patterns to discover, i.e., the best k rules according to the selected quality function. The default is k = 20.
minqual: The minimal quality (default minqual = 0).
minsize: The minimal size of a subgroup, as an integer (minimal coverage of database records; default minsize = 0).
mintp: The minimal true positive (tp) threshold, an integer (minimal absolute number of true positives in a subgroup; relevant for binary target concepts only). Defaults to mintp = 0.
maxlen: The maximal length of a description of a pattern, i.e., the maximal number of conjunctions. This impacts both understandability and efficiency. Simpler rules are easier to understand, and a small maxlen will restrict the search space (default maxlen = 7).
nodefaults: Ignore default values, i.e., do not include the respective first value (with index 0) of each attribute (default nodefaults = FALSE, i.e., include all values).
relfilter: Controls whether irrelevant patterns are filtered during pattern mining; negatively impacts performance (default relfilter = FALSE).
postfilter: Controls whether a post-processing filter is applied; one (or a vector) of: Minimum Improvement (Global) (min-improve-global), checks the patterns against all possible generalizations; Minimum Improvement (Pattern Set) (min-improve-set), checks the patterns against all their generalizations in the result set; Relevancy Filter (relevancy), removes patterns that are strictly irrelevant; Significant Improvement (Global) (sig-improve-global), removes patterns that do not significantly improve (default 0.01 level) w.r.t. all their possible generalizations; Significant Improvement (Set) (sig-improve-set), removes patterns that do not significantly improve (default 0.01 level) w.r.t. all generalizations in the result set; Weighted Covering (weighted-covering), performs weighted covering on the data in order to select a covering set of subgroups while reducing the overlap on the data. By default no postfilter is set, i.e., postfilter = "".
parfilter: Provides the minimal improvement value for the postfilter (for min-improve-* filters), or the significance level (P) for sig-improve-* filters.
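The rule described for discretize and nbins can be illustrated in base R. The helper below, discretize_sketch, is a hypothetical name and is not part of rsubgroup; it only mimics the documented behaviour (keep distinct values when there are at most nbins of them, otherwise use equal-frequency bins) and may differ from the package's internal implementation in how ties and bin boundaries are handled.

```r
# Hypothetical helper, NOT part of rsubgroup: a base-R sketch of the
# rule documented for the discretize / nbins slots.
discretize_sketch <- function(x, nbins = 3) {
  distinct <- sort(unique(x))
  if (length(distinct) <= nbins) {
    # Few enough distinct values: each value becomes its own category
    return(factor(x, levels = distinct))
  }
  # Otherwise cut at empirical quantiles so that each bin holds
  # roughly the same number of records (equal-frequency binning)
  probs  <- seq(0, 1, length.out = nbins + 1)
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE))
  cut(x, breaks = breaks, include.lowest = TRUE)
}

# Three distinct values with nbins = 3: values are kept as categories
few_values <- discretize_sketch(c(1, 2, 2, 3), nbins = 3)

# Twelve distinct values with nbins = 3: three equal-frequency bins
many_values <- discretize_sketch(1:12, nbins = 3)
table(many_values)
```

The call to unique() on the breaks guards against duplicated quantiles in heavily tied data, at the cost of producing fewer than nbins bins in that case.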
See Also
DiscoverSubgroups, DiscoverSubgroupsByTask, CreateSDTask.