dig_implications {nuggets}    R Documentation

Search for implicative rules

Description

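An implicative rule is a rule of the form A \Rightarrow c, where A (the antecedent) is a set of predicates and c (the consequent) is a single predicate. The function searches the data x for such rules that satisfy the given thresholds on length, coverage, support, and confidence.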

Usage

dig_implications(
  x,
  antecedent = everything(),
  consequent = everything(),
  disjoint = NULL,
  min_length = 0L,
  max_length = Inf,
  min_coverage = 0,
  min_support = 0,
  min_confidence = 0,
  t_norm = "goguen",
  ...
)

Arguments

x

a matrix or data frame with data to search in. The matrix must be numeric (double) or logical. If x is a data frame then each column must be either numeric (double) or logical.

antecedent

a tidyselect expression (see tidyselect syntax) specifying the columns to use in the antecedent (left) part of the rules

consequent

a tidyselect expression (see tidyselect syntax) specifying the columns to use in the consequent (right) part of the rules

disjoint

an atomic vector of size equal to the number of columns of x that specifies the groups of predicates: if some elements of the disjoint vector are equal, then the corresponding columns of x will NOT be present together in a single condition.

min_length

the minimum length, i.e., the minimum number of predicates in the antecedent, of a rule to be generated. The value must be greater than or equal to 0. If 0, rules with an empty antecedent are generated first.

max_length

The maximum length, i.e., the maximum number of predicates in the antecedent, of a rule to be generated. If equal to Inf, the maximum length is limited only by the number of available predicates.

min_coverage

the minimum coverage of a rule in the dataset x. (See Details for the definition of coverage.)

min_support

the minimum support of a rule in the dataset x. (See Details for the definition of support.)

min_confidence

the minimum confidence of a rule in the dataset x. (See Details for the definition of confidence.)

t_norm

a t-norm used to compute conjunction of weights. It must be one of "goedel" (minimum t-norm), "goguen" (product t-norm), or "lukas" (Lukasiewicz t-norm).

...

Further arguments, currently unused.
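
As an illustration of the disjoint argument described above, the following minimal sketch builds a grouping vector in base R. The column names and the "variable=level" naming pattern are assumptions made for this example, and can_combine() is a hypothetical helper that only mirrors the stated rule; it is not part of the package.

```r
# Hypothetical column names following a "variable=level" pattern (an assumption).
cols <- c("mpg=low", "mpg=high", "cyl=4", "cyl=6", "cyl=8")

# One group label per column: everything before the "=" sign.
disjoint <- sub("=.*$", "", cols)
print(disjoint)

# Hypothetical helper mirroring the documented rule: two columns may appear
# together in one condition only if their group labels differ.
can_combine <- function(i, j) disjoint[i] != disjoint[j]

can_combine(1, 2)  # both columns belong to group "mpg"
can_combine(1, 3)  # groups "mpg" and "cyl" differ
```

A vector built this way has one element per column of x, as the argument requires.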

Details

The explanations below rely on a function supp(I), defined for a set I of predicates as the relative frequency of rows that satisfy all predicates in I. For logical data, supp(I) equals the relative frequency of rows in which all predicates i_1, i_2, \ldots, i_n from I are TRUE. For numeric (double) input, supp(I) is computed as the mean (over all rows) of the truth degrees of the formula i_1 AND i_2 AND ... AND i_n, where AND is the triangular norm selected by the t_norm argument.
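
The definition of supp(I) can be sketched in base R as follows. This is an illustration of the definition only, not the package's internal implementation; t_norm_and() and supp() are hypothetical helpers, and the data values are made up.

```r
# Hypothetical helper: conjunction of two vectors of truth degrees
# under one of the three t-norms named by the t_norm argument.
t_norm_and <- function(a, b, t_norm = c("goguen", "goedel", "lukas")) {
  t_norm <- match.arg(t_norm)
  switch(t_norm,
    goedel = pmin(a, b),          # minimum t-norm
    goguen = a * b,               # product t-norm
    lukas  = pmax(0, a + b - 1)   # Lukasiewicz t-norm
  )
}

# Hypothetical helper: mean over all rows of the conjunction of the
# selected predicates. Logical columns are treated as degrees 0 and 1.
supp <- function(data, predicates, t_norm = "goguen") {
  degrees <- rep(1, nrow(data))   # the empty conjunction is fully true
  for (p in predicates) {
    degrees <- t_norm_and(degrees, as.numeric(data[[p]]), t_norm)
  }
  mean(degrees)
}

# Made-up truth degrees for two predicates.
d <- data.frame(a = c(1, 0.5, 0.2, 0), b = c(0.8, 0.5, 1, 1))

supp(d, c("a", "b"), "goedel")  # approximately 0.375
supp(d, c("a", "b"), "goguen")  # approximately 0.3125
supp(d, c("a", "b"), "lukas")   # approximately 0.25
```

On logical input all three t-norms give the same result, because they coincide on the values 0 and 1.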

Implicative rules are characterized with the following quality measures.

Length of a rule is the number of elements in the antecedent.

Coverage of a rule is equal to supp(A).

Support of a rule is equal to supp(A \cup \{c\}).

Confidence of a rule is the fraction supp(A \cup \{c\}) / supp(A), i.e., support divided by coverage.
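
As a worked example of these measures, the sketch below evaluates one rule by hand on a small logical data set. The data, the column names, and the helper supp_lgl() are assumptions made for illustration. Confidence is taken as support divided by coverage, so that it lies between 0 and 1.

```r
# Toy logical data; the column names are hypothetical.
d <- data.frame(
  rain  = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  wind  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  delay = c(TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Hypothetical helper: relative frequency of rows in which all of the
# given predicates are TRUE (supp(I) for logical data).
supp_lgl <- function(data, predicates) {
  if (length(predicates) == 0) return(1)
  mean(Reduce(`&`, as.list(data[predicates])))
}

# The rule {rain, wind} => delay
antecedent <- c("rain", "wind")
consequent <- "delay"

rule_length <- length(antecedent)                      # 2 predicates
coverage    <- supp_lgl(d, antecedent)                 # 2 of 5 rows
support     <- supp_lgl(d, c(antecedent, consequent))  # 1 of 5 rows
confidence  <- support / coverage                      # 1 of the 2 covered rows

print(c(length = rule_length, coverage = coverage,
        support = support, confidence = confidence))
```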

Value

A tibble with found rules and computed quality measures.

Author(s)

Michal Burda

See Also

dig()
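
Examples

The following is a minimal sketch of a call that uses only the arguments listed under Usage. The data set and its column names are made up, and the threshold values are arbitrary. The call is guarded so that the code also runs where the package, or this function, is not available. The exact columns of the returned tibble are not shown here.

```r
# Made-up logical data; the column names are hypothetical.
d <- data.frame(
  rain  = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  wind  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  delay = c(TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Run the search only if nuggets is installed and exports dig_implications().
if (requireNamespace("nuggets", quietly = TRUE) &&
    exists("dig_implications", envir = asNamespace("nuggets"))) {
  rules <- nuggets::dig_implications(
    d,
    antecedent = c(rain, wind),   # columns allowed on the left-hand side
    consequent = delay,           # column allowed on the right-hand side
    min_support = 0.2,            # arbitrary threshold for illustration
    min_confidence = 0.5          # arbitrary threshold for illustration
  )
  print(rules)
}
```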


[Package nuggets version 1.0.2 Index]