support {arules} | R Documentation |
Support Counting for Itemsets
Description
Provides the generic function support()
and the methods to count support for
a given itemMatrix or associations in a given set of
transactions.
Usage
support(x, transactions, ...)
## S4 method for signature 'itemMatrix'
support(
x,
transactions,
type = c("relative", "absolute"),
method = c("ptree", "tidlists"),
reduce = FALSE,
weighted = FALSE,
verbose = FALSE,
...
)
## S4 method for signature 'associations'
support(
x,
transactions,
type = c("relative", "absolute"),
method = c("ptree", "tidlists"),
reduce = FALSE,
weighted = FALSE,
verbose = FALSE,
...
)
Arguments
x |
the set of itemsets for which support should be counted. |
transactions |
the transaction data set used for mining. |
... |
further arguments. |
type |
a character string specifying if "relative" support or "absolute" support (counts) is returned. |
method |
use "ptree" or "tidlists". See Details. |
reduce |
should unused items be removed before counting? |
weighted |
should support be weighted by the transaction weights stored as
column |
verbose |
report progress? |
Details
Normally, the support of frequent itemsets is counted very efficiently during
the mining process using a set minimum support.
However, if only the support of specific itemsets (e.g., itemsets with very low support)
is needed, or the support of a set of itemsets needs to be recalculated on
transactions other than the ones they were mined on, then support()
can be used.
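The second use case above can be sketched as follows. This is a minimal illustration, not a recommended workflow: the half-and-half split of the Income data and the minimum support of 0.5 are assumptions chosen only for the example, and the variable names are hypothetical.

```r
library(arules)
data("Income")

# Hypothetical split: mine on the first half, recount on the second half.
n <- length(Income)
mined_on <- Income[1:(n %/% 2)]
recount_on <- Income[(n %/% 2 + 1):n]

# Mine frequent itemsets on the first half only (threshold is an assumption).
itemsets <- eclat(mined_on, parameter = list(support = 0.5),
                  control = list(verbose = FALSE))

# Support stored during mining vs. support recounted on other transactions.
mined_support <- quality(itemsets)$support
recounted_support <- support(itemsets, recount_on)
data.frame(mined = mined_support, recounted = recounted_support)
```

The recounted values are relative to the number of transactions in recount_on, since type defaults to "relative".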
Several methods for support counting are available:
- "ptree" (default method): The counters for the itemsets are organized in a prefix tree. The transactions are processed sequentially and the corresponding counters in the prefix tree are incremented (see Hahsler et al., 2008). This method is used by default since it is typically significantly faster than transaction ID list intersection.
- "tidlists": Support is counted using transaction ID list intersection, which is used by several fast mining algorithms (e.g., by Eclat). However, support is determined for each itemset individually, which is slow for a large number of long itemsets in dense data.
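The idea behind transaction ID list intersection can be illustrated in base R. This is a toy sketch under stated assumptions, not the implementation used by arules: the four-transaction database and the helper count_support() are hypothetical and exist only for this example.

```r
# Toy transaction database: each element is the set of items in one transaction.
transactions <- list(
  c("a", "b", "c"),
  c("a", "c"),
  c("b", "d"),
  c("a", "b", "c", "d")
)

# Build transaction ID lists: for each item, the IDs of the transactions containing it.
all_items <- sort(unique(unlist(transactions)))
tidlists <- lapply(all_items, function(i) {
  which(vapply(transactions, function(tr) i %in% tr, logical(1)))
})
names(tidlists) <- all_items

# Hypothetical helper: the support count of an itemset is the size of the
# intersection of the ID lists of its items.
count_support <- function(itemset) {
  length(Reduce(intersect, tidlists[itemset]))
}

itemsets <- list(c("a", "c"), c("b", "d"), c("a", "b", "c"))
abs_support <- vapply(itemsets, count_support, integer(1))  # 3, 2, 2
rel_support <- abs_support / length(transactions)           # 0.75, 0.50, 0.50
```

Each itemset needs its own chain of intersections, which is why this approach becomes slow for many long itemsets.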
To speed up counting, reduce = TRUE
can be specified. Unused items
are then removed from the transactions before counting.
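As a quick check, the counting options can be compared on the same itemsets. This sketch assumes the Income data and the first five itemsets found by eclat() with default settings; the expectation is that all variants return the same support values and differ only in speed.

```r
library(arules)
data("Income")

# A few itemsets to count (the selection of five is arbitrary).
itemsets <- eclat(Income, control = list(verbose = FALSE))[1:5]
x <- items(itemsets)

s_default <- support(x, Income)                        # prefix tree, no reduction
s_reduced <- support(x, Income, reduce = TRUE)         # drop unused items first
s_tidlists <- support(x, Income, method = "tidlists")  # ID list intersection

# The options should only affect run time, not the result.
all.equal(s_default, s_reduced)
all.equal(s_default, s_tidlists)
```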
Value
A numeric vector of the same length as x containing the
support values for the sets in x.
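The relationship between the two values of type can be sketched as follows, again assuming the Income data and an arbitrary selection of five itemsets: absolute support is the number of matching transactions, and relative support is that count divided by the number of transactions.

```r
library(arules)
data("Income")

itemsets <- eclat(Income, control = list(verbose = FALSE))[1:5]
x <- items(itemsets)

rel <- support(x, Income, type = "relative")        # proportions
abs_count <- support(x, Income, type = "absolute")  # transaction counts

# Both vectors have one entry per itemset in x.
length(rel) == length(x)
all.equal(rel, abs_count / length(Income))
```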
Author(s)
Michael Hahsler and Christian Buchta
References
Michael Hahsler, Christian Buchta, and Kurt Hornik. Selective association rule generation. Computational Statistics, 23(2):303-315, April 2008.
See Also
Other interest measures: confint(), coverage(), interestMeasure(), is.redundant(), is.significant()
Examples
data("Income")
## find some frequent itemsets
itemsets <- eclat(Income)[1:5]
## inspect the support returned by eclat
inspect(itemsets)
## count support in the database
support(items(itemsets), Income)