opus {opusminer} | R Documentation |
Filtered Top-k Association Discovery of Self-Sufficient Itemsets
Description
opus finds the top k productive, non-redundant itemsets on the
measure of interest (leverage or lift) using the OPUS Miner algorithm.
Usage
opus(transactions, k = 100, format = "data.frame", sep = " ",
print_closures = FALSE, filter_itemsets = TRUE, search_by_lift = FALSE,
correct_for_mult_compare = TRUE, redundancy_tests = TRUE)
Arguments
transactions |
A filename, list, or object of class transactions (arules). |
k |
The number of itemsets to return, an integer (default 100). |
format |
The output format ("data.frame", default, or "itemsets"). |
sep |
The separator between items (for files, default " "). |
print_closures |
Return the closure for each itemset (default FALSE). |
filter_itemsets |
Filter out itemsets that are not independently productive (default TRUE). |
search_by_lift |
Make lift (rather than leverage) the measure of interest (default FALSE). |
correct_for_mult_compare |
Correct alpha for the size of the search space (default TRUE). |
redundancy_tests |
Exclude redundant itemsets (default TRUE). |
Details
opus provides an interface to the OPUS Miner algorithm (implemented in
C++) to find the top k productive, non-redundant itemsets by leverage
(default) or lift.
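As a rough illustration of the two measures, the following base R sketch computes leverage and lift for a single pair of items using the standard textbook definitions (leverage = supp(A,B) - supp(A) * supp(B); lift = supp(A,B) / (supp(A) * supp(B))). The toy data and the helper `support` are hypothetical and not part of opusminer. OPUS Miner generalises these measures to larger itemsets, which this sketch does not attempt to reproduce.

```r
# Toy transactions: each element is a character vector of item labels.
# (Hypothetical data, for illustration only.)
transactions <- list(
  c("bread", "butter"),
  c("bread", "butter", "milk"),
  c("bread"),
  c("milk"),
  c("butter", "milk")
)

# Hypothetical helper: the fraction of transactions that contain
# every item in `items`.
support <- function(items, transactions) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

s_a  <- support("bread", transactions)
s_b  <- support("butter", transactions)
s_ab <- support(c("bread", "butter"), transactions)

# Leverage: observed joint support minus the support expected if the
# two items were independent.
leverage <- s_ab - s_a * s_b

# Lift: ratio of observed joint support to the support expected under
# independence.
lift <- s_ab / (s_a * s_b)
```

A positive leverage (or a lift above 1) indicates that the two items occur together more often than independence would predict.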
transactions should be a filename, a list of transactions (each list
element being a vector of character values representing item labels), or an
object of class transactions (arules).
Files should contain one transaction per line, each transaction (i.e., line)
being a sequence of item labels separated by the character specified by the
parameter sep (default " "). See, for example, the files at
http://fimi.ua.ac.be/data/. (Alternatively, files can be read separately
using the read_transactions function.)
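The file layout described above can be sketched with base R alone. The example writes a small, hypothetical list of transactions to a temporary file, one transaction per line with items separated by a space, and then reads it back into the same list structure. It does not call opusminer; the names `trans_list`, `path`, and `read_back` are illustrative only.

```r
# Transactions as a list of character vectors, one vector per transaction.
# (Hypothetical data, for illustration only.)
trans_list <- list(
  c("a", "b", "c"),
  c("a", "c"),
  c("b", "d")
)

# Write one transaction per line, with items separated by `sep`.
sep <- " "
path <- tempfile(fileext = ".dat")
writeLines(vapply(trans_list, paste, character(1), collapse = sep), path)

# Read the file back into a list of character vectors.
read_back <- strsplit(readLines(path), sep, fixed = TRUE)

# The round trip preserves the list structure.
identical(read_back, trans_list)
```

Going by the description of the transactions argument, either the file at `path` or the list `trans_list` should then be usable as input to opus.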
format should be specified as either "data.frame" (the default) or
"itemsets"; any other value will return a list.
Value
The top k productive, non-redundant itemsets, with relevant
statistics, in the form of a data frame, an object of class
itemsets (arules), or a list.
References
Webb, G. I., & Vreeken, J. (2014). Efficient Discovery of the Most Interesting Associations. ACM Transactions on Knowledge Discovery from Data, 8(3), 1-15. doi: http://dx.doi.org/10.1145/2601433
Examples
## Not run:
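# Find the top 100 itemsets (the default k) from a transaction file
result <- opus("mushroom.dat")
# Find only the top 50 itemsets
result <- opus("mushroom.dat", k = 50)
# Keep only the rows flagged as self-sufficient
result[result$self_sufficient, ]
# Sort the itemsets by count, largest first
result[order(result$count, decreasing = TRUE), ]
# Read the file separately as an arules transactions object
trans <- read_transactions("mushroom.dat", format = "transactions")
# Include the closure of each itemset in the output
result <- opus(trans, print_closures = TRUE)
# Return an arules itemsets object instead of a data frame
result <- opus(trans, format = "itemsets")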
## End(Not run)