opusminer-package {opusminer}    R Documentation

Filtered Top-k Association Discovery of Self-Sufficient Itemsets

Description

The opusminer package provides an R interface to the OPUS Miner algorithm (implemented in C++), developed by Professor Geoffrey I. Webb, for finding the top k productive, non-redundant itemsets with respect to a user-specified measure of interest.

Details

OPUS Miner is a branch-and-bound algorithm for efficient discovery of self-sufficient itemsets. For a user-specified k and interest measure, OPUS Miner finds the top k productive non-redundant itemsets with respect to the specified measure. It is then straightforward to filter out those that are not independently productive with respect to that set, resulting in a set of self-sufficient itemsets.
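To make the idea of an interest measure concrete, the following Python sketch scores an itemset by leverage. It assumes one common formulation, which is not stated on this page: the support of the itemset minus the largest support expected if some two-way split of the itemset were independent. The function names `support` and `leverage` are hypothetical and are not part of the opusminer API.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def leverage(itemset, transactions):
    """Assumed definition: observed support minus the highest support
    expected under independence of any two-way split of the itemset.
    Requires an itemset with at least two items."""
    items = sorted(itemset)
    if len(items) < 2:
        raise ValueError("leverage needs an itemset with at least two items")
    first, rest = items[0], items[1:]
    best_expected = 0.0
    # Keep `first` on the left side so each split is enumerated once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            left = frozenset((first,) + extra)
            right = frozenset(items) - left
            expected = support(left, transactions) * support(right, transactions)
            best_expected = max(best_expected, expected)
    return support(items, transactions) - best_expected


transactions = [frozenset("ab"), frozenset("ab"), frozenset("a"), frozenset("c")]
print(leverage({"a", "b"}, transactions))  # 0.5 - 0.75 * 0.5 = 0.125
```

A positive score means the items occur together more often than any split into two independent parts would predict. A score of zero or less suggests the itemset adds nothing beyond its parts.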

OPUS Miner is based on the OPUS search algorithm. OPUS is a set enumeration algorithm distinguished by a computationally efficient pruning mechanism that ensures that whenever an item is pruned, it is removed from the entire search space below the parent node.
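The pruning behaviour described above can be illustrated with a minimal set-enumeration sketch in Python. It is not the package's C++ implementation. The function `enumerate_itemsets` and the `can_prune` callback are hypothetical names introduced for illustration. The key point is that pruning is applied to the candidate list before any child is expanded, so a pruned item disappears from every descendant of the current node rather than from a single branch.

```python
def enumerate_itemsets(items, can_prune):
    """Visit each itemset at most once, in depth-first order.

    `can_prune(current, item)` returns True when no itemset that extends
    `current` with `item` needs to be explored.
    """
    visited = []

    def expand(current, candidates):
        # Prune first: items dropped here are unavailable to all descendants.
        viable = [item for item in candidates if not can_prune(current, item)]
        for index, item in enumerate(viable):
            child = current + (item,)
            visited.append(child)
            # Children may only add items that come later in the viable list,
            # which guarantees that each itemset is generated once.
            expand(child, viable[index + 1:])

    expand((), list(items))
    return visited


# Without pruning, all 7 non-empty subsets of {a, b, c} are visited.
print(enumerate_itemsets("abc", lambda current, item: False))

# Pruning "b" at the root removes every itemset that contains "b".
print(enumerate_itemsets("abc", lambda current, item: current == () and item == "b"))
```

In a real search the `can_prune` test would come from a bound on the interest measure. Here it is an arbitrary predicate so that the effect on the search space is easy to see.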

OPUS Miner systematically traverses the viable regions of the search space depth-first, maintaining a collection of the top k productive, non-redundant itemsets found so far. Once every viable region has been explored, that collection must be the top k for the entire search space, because the pruned regions cannot contain an itemset that would displace one of them.
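The following Python sketch shows the general pattern of a depth-first branch-and-bound search that maintains a running top k. It makes simplifying assumptions: raw support count stands in for the interest measure, and the productivity and redundancy tests are omitted. The function name `top_k_by_support` and its parameters are hypothetical. Because support can never increase when an item is added, the count of a node is an upper bound on the count of everything beneath it, so a branch is skipped as soon as its count cannot beat the current k-th best.

```python
import heapq


def top_k_by_support(transactions, k, min_size=2):
    """Return up to k (count, itemset) pairs with the highest support counts,
    considering only itemsets with at least `min_size` items."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    heap = []  # min-heap, so heap[0] holds the current k-th best entry

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def threshold():
        # A branch must strictly beat this value to change the top k.
        return heap[0][0] if len(heap) == k else 0

    def expand(current, candidates):
        viable = []
        for item in candidates:
            child = current | {item}
            c = count(child)
            if c <= threshold():
                continue  # bound: no superset of child can do better
            viable.append((item, child, c))
            if len(child) >= min_size:
                entry = (c, tuple(sorted(child)))
                if len(heap) < k:
                    heapq.heappush(heap, entry)
                else:
                    heapq.heappushpop(heap, entry)
        for index, (item, child, c) in enumerate(viable):
            if c <= threshold():  # the threshold may have risen since scoring
                continue
            expand(child, [later for later, _, _ in viable[index + 1:]])

    expand(frozenset(), items)
    return sorted(heap, reverse=True)


data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
for support_count, itemset in top_k_by_support(data, k=3):
    print(support_count, itemset)
```

When several itemsets tie at the k-th best count, which of them is retained depends on the traversal order. The set of counts returned is the same as an exhaustive search would give.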

A comprehensive explanation of the algorithm is provided in the article cited below.

References

Webb, G. I., & Vreeken, J. (2014). Efficient Discovery of the Most Interesting Associations. ACM Transactions on Knowledge Discovery from Data, 8(3), 1-15. doi: http://dx.doi.org/10.1145/2601433


[Package opusminer version 0.1-1 Index]