Abstract
Self-sufficient itemsets have been proposed as an effective approach to summarizing the key associations
in data. However, their computation appears highly demanding, as assessing whether an itemset is self-sufficient
requires consideration of all partitions of the itemset into pairs of subsets as well as
consideration of all supersets. This paper presents the first published algorithm for efficiently discovering
self-sufficient itemsets. This branch-and-bound algorithm deploys two powerful pruning mechanisms
based on upper bounds on itemset value and statistical significance level. We demonstrate that finding top-k
productive and non-redundant itemsets, with post-processing to identify those that are not independently
productive, can efficiently identify small sets of key associations. We present extensive evaluation of the
strengths and limitations of the technique including comparisons with alternative approaches to finding the
most interesting associations.
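
The partition requirement mentioned above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a simplified criterion in which an itemset passes only if its support exceeds the product of the supports of both parts for every split into two non-empty subsets. It omits the statistical significance testing, the superset checks, and the branch-and-bound pruning that the abstract describes. The function names (`support`, `binary_partitions`, `exceeds_independence`) are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def binary_partitions(itemset):
    """Yield each split of itemset into two non-empty disjoint subsets, once."""
    items = sorted(itemset)
    if len(items) < 2:
        return  # nothing to split
    first, rest = items[0], items[1:]
    # Pinning the first item to the left side avoids mirrored duplicates.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = frozenset((first,) + extra)
            right = frozenset(items) - left
            yield left, right


def exceeds_independence(itemset, transactions):
    """Assumed simplified check: support beats independence for every split."""
    sup = support(itemset, transactions)
    return all(
        sup > support(a, transactions) * support(b, transactions)
        for a, b in binary_partitions(itemset)
    )
```

An itemset of n items has 2^(n-1) - 1 such splits, which is one reason the exhaustive check becomes expensive and pruning matters.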