@savinov

An Efficient Algorithm for Mining Interesting Set-Valued Rules

Computer Science Journal of Moldova, 9 (2): 231--258 (2001)

Abstract

We describe the problem of mining set-valued rules in large relational tables containing categorical attributes that take a finite number of values. An example of such a rule might be 'IF HOUSEHOLDSIZE = Two OR Three AND OCCUPATION = Professional OR Clerical THEN PAYMENT_METHOD = CashCheck (Max=249, Sum=4952) OR DebitCard (Max=175, Sum=3021) WHERE Confidence=85%, Support=10%.' Such rules allow an interval of possible values to be selected for each attribute in the condition, rather than the single value used in association rules, while the conclusion contains a projection of the data restricted by the condition onto a target attribute. An original conceptual and formal framework for representing multidimensional distributions induced from data is used. The distribution is represented by a number of so-called prime disjunctions that upper-bound its surface, each interpreted as a wide multidimensional interval of impossible combinations of attribute values. This formalism generalises the conventional boolean approach in two directions: (i) finite-valued attributes (instead of only 0 and 1), and (ii) continuous-valued semantics (instead of true and false). In addition, we describe an efficient algorithm that carries out the generalised dual transformation from the possibilistic disjunctive normal form (DNF), which represents data, into the conjunctive normal form (CNF), which represents knowledge.
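To make the shape of a set-valued rule concrete, the following Python sketch evaluates one rule against a tiny table. It is not the paper's algorithm: it does not build prime disjunctions or perform the DNF-to-CNF transformation, and it assumes the conventional frequency-based readings of support (share of rows matching the condition) and confidence (share of matching rows whose target value is in the conclusion), which may differ from the possibilistic definitions used in the paper. The function names and the sample rows are hypothetical.

```python
from collections import Counter


def evaluate_condition(rows, condition, target):
    """Return (support, projection) for a set-valued condition.

    rows      -- list of dicts, one per record
    condition -- dict mapping attribute name to the set of allowed values
    target    -- name of the attribute the matching rows are projected onto
    """
    # A row matches when every constrained attribute takes one of its allowed values.
    matching = [
        row for row in rows
        if all(row[attr] in allowed for attr, allowed in condition.items())
    ]
    support = len(matching) / len(rows) if rows else 0.0
    # Projection: how often each target value occurs among the matching rows.
    projection = Counter(row[target] for row in matching)
    return support, projection


def confidence(projection, conclusion_values):
    """Share of matching rows whose target value lies in conclusion_values."""
    total = sum(projection.values())
    if total == 0:
        return 0.0
    return sum(projection[v] for v in conclusion_values) / total


# Hypothetical sample data, invented for illustration only.
rows = [
    {"HOUSEHOLDSIZE": "Two", "OCCUPATION": "Professional", "PAYMENT_METHOD": "CashCheck"},
    {"HOUSEHOLDSIZE": "Three", "OCCUPATION": "Clerical", "PAYMENT_METHOD": "DebitCard"},
    {"HOUSEHOLDSIZE": "Three", "OCCUPATION": "Professional", "PAYMENT_METHOD": "CreditCard"},
    {"HOUSEHOLDSIZE": "One", "OCCUPATION": "Professional", "PAYMENT_METHOD": "CashCheck"},
    {"HOUSEHOLDSIZE": "Two", "OCCUPATION": "Manual", "PAYMENT_METHOD": "DebitCard"},
]
condition = {
    "HOUSEHOLDSIZE": {"Two", "Three"},
    "OCCUPATION": {"Professional", "Clerical"},
}
support, projection = evaluate_condition(rows, condition, "PAYMENT_METHOD")
rule_confidence = confidence(projection, {"CashCheck", "DebitCard"})
```

Each attribute in the condition carries a set of admissible values, which is what distinguishes this rule form from a classical association rule with one value per attribute.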
