Font Size: a A A

Research On Expanded Models Of Association Rules

Posted on:2003-10-30Degree:MasterType:Thesis
Country:ChinaCandidate:P C WuFull Text:PDF
GTID:2168360092455001Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In recent years, Know1edge Discovery in Databases (KDD) has becomean active research field, which can fifld its relevance to artificia1inte11igence, database system and other discip1ines. 0ne of the majorpatterns to be discovered in KDD endeavors is association ru1es, whosedi scovery prob1 em has been an important research direct ion in KDD fi e1d.This thesis is chief1y devoted to expanded mode1s and mining a1gorithmsof as soc i at ion ru l es.0ne of the basic assumptions in tradi tiona1 associat ion ru1es is thata11 items in the database are equa11y important. Thi s is, however, rare1ytrue in rea1--1 if e app1ications t users' interests may vary from item toitem, users' interests toward a certain item may vary from time to time.A11owing for this fact, two c1asses of expanded associat ion ru1e mode1 shave been proposed: weighted associat ion ru1es, and assoc iat ion ru1 eswith mu1tip1e minimum supports. In the first mode1, each item is 1abe1edwith a weight va1ue denoting the significance of the item, andtraditiona1 support definition is then extended to we1ghted support. Asa resu1t, the generated association ru1es are skewed toward ru1esconcerning items with greater weights (to users' greater interest).The second mode1, on the other hand, takes a different route. lt abandonsthe traditiona1 s ing1e minimuIn support thresho1d. In i ts p1ace, each itemls assoc1ated with a pre--specified minimum item support (MIS). By1owering the MIS thresho1ds for important items, the effect of the firstmode1 can a1so be achieved.The ma1n contribution of the thesis is that, on the bas is of negativeassociation ru1es, we deve1op the mode1 of hybrid associat ion rules.It has been observed that negative association ru1es, the oppositeof (tradit iona1) association ru1es, a1so revea1 readi 1y app1 icab1einformat ion. In KDD commun ity, however, negat ive assoc iat ion ru1es are1ess covered and not comp1ete1y forma1ized. In an effort to formu1atenegative ru1es e1egant1y, we introduce negat ive items into itemsets, andthereby defifle an expanded mode1 of association rules f hybri dassociation rules. This expansion makes hybrid association ru1es asuperset of both traditional and negative association rules.The thesis proposes and elaborates three approaches to hybrid association rule discovery: the direct approach; hash-tree-based approach; and tree-based approach. The first approach derives directly from the new problem setting to generate large itemsets. However, It suffers from great complexity and low efficiency. The hash-tree-based approach is a slight modification of the Apriori algorithm, which, again, we find not satisfying. To further improve the efficiency, we devise another approach based on a different yet similar tree structure. In addition, we also give a pruning algorithm to weed out redundant negative rules. All the algorithms are implemented in C. To assess the performances of the algorithms, we first apply theoretical distribution and use pseudo-random function to generate some synthetic datasets, on which experiments are then performed to find hybrid association rules. However, the model of hybrid association rules has its difficulties too. For instance, the problem of "too many rules" may be especially notorious in our model. As a future research direction, we may apply different thresholds (e.g. weights or multiple minimum supports) on items/itemsets of different nature.
Keywords/Search Tags:Knowledge Discovery in Databases, Association Rule, Itemset, Large Itemset, Negative Item, Negative Association Rule, Hybrid Association Rule
PDF Full Text Request
Related items