Research On The Optimization Of Association Rules

Posted on: 2008-04-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z He
Full Text: PDF
GTID: 1118360212474145
Subject: Computer application technology
Abstract/Summary:
The analysis of association rules has drawn the attention of many researchers since it was proposed, and it has been employed in many real-world fields. However, many problems remain unresolved. Besides the efficiency of finding frequent itemsets and association rules, there are two problems with the results of rule-finding algorithms. The first is that the results consist of too many rules (the rule quantity problem). The second is that the ratio of interesting rules to the whole rule set is too small (the rule quality problem). In this dissertation, we focus on the second problem.

The problem of discovering association rules consists of four elements: the data set, the form of the rule, the search algorithm, and the interestingness measure. In fact, the search algorithm depends on the other three elements. We therefore try to improve the quality of the rules by optimizing the data set, simplifying the form of the association rule, and optimizing the interestingness measure.

The first part of our work is on optimizing the interestingness measure. We adopt residual analysis (RA) to test the dependence of two item(set)s. If they are correlated, the mutual information (MI) measure is used to evaluate the strength of the correlation. Based on these two measures, an algorithm is proposed to find the optimized positively/negatively correlated rules. To avoid producing too many trivial rules, a genetic algorithm is adapted to find those optimized rules. The optimization procedure also completes the task of pruning: with a suitably chosen fitness function, long and complex rules are unlikely to be produced.

The second part of our work is on optimizing the data set for finding quantitative association rules (QAR). We propose two unsupervised, multivariate discretization algorithms (EMVD-BDC and OMVD). The task of discovering association rules is unsupervised, and the rules are supposed to reflect the interaction between the item(set)s.
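
To make the two-stage interestingness measure from the first part concrete, the following is a minimal Python sketch, not the dissertation's own algorithm. It assumes that "residual analysis" refers to the adjusted standardized residual of the co-occurrence cell in a 2x2 contingency table, and that "mutual information" is the MI between the presence indicators of the two item(set)s. The function names and the 1.96 cutoff (the usual 5% two-sided normal threshold) are illustrative assumptions.

```python
import math

def contingency(transactions, x, y):
    """Count the 2x2 table (n11, n10, n01, n00) for itemsets x and y.

    transactions is an iterable of sets; n11 counts transactions that
    contain both x and y, n10 only x, n01 only y, n00 neither.
    """
    x, y = set(x), set(y)
    n11 = n10 = n01 = n00 = 0
    for t in transactions:
        has_x, has_y = x <= t, y <= t
        if has_x and has_y:
            n11 += 1
        elif has_x:
            n10 += 1
        elif has_y:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00

def adjusted_residual(n11, n10, n01, n00):
    """Adjusted standardized residual of the (x present, y present) cell.

    Positive values suggest positive correlation, negative values
    negative correlation. Returns 0.0 when the variance is zero.
    """
    n = n11 + n10 + n01 + n00
    px = (n11 + n10) / n
    py = (n11 + n01) / n
    expected = n * px * py
    variance = expected * (1 - px) * (1 - py)
    if variance == 0:
        return 0.0
    return (n11 - expected) / math.sqrt(variance)

def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between the two presence indicators."""
    n = n11 + n10 + n01 + n00
    rows = [(n11 + n10) / n, (n01 + n00) / n]   # P(x present), P(x absent)
    cols = [(n11 + n01) / n, (n10 + n00) / n]   # P(y present), P(y absent)
    cells = [[n11 / n, n10 / n], [n01 / n, n00 / n]]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            p = cells[i][j]
            if p > 0:  # empty cells contribute nothing
                mi += p * math.log2(p / (rows[i] * cols[j]))
    return mi

def score_rule(transactions, x, y, threshold=1.96):
    """Return (residual, MI) if x and y look dependent, else None.

    threshold is an assumed significance cutoff, not a value taken
    from the dissertation.
    """
    table = contingency(transactions, x, y)
    residual = adjusted_residual(*table)
    if abs(residual) < threshold:
        return None
    return residual, mutual_information(*table)
```

In this sketch the residual acts as a gate (is there any dependence, and of which sign?) and MI ranks the rules that pass it. A genetic search such as the one the abstract mentions could use a score of this kind inside its fitness function, although the abstract does not say how the fitness function is actually defined.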
Keywords/Search Tags: data mining, association rules, optimized association rules, interestingness measure, discretization, relative density, genetic algorithm, clustering