Research On The Technologies Of Association Rules

Posted on: 2008-07-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Shen
GTID: 1118360242972945
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, association rule (AR) technology has developed vigorously alongside data mining in general. Association rule mining aims to find interesting correlations and associations in large volumes of data. Its applications have expanded from market basket analysis in the narrow sense to website design and optimization, network intrusion detection, traffic accident pattern analysis, analysis of medicine ingredients, protein structure analysis, software bug mining, machine fault diagnosis, and so on. Its fundamental research topics have likewise expanded from the original frequent pattern mining to closed pattern mining, maximal pattern mining, extended association rule mining, privacy protection in ARs, incremental mining of ARs, post-mining processing, subjective interestingness measures, correlated patterns, ARs mining from data streams, and others. An in-depth study of the technologies related to association rules is therefore necessary.

Addressing several shortcomings of current AR technology, we propose corresponding solutions and make a number of innovative contributions. The main contents of the dissertation are as follows:

(1) A new interestingness measure of correlation, named All-item-confidence, is proposed. The measure has desirable properties for measuring the item-item correlation of a pattern, for example proper upper and lower bounds and the anti-monotone property. The relationship between All-item-confidence and All-set-confidence and the application scope of the measure are also discussed.

(2) To remedy the weakness of common ARs in symmetric between-set applications, the problem of mining between-item correlated ARs is presented. First, the All-confidence and All-item-confidence measures are used to mine associated and between-item correlated frequent patterns, from which between-item correlated ARs are then derived. Related definitions, descriptions, and an example of between-item correlated ARs are given. Finally, two mining algorithms, ItemCoMine_AP and ItemCoMine_CT, are proposed and their performance is tested.

(3) To address the shortcomings of common ARs in asymmetric between-set applications, the problem of mining between-item and between-set correlated ARs is proposed. After the associated and between-item correlated frequent patterns are obtained, a between-set correlation measure is used to obtain between-item and between-set correlated ARs. Related definitions, descriptions, an example, and the mining algorithms I&ISCoMine_AP and I&ISCoMine_CT are also discussed.

(4) A new definition of dynamic ARs is presented, and two algorithms for mining dynamic ARs are given: ITS, an improved two-stage mining algorithm, and EFP-Growth, an extended algorithm based on the FP-tree. The performance of both algorithms is tested.

(5) A new problem, mining dynamic ARs with comments (DAR-C), is put forward. We first give a representation for candidate effective time periods and then define the concept of DAR-C. The corresponding algorithms, ITS2 and EFP-Growth2, are then presented. DAR-C describes dynamic and skewed databases well.

(6) Mining weighted generalized fuzzy association rules with fuzzy taxonomies (WGF-ARs) is studied. To reflect the importance of different items, the dissertation first defines the notion of generalized weight. The basic theory of WGF-ARs is then proposed, including the definitions of weighted support, weighted confidence, and so on. An illustration explains the corresponding concepts and the computation process. We then prove that the downward closure property holds in the WGF-AR model, and put forward its mining algorithms, W-Apriori and WCT-PRO.

(7) A new method for clustering association rules based on a fuzzy taxonomy with semantic information is proposed. Methods for combining and for establishing fuzzy taxonomies with semantic information are presented. The fuzzy taxonomy with semantic information is then used to compute the distance between rules; a series of methods for computing the distance between items, between item sets, and between rules is proposed. Illustrations explain the ideas behind these methods. Finally, a clustering algorithm is applied to the rule set, and the clustering results are displayed with visualization tools.
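
For readers unfamiliar with the measures named in items (1) and (2), the following is a minimal Python sketch of support, confidence, and the all-confidence measure as it is commonly defined in the association rule literature (support of the pattern divided by the largest single-item support). It is not the dissertation's All-item-confidence, whose definition the abstract does not give. The toy transactions and all function names are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy transaction database (not taken from the dissertation).
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "butter"}),
]


def support_count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(transactions, itemset):
    """Relative support: the fraction of transactions containing `itemset`."""
    return support_count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support_count(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support_count(transactions, set(antecedent) | set(consequent))
    return both / base


def all_confidence(transactions, itemset):
    """All-confidence in its usual textbook form (assumed here):
    count(itemset) / max over single items of count({item})."""
    largest = max(support_count(transactions, {item}) for item in itemset)
    if largest == 0:
        return 0.0
    return support_count(transactions, itemset) / largest


def is_anti_monotone_on(transactions, itemset):
    """Check on this data that no proper subset (size >= 2) of `itemset`
    has a lower all-confidence than `itemset` itself."""
    whole = all_confidence(transactions, itemset)
    items = sorted(itemset)
    return all(
        all_confidence(transactions, subset) >= whole
        for size in range(2, len(items))
        for subset in combinations(items, size)
    )


if __name__ == "__main__":
    print(support(TRANSACTIONS, {"bread", "milk"}))
    print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))
    print(all_confidence(TRANSACTIONS, {"bread", "milk", "butter"}))
    print(is_anti_monotone_on(TRANSACTIONS, {"bread", "milk", "butter"}))
```

The anti-monotone check illustrates why such measures are useful for pruning: once a pattern falls below a threshold, none of its supersets can exceed it, so an Apriori-style search can discard them without counting.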
Keywords: data mining, association rule, interestingness measure of correlation, pruning effect, dynamic association rule, weighted extension, clustering