Font Size: a A A

Research On Tire-Based Algorithm For Minging Association Rules

Posted on:2012-02-15Degree:MasterType:Thesis
Country:ChinaCandidate:C QinFull Text:PDF
GTID:2178330332998182Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Mining Association Rules is an important issue in the Data Mining field, its success in the field of commercial applications makes it become one of the most mature and important research in data mining. To discover association rules, first need to mining frequent itemsets, and the Tire which is a prefix tree is a appropriate data strcuture to store frequent itemsets. To speed up the efficiency of transaction database access services, and with the continuous development of computer hardware, the capacity of internal memory in a computer is rising, compressing the original database into the memory can be considered. Besides, prune the candidate frequent patterns have always been a important method to improve association rules mining algorithms. Research on association rules mining with Tire from these two aspects will be of very practical significance on the current or future study and has a broad prospect.Firstly, this paper introduces the the basic concepts of association rules and the general classification of association rules mining algorithms, the breadth-first search and the depth-first search. We also introduces the classic algorithms from each classification, includeing Apriori and FP-growth, interprets their data structure and algorithm descriptions with a specific transaction database as an example, their advantages and disadvantages have been analyzed and compared.Secondly, after interpreting the tire data structure and how to store the original transaction database into a two-dimensional compressed array, this paper proposes a association rule mining algorithm based on compressed database and dynamic pruning CDDP, the algorithm refers to the frequent itemsets mining algorithm DF, which used a Tire to store frequent itemsets, but CDDP utilize the the Tire's property to dynamic pruning during Tire's construction,which hugely reduced the number of candidate frequent itemsets. In association rules minging step, this paper proposes a algorithm to find frequent patterns in a Tire. Some details of the algorithm's implementation has also been discussed, the result of the algorithm's complexity analysis suggest that the method has a favorable time and space efficiency in the datasets which contain lots of long patterns.Finally, after introduces IBM Quest Market-Basket Synthetic Data Generator, we select two synthetic datasets and two real world datasets for experiment, DF and Apriori as comparison algorithms. Experimental results show that different algorithms have their own good at data types. Compared with DF, CDDP which use dynamic pruning is more adept at handling datasets with less frequent item and more long-patterns, and DF which use common pruning is more adept at dealing with datasets with more frequent item and less long-patterns. And compared with Apriori, CDDP have a significant superiority in any type of datasets.
Keywords/Search Tags:Association Rules, Frequent Pattern, Database Compressing, Dynamic Pruning, Tire
PDF Full Text Request
Related items