
The Research Of Association Rules Algorithm Based On Frequent Pattern Tree

Posted on: 2009-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: H L Wang
GTID: 2178360272480468
Subject: Computer software and theory
Abstract/Summary:
Data mining is an information processing technology that has developed rapidly in recent years. With data mining, people can extract information and knowledge from large amounts of data that are incomplete, noisy, fuzzy and random. This information and knowledge was previously overlooked and unknown, but is potentially useful. Data mining integrates techniques from databases, artificial intelligence, machine learning, pattern recognition, knowledge engineering, object-oriented methods, information retrieval, high-performance computing and visualization.

Association rule mining is an important sub-branch of data mining. Its goal is to find strong rules, that is, rules that satisfy both a minimum support threshold and a minimum confidence threshold. Association rule mining algorithms are the core of this area, and several well-known classical algorithms exist.

This thesis studies in detail the classical Apriori and AprioriTid algorithms and the FP-Growth algorithm, which does not generate candidate frequent patterns, and works through examples of these algorithms to analyze them. FP-Growth scans the database only twice and avoids generating a large number of candidate frequent patterns, so it is more efficient than Apriori. However, its large space overhead is one of the main bottlenecks of FP-Growth.

To save space and improve the efficiency of discovering frequent items, the traditional Frequent Pattern Tree and item header table are optimized. The item header table is constructed from dynamically formed hash chain addresses. Each FP-Tree node stores only its address in the item header table, which avoids null pointers at that address and saves memory. Meanwhile, extra fields are added to each node so that the tree can be traversed conveniently in both directions.
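The two-scan construction and the hashed header table described above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the names `Node`, `build_fp_tree` and `prefix_path` are hypothetical, a Python `dict` stands in for the hash-chained item header table, and the `parent`/`children` fields are one assumed way to allow traversal in both directions.

```python
from collections import Counter


class Node:
    """One FP-tree node (hypothetical layout, not the thesis's)."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent   # upward link, toward the root
        self.children = {}     # downward links, keyed by item


def build_fp_tree(transactions, min_support):
    # Scan 1: count each item once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = Node(None, None)
    # Header table as a hash map: an entry exists only for items that
    # actually occur in the tree, so there are no empty slots.
    header = {}

    # Scan 2: insert the frequent items of each transaction,
    # most frequent first (ties broken by item name).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header, frequent


def prefix_path(node):
    """Walk the parent links upward and return the path from the root."""
    path = []
    node = node.parent
    while node is not None and node.item is not None:
        path.append(node.item)
        node = node.parent
    return path[::-1]
```

For example, `build_fp_tree([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]], 2)` drops the infrequent item `d` after the first scan and builds the tree during the second.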
In addition, the transaction dataset is divided into many subsets according to certain rules, and frequent itemsets are then mined in each subset. This resolves the problem of the Frequent Pattern Tree not fitting in memory, so that mining can proceed successfully. Finally, experiments show that the optimized association rule mining algorithm based on the Frequent Pattern Tree is more efficient than the traditional FP-Growth algorithm when mining large datasets.
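The abstract does not state the rule used to divide the dataset, so the sketch below shows only the general divide-then-mine idea, using a generic partition scheme as an assumption. Each chunk is mined separately, the locally frequent itemsets become candidates, and a final pass over all transactions keeps those that are frequent overall. For brevity the local miner is a brute-force enumeration capped at `max_size` items, not an FP-Tree. The names `local_frequent` and `partitioned_frequent` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent(chunk, min_frac, max_size):
    """Brute-force local miner (a stand-in for mining an FP-tree per chunk)."""
    counts = Counter()
    for t in chunk:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {s for s, c in counts.items() if c >= min_frac * len(chunk)}


def partitioned_frequent(transactions, min_frac, n_parts, max_size=3):
    if not transactions:
        return {}
    # Assumed division rule: equal-sized consecutive chunks.
    size = -(-len(transactions) // n_parts)  # ceiling division
    chunks = [transactions[i:i + size]
              for i in range(0, len(transactions), size)]

    # An itemset that is frequent overall must be frequent in at least
    # one chunk, so the union of local results is a safe candidate set.
    candidates = set()
    for chunk in chunks:
        candidates |= local_frequent(chunk, min_frac, max_size)

    # Final pass: count each candidate over the whole dataset.
    sets = [set(t) for t in transactions]
    result = {}
    for cand in candidates:
        count = sum(1 for s in sets if s.issuperset(cand))
        if count >= min_frac * len(transactions):
            result[cand] = count
    return result
```

Only one chunk has to be held in memory while it is mined, which is the property the abstract relies on. The price in this sketch is the extra counting pass at the end.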
Keywords/Search Tags:Data Mining, Association Rule, Frequent Pattern Tree, Frequent Pattern Growth