
Research On Fast Algorithms For Frequent Itemsets Mining Based On Compressed FP-tree

Posted on: 2016-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wu
Full Text: PDF
GTID: 2298330467977393
Subject: Control Science and Engineering
Abstract/Summary:
The phenomenon of "data explosion, information poverty" in the Internet age has placed new demands on data analysis. The purpose of data mining is to extract the information hidden beneath large volumes of seemingly chaotic data and to summarize the inherent laws of the object under study. Frequent itemset mining, the most basic and crucial part of association rule mining, has been a research hotspot in recent years.

Considering the problems of candidate generate-and-test algorithms and pattern-growth algorithms, a modified search algorithm named MCFP-tree is proposed. MCFP-tree introduces a more compact data structure and combines it with the candidate generation mechanism of the Apriori algorithm. Counting the support of candidates by searching the local compressed frequent pattern tree improves the efficiency of mining frequent itemsets. Experimental results show that the MCFP-tree algorithm can quickly find the relatively short frequent itemsets in a database.

A comparative analysis is performed between the CT-PRO algorithm and the MCFP-tree algorithm, both of which are built on the compressed frequent pattern tree. Because different search algorithms show varying run-time performance when mining the same compressed frequent pattern tree, a complexity evaluation criterion for compressed frequent pattern trees is then proposed. According to the explicit definition of complexity, a compressed frequent pattern tree can be classified as a simple, a moderately complex, or a complex tree structure.

Based on this complexity criterion, an improved search algorithm is put forward: the MCFP-tree algorithm, the CT-PRO algorithm, and the child-tree mining method are adopted for simple, moderately complex, and complex tree structures, respectively. Experiments show that choosing a search algorithm targeted at the given tree structure can significantly improve the efficiency of mining frequent itemsets.
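
The general idea of pairing Apriori-style candidate generation with support counting on a prefix tree can be illustrated with a minimal Python sketch. It is not the MCFP-tree algorithm from the thesis: the abstract does not specify the compressed tree layout or the "local" tree search, so a plain FP-tree-like prefix tree stands in for them. All names (`Node`, `build_tree`, `support`, `mine_frequent_itemsets`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


class Node:
    """Prefix-tree node: count = number of transactions passing through it."""

    def __init__(self):
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Insert each transaction, restricted to frequent items and sorted by
    descending item frequency, into a shared prefix tree (assumed layout)."""
    freq = Counter(item for t in transactions for item in set(t))
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    order = {item: rank for rank, (item, c) in enumerate(ranked)
             if c >= min_support}
    root = Node()
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in order), key=order.get):
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root, order


def support(node, itemset, order):
    """Count transactions containing `itemset` (a rank-sorted tuple) by
    walking the tree instead of rescanning the database."""
    if not itemset:
        return node.count
    target = order[itemset[0]]
    total = 0
    for item, child in node.children.items():
        rank = order[item]
        if rank == target:
            total += support(child, itemset[1:], order)
        elif rank < target:
            total += support(child, itemset, order)
        # rank > target: the wanted item cannot occur deeper on this path
    return total


def mine_frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search with tree-based support counting."""
    root, order = build_tree(transactions, min_support)
    frequent = {(i,): support(root, (i,), order) for i in order}
    level = list(frequent)
    k = 2
    while level:
        prev = set(level)
        candidates = set()
        for a in level:
            for b in level:
                # join two (k-1)-itemsets that share their first k-2 items
                if a[:-1] == b[:-1] and order[a[-1]] < order[b[-1]]:
                    cand = a + (b[-1],)
                    # Apriori pruning: every (k-1)-subset must be frequent
                    if all(s in prev for s in combinations(cand, k - 1)):
                        candidates.add(cand)
        level = []
        for cand in candidates:
            s = support(root, cand, order)
            if s >= min_support:
                frequent[cand] = s
                level.append(cand)
        k += 1
    return frequent
```

Calling `mine_frequent_itemsets(transactions, min_support)` with a list of item lists and an absolute support threshold returns a dictionary that maps each frequent itemset (as a tuple) to its support count. In this sketch the database is scanned only while building the tree; each later support query touches just the tree paths that can still contain the candidate.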
Keywords/Search Tags: Data Mining, Association Rules Mining, Frequent Itemsets, MCFP-tree, Complexity Criterion