Research on Frequent Pattern Mining Algorithms Based on Compact Pattern Tree and Multiple Minimum Supports

Posted on: 2020-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: E C Wei
Full Text: PDF
GTID: 2428330596479602
Subject: Probability theory and mathematical statistics
Abstract/Summary:
The defining characteristic of the big data era is "data explosion, information shortage", which places higher demands on data analysis and mining. Data mining aims to extract the knowledge and information hidden in massive, disordered data and to summarize the rules implied in it. As the most basic and critical step of the data mining process, frequent pattern mining has long been one of the most active research fields. Many scholars have studied it in depth, but many problems remain to be solved. This thesis improves frequent pattern mining algorithms for both the single-support and the multiple-support settings. The specific research contents and results are as follows:

(1) An Apriori-based frequent pattern mining algorithm, ICP-tree, built on an improved compact pattern tree, is proposed. First, a join preprocessing operation is added before the join step of the Apriori algorithm to limit the number of frequent itemsets participating in the self-join and thereby reduce the number of candidate itemsets generated. Second, the compact pattern tree is extended into a new tree structure, the ECP-tree, which needs only one scan of the database and can handle data streams effectively. Then, these improvements are combined with the APFT algorithm to mine frequent patterns. Finally, the ICP-tree algorithm is compared experimentally with the Apriori algorithm, the FP-growth algorithm, the APFT algorithm, and the algorithm proposed in reference [60] of the thesis on two different types of datasets. The experimental results verify the effectiveness of the ICP-tree algorithm.

(2) An improved frequent pattern mining algorithm with multiple minimum supports, IMISFP-growth, is proposed. First, the items in the transaction database are preprocessed before the tree is constructed: items whose support is less than the minimum item support are deleted, and the multiple item support tree is built from the remaining frequent items. Then, a new method of constructing the multiple item support tree based on intersection rules is proposed. This method no longer generates the tree from items arranged in a specific standard order; instead, it builds the tree by the intersection principle each time a new transaction itemset is input. Finally, the IMISFP-growth algorithm is compared with the CFP-growth++ algorithm on five different datasets. The experimental results show that the improved algorithm outperforms CFP-growth++ in running time, memory consumption, and scalability.
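
The item preprocessing step described in (2) can be illustrated with a minimal Python sketch. It is not the thesis's IMISFP-growth implementation: the function names, the `mis` dictionary of per-item minimum supports, and the use of absolute support counts are all assumptions made for illustration. The sketch also assumes that "minimum item support" means the smallest MIS value over all items. This is a safe pruning bound, because under the usual multiple-minimum-support definition an itemset is frequent when its support reaches the lowest MIS among its own items.

```python
from collections import Counter


def item_supports(transactions):
    """Count, for each item, the number of transactions that contain it."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))  # set() so duplicates in one transaction count once
    return counts


def prune_transactions(transactions, mis):
    """Remove items that cannot appear in any frequent itemset.

    Assumption: an item is dropped when its support count is below the
    smallest MIS value of all items (hypothetical reading of the abstract).
    """
    supports = item_supports(transactions)
    least_mis = min(mis.values())
    keep = {item for item, s in supports.items() if s >= least_mis}
    pruned = [sorted(set(t) & keep) for t in transactions]
    return [t for t in pruned if t]  # discard transactions left empty


def is_frequent(itemset, transactions, mis):
    """Check support(itemset) >= the lowest MIS among the itemset's items."""
    items = set(itemset)
    support = sum(1 for t in transactions if items <= set(t))
    return support >= min(mis[i] for i in items)
```

For example, with the transactions `[a, b, c]`, `[a, b]`, `[a, c, d]`, `[b, c]`, `[a, e]` and MIS counts `a: 3, b: 3, c: 2, d: 2, e: 2`, the items `d` and `e` occur only once each and are pruned before any tree is built. The itemset `{a, c}` is frequent (support 2, lowest MIS 2), while `{a, b}` is not (support 2, lowest MIS 3).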
Keywords/Search Tags:frequent patterns, join preprocessing, ECP-tree, intersection rule, multiple item support tree