The Study Of Maximum Frequent Itemsets Algorithm Based On Frequent Pattern Tree

Posted on: 2017-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Yin
GTID: 2308330503982187
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, data mining has attracted increasing attention. Association rule mining, which aims to discover relationships among data items and other interesting patterns, has become a hot topic in the field. As an important part of association rule research, mining maximum frequent itemsets matters for two reasons: the maximum frequent itemsets implicitly cover all frequent itemsets, and in some data mining applications they are the only patterns that need to be mined. This thesis studies maximum frequent itemsets mining algorithms from three aspects: reducing the dimension of the candidate sets, the superset checking method, and the incremental update algorithm.

First, the Discover Maximum Frequent Itemsets Algorithm (DMFIA) has two problems: the initial candidate sets have a high dimension, and mining shorter maximum frequent itemsets is inefficient. To address them, this thesis proposes an improved maximum frequent itemsets mining algorithm, FP-EMFIA, which is based on the Frequent Pattern Tree (FP-Tree). The algorithm adopts a bidirectional search strategy that combines top-down and bottom-up search, and it analyzes the counts of items in the conditional pattern base to identify items that must be included in, or must be excluded from, the maximum frequent itemsets, thereby reducing the dimension. In addition, the algorithm uses the shorter infrequent itemsets it has already mined to prune the candidate sets, which improves its efficiency.

Second, to address the excessive number of superset checks, this thesis proposes a Superset Checking Algorithm Based on Index List (IL-SC). The algorithm stores the maximum frequent itemsets in an index list-table structure so that they are kept in order, which avoids unnecessary checking operations and improves the efficiency of superset checking.

Finally, to make full use of the results of the initial mining, this thesis proposes the Update Maximum Frequent Itemsets Algorithm Based on FP-EMFIA (FP-EUMFIA), which reduces the dimension of the initial candidate sets and therefore achieves higher efficiency.
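
To make the two central notions concrete, the following is a minimal Python sketch of what a maximum (often called maximal) frequent itemset is and of why an indexed store can cut down superset checks. It is not FP-EMFIA or IL-SC: exhaustive top-down enumeration stands in for FP-Tree mining, and the length-bucketed `MaximalStore` is only one assumed way an index could limit comparisons. All names (`MaximalStore`, `support`, `maximal_frequent_itemsets`) and the sample transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


class MaximalStore:
    """Hypothetical store of maximal itemsets, bucketed by itemset length."""

    def __init__(self):
        self.by_length = defaultdict(list)  # length -> list of frozensets

    def has_superset(self, itemset):
        # Only buckets whose itemsets are at least as long as the query
        # can contain a superset, so shorter buckets are never scanned.
        return any(
            itemset <= stored
            for length, bucket in self.by_length.items()
            if length >= len(itemset)
            for stored in bucket
        )

    def add(self, itemset):
        self.by_length[len(itemset)].append(itemset)

    def all(self):
        return {s for bucket in self.by_length.values() for s in bucket}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force top-down search (exponential; for illustration only)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    store = MaximalStore()
    # Longest candidates first: a frequent candidate with no stored
    # superset cannot have any frequent superset, so it is maximal.
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if store.has_superset(candidate):
                continue  # covered by an already found maximal itemset
            if support(candidate, transactions) >= min_support:
                store.add(candidate)
    return store.all()


transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"b", "d"},
    {"b", "d"},
    {"a", "d"},
]
result = maximal_frequent_itemsets(transactions, min_support=2)
print(sorted(sorted(s) for s in result))  # [['a', 'b', 'c'], ['b', 'd']]
```

With a minimum support of 2, every frequent itemset in this toy data (for example `{a, b}` or `{d}`) is a subset of `{a, b, c}` or `{b, d}`, which is the sense in which the maximum frequent itemsets cover all frequent itemsets.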
Keywords/Search Tags: frequent pattern tree, maximum frequent itemsets, superset checking, incremental updating