Research on Top-K Frequent Itemsets Data Mining Algorithm

Posted on: 2016-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Cui
Full Text: PDF
GTID: 2308330464967970
Subject: Computer software and theory
Abstract/Summary:
Data mining is the study of theories and methods for discovering useful knowledge in large volumes of data, and it is one of the frontier research directions in the fields of databases and information-based decision making. Association rule mining is one of the earlier and more meaningful research topics in data mining. In the process of mining association rules, frequent itemset mining is both the foundation and the core of the whole process, so how to mine frequent itemsets efficiently and effectively has always been a research focus. In practical applications, however, the number of frequent itemsets is huge and the data volume is large, which impedes the wide application of frequent itemsets. How to optimize frequent itemset mining algorithms and how to compress frequent itemsets have therefore become important directions of current research.

This paper first introduces the background of data mining and the current state of research at home and abroad, then briefly introduces the basics of frequent itemset and association rule mining. It also briefly analyzes frequent itemset compression techniques and compares the commonly used compression methods. Finally, the paper proposes NFIMG, a Top-K frequent itemset mining algorithm based on a greedy strategy, and NCFIMG, an algorithm derived from NFIMG that uses the closure property to prune nodes when mining Top-K closed frequent itemsets.

(1) This paper presents the NFIMG algorithm. The algorithm is based on a greedy strategy for generating frequent itemsets by joining, removes the need for a manually specified minimum support, scans the database only once, and mines frequent itemsets with a generative Top-K method.

(2) This paper presents the NCFIMG algorithm for mining Top-K frequent closed itemsets. The algorithm builds on the properties of the NFIMG algorithm and, using the properties of closed itemsets, prunes during the mining process on the basis of the "closed node" lemma. The paper then proves the correctness of the algorithm theoretically, and experimental results show its advantages in time and space. The algorithm is also conceptually simple and easy to implement, and it converts the mining type of the NFIMG algorithm within the mining process itself.

The paper carries out extensive comparative experiments on the proposed algorithms. The tests were run on multiple data sets from the UCI machine learning repository and on a data set produced by the IBM data generator. The experimental results show that the Apriori and NApriori algorithms are slightly inferior to the NFIMG algorithm proposed in this paper in both space complexity and time complexity. The improved NCFIMG algorithm also shows a very clear advantage over the TFP algorithm in mining efficiency and storage space, and the results show that NCFIMG is more efficient when mining long itemsets. These research results provide an effective way to address the problems of frequent itemsets in practical applications.
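The two mining tasks named in the abstract can be illustrated with a small brute-force sketch. This is not the NFIMG or NCFIMG algorithm, whose details the abstract does not give; it only shows what "Top-K frequent itemsets without a minimum support threshold" and "closed frequent itemsets" mean on a toy data set. The function names (`count_itemsets`, `top_k_frequent`, `closed_only`) and the tie-breaking order are assumptions made for this illustration, and the enumeration is exponential in transaction length, so it is only suitable for tiny inputs.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions):
    """Count the support of every itemset in one pass over the transactions.

    Brute force: enumerates all non-empty subsets of each transaction.
    """
    support = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                support[itemset] += 1
    return support


def top_k_frequent(support, k):
    """Return the k itemsets with the highest support; no minimum support is needed.

    Ties are broken by shorter itemset first, then lexicographically
    (an arbitrary choice made here for determinism).
    """
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], len(kv[0]), kv[0]))
    return ranked[:k]


def closed_only(support):
    """Keep only closed itemsets: those with no proper superset of equal support."""
    closed = {}
    for itemset, count in support.items():
        items = set(itemset)
        has_equal_superset = any(
            other_count == count
            and len(other) > len(itemset)
            and items.issubset(other)
            for other, other_count in support.items()
        )
        if not has_equal_superset:
            closed[itemset] = count
    return closed


# Hypothetical toy data set.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "b", "d"], ["c"]]
support = count_itemsets(transactions)
print(top_k_frequent(support, 3))               # Top-3 frequent itemsets
print(top_k_frequent(closed_only(support), 2))  # Top-2 closed frequent itemsets
```

On this toy input the itemsets `('a',)` and `('b',)` are frequent but not closed, because `('a', 'b')` has the same support. This is why restricting the result to closed itemsets yields a more compact answer.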
Keywords/Search Tags: Association rules, Frequent itemsets, Top-K frequent itemsets, Closed frequent itemsets