Research On Association Rule Algorithm Mining

Posted on: 2016-07-23  Degree: Master  Type: Thesis
Country: China  Candidate: C Y Duan  Full Text: PDF
GTID: 2308330461460919  Subject: Computer Science and Technology
Abstract/Summary:
Association rule mining is an important research field in data mining. With the rapid development of information technology and Internet of Things technology, application data are growing rapidly and becoming more comprehensive in content. At the same time, industries are paying more attention to data mining and analysis, which makes the efficiency of association rule mining a particularly prominent problem. Industries also want to mine and analyze newly arriving data in a timely manner, which makes research on data stream mining increasingly important. To address the low efficiency of frequent pattern mining on both static data and data streams, this thesis carries out a series of analyses and studies centered on association rules.

First, it briefly outlines the basics of data mining and association rule mining, including the definition, classification, and common techniques of data mining and the data mining process. Then it analyzes in detail the Apriori and FP-Growth algorithms for association rule mining. Finally, building on an analysis of typical mining algorithms and recent research results, it proposes improved algorithms for mining frequent patterns:

(1) For mining static data, an improved algorithm, VMOApriori, based on a weighted matrix is proposed. To address the heavy I/O load, slow computation, and redundant itemsets of the Apriori algorithm, the algorithm scans the database to generate a transaction matrix and computes the frequent itemsets through operations on the weighted matrix and vectors; it also compresses the matrix to reduce redundant candidate sets. Simulation results show that the algorithm reduces the I/O load and the volume of intermediate results and improves mining efficiency.

(2) For mining data streams, a parallel frequent pattern mining algorithm based on distributed sliding windows is proposed. To address the low efficiency of frequent pattern mining on data streams, the algorithm builds on the Hadoop cloud computing platform and applies sliding windows to the Map/Reduce model. It constructs and mines a TPT-Tree on distributed nodes to reduce the mining time within sliding windows, processes intermediate candidate sets with a hash structure to speed up the mining of frequent itemsets, and requires only a single Map/Reduce pass, making full use of the platform's large storage and computing capacity. Finally, experiments on multiple datasets validate the method and show that it is effective and achieves a satisfactory speedup.
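The matrix-based idea in item (1) can be illustrated with a small sketch. This is not the VMOApriori algorithm itself, whose details the abstract does not give. It assumes, purely for illustration, that the "weight" of a matrix row is the number of identical transactions merged into it, so the database is scanned once and support counts come from the matrix rather than from repeated scans. The function names `build_weighted_matrix`, `support`, and `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_weighted_matrix(transactions):
    """Scan the data once; merge identical transactions into one row.

    Assumption (not from the thesis): a row's weight is its multiplicity.
    """
    counts = Counter(frozenset(t) for t in transactions)
    rows = list(counts)
    weights = [counts[r] for r in rows]
    items = sorted({i for r in rows for i in r})
    # columns[item][r] is 1 when distinct row r contains the item, else 0
    columns = {item: [1 if item in r else 0 for r in rows] for item in items}
    return columns, weights


def support(itemset, columns, weights):
    """Weighted number of rows whose entries are 1 for every item in itemset."""
    return sum(
        w for idx, w in enumerate(weights)
        if all(columns[i][idx] for i in itemset)
    )


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search that only reads the matrix."""
    columns, weights = build_weighted_matrix(transactions)
    level = {}
    for item in columns:
        s = support((item,), columns, weights)
        if s >= min_support:
            level[(item,)] = s
    result = dict(level)
    k = 2
    while level:
        # Only items that survive in some frequent (k-1)-itemset are kept
        items = sorted({i for itemset in level for i in itemset})
        next_level = {}
        for cand in combinations(items, k):
            # Apriori pruning: every (k-1)-subset must already be frequent
            if all(sub in level for sub in combinations(cand, k - 1)):
                s = support(cand, columns, weights)
                if s >= min_support:
                    next_level[cand] = s
        result.update(next_level)
        level = next_level
        k += 1
    return result


if __name__ == "__main__":
    tx = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
    for itemset, count in sorted(frequent_itemsets(tx, min_support=2).items()):
        print(itemset, count)
```

Merging duplicate transactions is only one possible reading of "compressing the matrix"; the thesis may define the weights and the compression step differently.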
Keywords/Search Tags: Association Rule, Frequent Itemsets, VMOApriori Algorithm, Sliding Window, Map/Reduce Model