Research On Association Rule Algorithm Mining

Posted on:2016-07-23

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Duan

Full Text:PDF

GTID:2308330461460919

Subject:Computer Science and Technology

Abstract/Summary:

Association rule mining is an important research field in data mining. With the rapid development of information technology and Internet of Things technology, application data is growing rapidly and its content is becoming more comprehensive. At the same time, various industries are paying more attention to data mining analysis, which makes the efficiency of association rule mining an especially pressing problem. Industries also want to mine and analyze newly arriving data in a timely manner, which makes research on data stream mining increasingly important.

To address the low efficiency of frequent pattern mining on both static data and data streams, this thesis carries out a series of analyses and studies centered on association rules. First, it briefly outlines the basics of data mining and association rule mining, including the definition, classification, and common techniques of data mining and the data mining process. It then analyzes in detail the Apriori and FP-Growth algorithms for association rule mining. Finally, building on an examination of typical mining algorithms and the latest research results, it proposes two improved algorithms for mining frequent patterns:

(1) For static data, an improved algorithm, VMOApriori, based on a weighted matrix is proposed. To address the heavy I/O load, slow computation, and redundant itemsets of the Apriori algorithm, the algorithm scans the database to generate a transaction matrix, computes frequent itemsets through operations on the weighted matrix and vectors, and compresses the matrix to reduce redundant candidate sets. Simulation results show that the algorithm reduces the I/O load and the volume of intermediate results and improves mining efficiency.

(2) For data streams, a parallel frequent pattern mining algorithm based on distributed sliding windows is proposed. To address the low efficiency of frequent pattern mining over data streams, the algorithm builds on the Hadoop cloud computing platform and applies sliding windows to the Map/Reduce model. It constructs and mines a TPT-Tree on the distributed nodes to reduce the mining time within sliding windows, processes intermediate candidate sets with a hash structure to speed up the mining of frequent itemsets, and requires only a single Map/Reduce pass, which makes full use of the platform's large storage and computing capacity. Experiments on multiple datasets validate the method and show that it is effective and achieves a satisfactory speedup.
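
The matrix-based idea in (1) can be illustrated with a small sketch. This is not the thesis's VMOApriori algorithm, whose details the abstract does not give; it is a generic weighted-matrix variant of Apriori written under stated assumptions. The function names (build_weighted_matrix, frequent_itemsets), the use of duplicate-transaction counts as weights, and the row-dropping rule used for compression are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_weighted_matrix(transactions):
    """Scan the data once and return (items, rows, weights).

    Assumption: each distinct transaction becomes one 0/1 row, and its
    weight is the number of times it occurred in the input.
    """
    counts = Counter(frozenset(t) for t in transactions)
    items = sorted({i for t in counts for i in t})
    rows = [[1 if i in t else 0 for i in items] for t in counts]
    weights = list(counts.values())
    return items, rows, weights


def support(rows, weights, cols):
    """Sum the weights of the rows that have a 1 in every column of cols."""
    return sum(w for r, w in zip(rows, weights) if all(r[c] for c in cols))


def frequent_itemsets(transactions, min_count):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    items, rows, weights = build_weighted_matrix(transactions)
    current = {}
    for c in range(len(items)):
        s = support(rows, weights, (c,))
        if s >= min_count:
            current[(c,)] = s
    result = {}
    k = 1
    while current:
        for cols, s in current.items():
            result[tuple(items[c] for c in cols)] = s
        # Compress the matrix: a row with fewer than k + 1 ones in the
        # still-frequent columns cannot support any (k + 1)-itemset.
        alive = sorted({c for cols in current for c in cols})
        kept = [(r, w) for r, w in zip(rows, weights)
                if sum(r[c] for c in alive) >= k + 1]
        rows = [r for r, _ in kept]
        weights = [w for _, w in kept]
        # Apriori join and prune, then count support from the matrix only.
        nxt = {}
        for a in current:
            for b in current:
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    cand = a + (b[-1],)
                    if all(sub in current for sub in combinations(cand, k)):
                        s = support(rows, weights, cand)
                        if s >= min_count:
                            nxt[cand] = s
        current = nxt
        k += 1
    return result


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "b"],
            ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in sorted(frequent_itemsets(data, 3).items()):
        print(itemset, count)
```

In this sketch the database is read only once, when the matrix is built; every later support count is computed from the in-memory matrix and weight vector, and the matrix shrinks as rows that can no longer contribute are dropped.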

Keywords/Search Tags:

Association Rule, Frequent Itemsets, VMOApriori Algorithm, Sliding Window, Map/Reduce Model

Related items

1	Frequent Itemsets Mining Algorithm And Its Application In Data Flow
2	Research On Frequent Patterns Mining Algorithm Based Sliding Window In Data Streams
3	Research On Optimization Of Data Stream Frequent Itemsets Mining Algorithm Based On Sliding Window
4	An Algorithm And Context Analysis Of Mining Frequent Closet Itemsets
5	Research On Multi-stream Frequent Item Set Mining Algorithm
6	Research On He Algorithm About Mining Association Rule
7	FP-Tree Based Mining Frequent Itemsets Over Data Streams
8	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
9	The Research And Implementation Of Mining Frequent Itemsets Algorithm Over Streaming Data
10	The Association Rule Mining Algorithm Design And Implementation