Research Of Parallel Association Rules Algorithm Based On Hadoop

Posted on:2016-10-07

Degree:Master

Type:Thesis

Country:China

Candidate:Y Bi

Full Text:PDF

GTID:2308330473964427

Subject:Computer application technology

Abstract/Summary:

In the era of Big Data, data has become a significant asset of enterprises and public organizations, and it is changing both the concept of enterprise assets and the course of enterprise development. Association rule mining, an important research area and technique of data mining, can discover characteristics and interdependent relationships in large-scale data. By extracting previously unknown and useful information from these intangible data assets, enterprises can obtain tangible benefits and even adjust their development strategies and business models. When traditional association rule algorithms are applied to large databases, however, problems such as excessive I/O operations and heavy computation arise. With the maturing of Hadoop as a cloud computing platform, combining association rule algorithms with distributed computing frameworks is a growing trend.

Building on the basic concepts of association rules and the classic algorithms, this thesis improves an existing serial association rule algorithm by introducing the concept of a frequency set tree and modifying how the matrix is used; the new algorithm is named R-SLI. In addition, using a direct parallelization strategy, this thesis designs the P-MT algorithm, which runs R-SLI in parallel on the MapReduce framework. Finally, the algorithm is implemented and its performance is examined on different experimental datasets and under different threshold values. Analysis of the experimental results shows that the algorithm achieves higher performance.
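As a rough illustration of what running itemset counting on a MapReduce-style framework can look like, the sketch below counts k-item sets over partitioned transactions in plain Python. It is not the thesis's R-SLI or P-MT algorithm, whose details the abstract does not give: the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the sample transactions, and the absolute-count support threshold are all assumptions made for illustration, and no frequency set tree or matrix is used.

```python
from collections import Counter
from itertools import combinations


def map_phase(partition, k):
    """Emit (itemset, 1) for every k-item subset of each transaction in one partition."""
    for transaction in partition:
        # Sort and deduplicate so the same itemset always yields the same key.
        for itemset in combinations(sorted(set(transaction)), k):
            yield itemset, 1


def reduce_phase(pairs):
    """Sum the emitted counts per itemset key."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(partitions, k, min_support):
    """Keep the k-item sets whose total count reaches min_support (an absolute count)."""
    # On a real cluster each partition would be mapped on a separate node;
    # here the partitions are simply processed one after another.
    pairs = (pair for part in partitions for pair in map_phase(part, k))
    counts = reduce_phase(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical transactions split into two partitions.
partitions = [
    [["bread", "milk"], ["bread", "beer", "milk"]],
    [["milk", "beer"], ["bread", "milk", "beer"]],
]
result = frequent_itemsets(partitions, k=2, min_support=3)
print(result)
```

Each partition is mapped independently and only the per-itemset counts are merged, which is the property that lets the counting step be distributed; a practical implementation would also prune candidate itemsets rather than enumerate every k-subset.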

Keywords/Search Tags:

Data Mining, Association Rules, MapReduce, Matrix, Frequency Tree Set

Related items

1	Research And Application On The Technologies In Mining Association Rules
2	The Research Of Parallel Association Rules Mining Algorithms Based On Cloud Platform
3	The Research On The Algorithms Of Mining Association Rules
4	Research On Association Rules Mining For Marine Environmental Data Using MapReduce
5	Research And Application Of Multidimensional Data Constructing And Association Rules Mining Algorithm Based On Mapreduce
6	The Research And Implementation Of Algorithm For Mining Association Rules Based On BigData
7	Algorithms For Fuzzy And Objective-Oriented Association Rules Mining Based On FP-tree
8	Design About Association Rules Mining Based On Items Clustering And Transaction Tree
9	Research And Application Of Association Rules Mining Based On FP-tree
10	Research And Application Of Association Rules Mining Algorithm Based On MapReduce