Association Rules Incremental Updating Research And Application Based On MapReduce

Posted on:2015-09-09

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Zheng

Full Text:PDF

GTID:2298330467987023

Subject:Computer application technology

Abstract/Summary:

With the rapid development of Internet technology, Web information is growing explosively, and how to obtain useful information efficiently has become a popular question. Association rule mining is an important research topic in the data mining field. However, traditional association rule algorithms can no longer keep up with the growth of the data to be mined. Real-time, efficient updating of association rules is therefore significant for trend analysis, decision-making, and information recommendation. Most existing incremental updating methods evolved from the traditional association rule algorithms. They can address problems such as distributed computing and parallel I/O communication, but they still do not remove the redundant computation that stands in the way of efficient processing. Motivated by this, we propose a parallel association rule algorithm. As the data grows rapidly, the proposed algorithm integrates the new data items with the existing data items and uses parallel computation to trade off memory consumption against computation cost, with the aim of removing redundant computation. The work of this dissertation mainly focuses on the following aspects:

(1) We propose an incremental association rule updating algorithm based on MapReduce. To avoid huge memory overheads during storage and reading, the proposed algorithm stores the patterns of frequent subtrees in a hash map. The algorithm also exploits the parallel computing ability of MapReduce: it removes redundant computation and generates independent data for parallel computation, so that association rules can be updated efficiently. We describe the parallel design of the algorithm and its implementation in Hadoop.

(2) We give the design of our Web log data mining system in detail. Built on the Hadoop platform, our system provides the following functions: data collection, data update, data preprocessing, data statistics and analysis, incremental updating of association rules, and display of the results. We also propose a parallel preprocessing algorithm for Web data in the prototype system. This algorithm reduces the number of data scans needed when updating association rules. Finally, our system shows that the MapReduce-based incremental association rule updating algorithm is more effective in data processing and application...
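
The general idea of incremental updating in a map/reduce style can be illustrated with a minimal sketch. This is not the algorithm of the thesis: it counts flat itemsets rather than frequent subtrees, runs in plain Python rather than on Hadoop, and all names (map_transaction, reduce_counts, incremental_update, frequent_itemsets) and the max_len cap are hypothetical. It assumes that support counts for every itemset up to max_len items are kept in a hash map, so that a new batch of transactions can be counted on its own and merged without rescanning the old data, trading memory for computation.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_len=2):
    """Map step: emit (itemset, 1) for every itemset of up to max_len items."""
    items = sorted(set(transaction))
    for size in range(1, max_len + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def incremental_update(stored_counts, stored_size, new_transactions, max_len=2):
    """Count only the new batch, then merge it into the stored hash map.

    Assumption: stored_counts holds counts for ALL itemsets seen so far
    (not only the frequent ones), so old transactions are never rescanned.
    """
    pairs = (p for t in new_transactions for p in map_transaction(t, max_len))
    delta = reduce_counts(pairs)
    return stored_counts + delta, stored_size + len(new_transactions)


def frequent_itemsets(counts, size, min_support):
    """Keep itemsets whose count reaches min_support * total transactions."""
    threshold = min_support * size
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# Toy usage: an initial batch followed by an increment.
counts, size = incremental_update(Counter(), 0, [["a", "b"], ["a", "c"], ["a", "b", "c"]])
counts, size = incremental_update(counts, size, [["b", "c"], ["a", "b"]])
print(frequent_itemsets(counts, size, min_support=0.5))
```

With this toy data, the second call touches only the two new transactions, and four itemsets (a, b, c, and the pair a-b) reach the 50% support threshold over all five transactions.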

Keywords/Search Tags:

Association rules, Incremental update, Map/Reduce Pattern, Web log data mining

Related items

1	The Research And Application Of Association Rules Incremental Mining Algorithm
2	Research On Incremental Updating Association Rules Mining Based On Apriori Algorithm
3	Research On Apriori Algorithm Optimization Based On Binary Code And Incremental Update
4	The Research Of Incremental Update Association Rule Mining Methods
5	Implementation And Optimization Of APP Association Analysis Based On Mobile Access Traffic
6	Research And Improvement Of Algorithm For Incremental Updating Association Rule In Retail Business Intelligence System
7	Research On Algorithm Of Mining Association Rules Based On Matrix
8	Research And Application On Association Rules Mining Method
9	Research And Implementation Of Incremental Association Rules Based On Spark For Smart Phone Viruses Mining
10	Research On Algorithm Of Mining Association Rules Based On FP Tree