Research On Algorithm For Incremental Updating Association Mining Based On Inverted Index

Posted on: 2017-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Xu
Full Text: PDF
GTID: 2348330491957528
Subject: Computer Science and Technology
Abstract/Summary:
Association rule mining discovers potentially valuable relationships among items in massive data sets, supporting business decisions that increase enterprise profit. The rapid development and widespread application of computer science and technology, including mobile Internet, artificial intelligence, information processing, machine learning, and networking, has led to explosive growth in many kinds of data. Many data mining techniques have recently been proposed to extract valuable information from massive data sets. Incremental updating association rule mining is a dynamic association rule mining method. It addresses the problem of finding potentially valuable association rules among data items when the transaction records in a dynamic database are continually updated over time and the minimum support and minimum confidence thresholds change frequently to meet the needs of different users. Existing improved incremental updating association rule mining algorithms have the following shortcomings: first, they scan the original transaction database frequently, generate a large number of useless candidate itemsets, and use set join operations to compute frequent itemsets; second, the new association rules they produce may not meet users' requirements; third, little research addresses the maintenance of association rules when many transactions are added to the original transaction database and the minimum support and minimum confidence thresholds change at the same time.
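The two thresholds named above can be made concrete with a small example. The following is a minimal sketch of the standard definitions of support and confidence, not code from the thesis; the function names and the toy transactions are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(a, transactions)
    if support_a == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(both, transactions) / support_a


# Hypothetical toy database of four transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

bread_milk_support = support({"bread", "milk"}, transactions)
bread_to_milk_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A rule such as bread -> milk is reported only if its support and confidence both reach the user's minimum thresholds, which is why a change in either threshold can change the rule set even when the data stays the same.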
To solve these problems, this thesis proposes UP-IITree, an efficient incremental updating association mining algorithm that combines an inverted index with a tree structure. Without rescanning the original transaction database DB and without generating candidate itemsets, it uses set intersection (AND) operations to compute all frequent itemsets after the original transaction database has been updated. Experimental results show that the algorithm occupies less memory, finds frequent itemsets efficiently, and better addresses the shortcomings of existing incremental updating association mining algorithms.

In a big data environment, many transaction data sets are added to the original database, and the specified minimum support and minimum confidence thresholds change to meet the different needs of users. Updating and maintaining association rules in a timely manner has long been a goal. Building on UP-IITree, this thesis further proposes UP-IIMR, a parallel incremental updating association mining algorithm that combines inverted index technology with the MapReduce parallel programming model. The algorithm uses MapReduce on the Hadoop platform to apply the inverted index in parallel, so that association rules can be maintained efficiently and promptly when a large volume of new data is added and the minimum support and minimum confidence thresholds change at the same time. Experiments on real data show that UP-IIMR greatly improves the efficiency of mining association rules, substantially reduces the required memory, and effectively addresses the difficulty of maintaining incrementally updated association rules in a big data environment.
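The general idea of an inverted index for incremental mining can be illustrated as follows. This is a minimal sketch of a generic item-to-transaction-ID index, not the UP-IITree or UP-IIMR algorithm, whose tree structure and MapReduce jobs are not described in this abstract; the class and method names are hypothetical. It shows why appending new transactions does not require rescanning the original database, and why a changed minimum support threshold can be answered from the index alone.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class InvertedIndex:
    """Maps each item to the set of IDs of the transactions that contain it."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[int]] = defaultdict(set)
        self.n_transactions = 0

    def add_transactions(self, transactions: Iterable[Iterable[str]]) -> None:
        """Index new transactions; previously indexed ones are not revisited."""
        for items in transactions:
            tid = self.n_transactions
            for item in items:
                self.postings[item].add(tid)
            self.n_transactions += 1

    def support_count(self, itemset: Iterable[str]) -> int:
        """Number of transactions containing all items, via set intersection."""
        tid_sets = [self.postings.get(item, set()) for item in itemset]
        if not tid_sets:
            return self.n_transactions  # the empty itemset is in every transaction
        tid_sets.sort(key=len)  # start from the smallest set to keep work small
        return len(set.intersection(*tid_sets))

    def is_frequent(self, itemset: Iterable[str], min_support: float) -> bool:
        """True if the itemset's relative support reaches `min_support`."""
        return self.support_count(itemset) >= min_support * self.n_transactions


# Hypothetical original database DB, followed by an increment.
index = InvertedIndex()
index.add_transactions([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}])
count_before = index.support_count({"a", "b"})

index.add_transactions([{"a", "b"}, {"b", "c"}])  # only the new records are read
count_after = index.support_count({"a", "b"})
```

After the increment, the same index answers queries under any threshold: `index.is_frequent({"a", "b"}, 0.5)` and `index.is_frequent({"a", "b"}, 0.7)` both reuse the stored transaction-ID sets rather than the raw records.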
Keywords/Search Tags: Inverted Index, MapReduce, Incremental Updating Mining, Frequent Item Sets, Association Rules