
Research And Improvement Of Weighted Association Rule Mining Algorithm

Posted on: 2014-10-29 | Degree: Master | Type: Thesis
Country: China | Candidate: J Wan | Full Text: PDF
GTID: 2268330401486808 | Subject: Computer application technology
Abstract/Summary:
In an era of rapid development in information technology, enterprise data of all kinds is growing explosively. How to find potentially valuable information accurately and efficiently is a problem of continuing interest. Data mining, as a data analysis tool, searches massive data for knowledge or patterns that are previously unknown, novel, potentially useful, and ultimately understandable.

Association rule mining is one of the main branches of data mining research. The Apriori algorithm, the classic rule mining algorithm, is mainly used to solve association and pattern mining problems in data. However, Apriori and its current improved variants are not entirely reasonable, because they treat all items in the database as equally important. To make the mined rules more reasonable and to meet application requirements, a weight value is assigned to each item in the database. Research on weighted association rule mining algorithms is therefore of considerable significance. Existing algorithms nevertheless have several problems: unreasonable item weight settings, too many database scans, large numbers of candidate itemsets, inefficient join and pruning procedures, and long serial scans of the database, among others.

In this thesis, on the basis of an extensive study of existing weighted algorithms, two improved algorithms are proposed to improve time and space efficiency, and a data set is used in simulation experiments. The main research content and innovations are as follows.

Firstly, a vector-based probability weighted association rule mining algorithm is proposed. The algorithm trades space for time to reduce the number of database scans; sets the probability of each item in the database as its weight value to enhance the importance of high-frequency items; and proposes a new pruning strategy to reduce the number of candidate itemsets. A comparison with other algorithms on a data set shows that the new algorithm performs better.

Secondly, a partition-based weighted association rule mining algorithm is proposed. The algorithm partitions the database and converts it into binary form so that the partitions can be mined in parallel, uses a hash function to reduce the number of candidate 2-itemsets, and groups frequent itemsets by common prefix to reduce the number of join operations. At the end of the algorithm, all local frequent itemsets are merged. The experimental results show that the new algorithm achieves higher performance.

Finally, the research work is summarized, and follow-up research directions and future prospects are proposed.
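
The vector-based, probability-weighted idea in the first algorithm can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the toy transactions, and in particular the weighted-support formula (mean item weight multiplied by ordinary support) are assumptions chosen for illustration, since the abstract does not state the exact definitions or the pruning strategy.

```python
def item_vectors(transactions):
    """Scan the database once and build one bit vector per item.

    Bit t of an item's vector is set when transaction t contains the item.
    Python ints serve as arbitrary-length bit vectors (space traded for time).
    """
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def weighted_support(itemset, vectors, weights, n):
    """Assumed definition: mean item weight times ordinary support."""
    common = (1 << n) - 1  # start with every transaction selected
    for item in itemset:
        common &= vectors[item]  # AND keeps transactions containing all items
    support = bin(common).count("1") / n
    mean_weight = sum(weights[item] for item in itemset) / len(itemset)
    return mean_weight * support


# Hypothetical toy database of four transactions.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
n = len(transactions)
vectors = item_vectors(transactions)

# Each item's weight is its probability of occurring in a transaction.
weights = {item: bin(vec).count("1") / n for item, vec in vectors.items()}

print(weighted_support(("a", "b"), vectors, weights, n))
print(weighted_support(("a", "c"), vectors, weights, n))
```

After the single pass in `item_vectors`, the support of any itemset is obtained by ANDing bit vectors, so no further database scans are needed in this sketch.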
Keywords/Search Tags: data mining, association rule, weighted association rule, hash function, local database