
Research on Association Rules Mining Based on Vertical Data

Posted on: 2010-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Yang
Full Text: PDF
GTID: 2178360278475678
Subject: Computer application technology
Abstract/Summary:
Data mining is a method for extracting previously unknown and potentially useful patterns from large volumes of data. Association rule mining is an important topic within data mining and a very active research field that has developed rapidly in recent years. It is applied to discover interesting connections between different items or attributes. As the volume of collected and stored data grows rapidly, more and more people are interested in obtaining association rules from their databases. To meet users' varied demands, a series of studies has been carried out on improving the performance and functionality of association rule mining algorithms.

This paper gives an overview of association rule mining and analyzes several popular algorithms: Apriori, DHP, and FP-growth, which are based on horizontal data, and Eclat and Diffset, which are based on vertical data. This analysis prepares the ground for an association rule mining algorithm with better performance. The paper then proposes ADFAR, a depth-first association rule mining algorithm for vertical data based on an incidence matrix. The algorithm describes the relations between any two itemsets with an incidence matrix and uses that matrix to restrict the generation of candidate frequent itemsets, which reduces their number. It generates frequent itemsets from the incidence matrix with a depth-first strategy, so producing a frequent k-itemset requires only one intersection operation. It stores the support sets of frequent itemsets as bitmaps, which lowers memory consumption. The algorithm does not need to scan the data many times, avoids generating and validating large numbers of candidate itemsets, and is easy to operate.
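The ideas in this description (vertical bitmaps, a pairwise incidence structure used for pruning, and depth-first extension with one intersection per new itemset) can be illustrated with a minimal Python sketch. This is not the ADFAR algorithm itself, whose details the abstract does not give. The function names, the use of Python integers as bitmaps, and the reading of the incidence matrix as "which pairs of frequent items form a frequent 2-itemset" are all assumptions made for illustration.

```python
from itertools import combinations


def build_vertical_bitmaps(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap):
    """Number of set bits, i.e. number of supporting transactions."""
    return bin(bitmap).count("1")


def mine_frequent_itemsets(transactions, min_support):
    """Depth-first mining over vertical bitmaps (hypothetical helper, not ADFAR)."""
    bitmaps = build_vertical_bitmaps(transactions)
    items = sorted(i for i, b in bitmaps.items() if support(b) >= min_support)

    # Assumed form of the incidence matrix: the set of ordered item pairs
    # (a, b), a < b, whose 2-itemset {a, b} is frequent.
    linked = {
        (a, b)
        for a, b in combinations(items, 2)
        if support(bitmaps[a] & bitmaps[b]) >= min_support
    }

    result = {}

    def extend(prefix, prefix_bitmap, candidates):
        for idx, item in enumerate(candidates):
            # Prune without touching any bitmap: every item already in the
            # prefix must form a frequent pair with the new item.
            if any((p, item) not in linked for p in prefix):
                continue
            # One intersection yields the support set of the extended itemset.
            new_bitmap = prefix_bitmap & bitmaps[item]
            count = support(new_bitmap)
            if count >= min_support:
                itemset = prefix + (item,)
                result[itemset] = count
                extend(itemset, new_bitmap, candidates[idx + 1:])

    all_transactions = (1 << len(transactions)) - 1
    extend((), all_transactions, items)
    return result
```

In this sketch the pairwise check rejects a candidate before any bitmap intersection is computed, which is one plausible way an incidence matrix could reduce the number of candidate itemsets.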
Experimental results indicate that the proposed algorithm overcomes the drawbacks of Apriori and related algorithms, which generate large numbers of candidate itemsets and require many scans of the database, and that its mining efficiency is high.

The depth-first, incidence-matrix-based algorithm for vertical data stores the support sets of frequent itemsets as bitmaps. Bitmap storage already reduces the memory occupied by the data, but it remains the algorithm's main space cost and a key factor limiting its scalability. This paper therefore presents an improved algorithm that uses compressed bitmaps for vertical association rule mining: the support sets are compressed before being placed in memory in order to save space. Bitmap compression and the intersection operation on compressed bitmaps are described in detail. Experimental results indicate that the bitmap compression algorithm for vertical association rule mining achieves a compression rate of 70% and reduces memory usage at run time.
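The abstract does not say which compression scheme is used. As a rough illustration of intersecting support sets while they stay compressed, the following Python sketch uses simple run-length encoding, which is an assumption and not necessarily the thesis's method. The names `rle_compress`, `rle_intersect`, and `rle_support` are hypothetical.

```python
def rle_compress(bits):
    """Encode a sequence of 0/1 values as a list of (bit, run_length) pairs."""
    runs = []
    for bit in bits:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return [(b, n) for b, n in runs]


def rle_intersect(runs_a, runs_b):
    """AND two run-length-encoded bitmaps without decompressing them."""
    result = []
    i = j = 0
    rem_a = runs_a[0][1] if runs_a else 0
    rem_b = runs_b[0][1] if runs_b else 0
    while i < len(runs_a) and j < len(runs_b):
        # Consume the overlap of the two current runs in one step.
        step = min(rem_a, rem_b)
        bit = runs_a[i][0] & runs_b[j][0]
        if result and result[-1][0] == bit:
            result[-1] = (bit, result[-1][1] + step)  # merge equal neighbours
        else:
            result.append((bit, step))
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            rem_a = runs_a[i][1] if i < len(runs_a) else 0
        if rem_b == 0:
            j += 1
            rem_b = runs_b[j][1] if j < len(runs_b) else 0
    return result


def rle_support(runs):
    """Count the set bits of a run-length-encoded bitmap."""
    return sum(n for b, n in runs if b == 1)
```

The intersection walks both run lists once, so its cost depends on the number of runs and not on the number of transactions. How much space such a scheme saves depends on how the set bits are distributed in the data.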
Keywords/Search Tags: association rules mining, vertical data, depth first, incidence matrix, bitmap compression