Research On Association Rule Algorithm For Massive Data Set

Posted on:2008-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:X X Liu

GTID:2178360215482639

Subject:Computer Science and Technology

Abstract/Summary:

Data mining is the discovery of interesting, non-trivial, implicit, previously unknown, and potentially useful information or patterns in large databases. Association rule mining, introduced by Agrawal to find relationships among commodities in transaction databases, is one of its most important methods. With the rapid development of Internet and database technology, however, the volume of data to be mined keeps growing. On massive data sets, classic association rule algorithms consume a great deal of time and space, and their results are unsatisfactory. Many improved strategies have therefore been proposed, such as data reduction, distributed parallel processing, batch processing, and incremental processing.

This thesis studies association rule mining algorithms in light of the characteristics of massive data sets. First, to address the skewed distribution of large data sets, it proposes a weighted association rule mining algorithm based on density-biased sampling. Compared with random sampling, density-biased sampling produces a more representative sample when the data set is skewed. Support is then counted using the weights derived from the local densities computed during sampling, so there is no need to lower min_support. Frequent itemsets are generated with the F_{k-1} × F_1 join and the Apriori property. The algorithm scans the data set only once, and experiments show that on massive, skewed data sets it is both efficient and more accurate, which makes it an effective algorithm for association rule mining on massive data sets. Finally, this algorithm is applied in the field of intrusion detection systems.

Second, based on the density characteristics of massive data sets, the thesis combines granular computing and rough set theory with association rule mining and presents a granular-computing-based association rule mining method. Using the properties of granules, the number of candidate itemsets is minimized, and frequent itemsets are mined with a depth-first search strategy. Finally, experiments demonstrate the effectiveness of the method.
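
The first method in the abstract (density-biased sampling, weighted support counting, and F_{k-1} × F_1 candidate generation) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the `group_key` parameter used to define density groups, the exponent `e`, and the sampling probability alpha / n^e (one common formulation of density-biased sampling) are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def density_biased_sample(transactions, group_key, sample_size, e=0.5, seed=0):
    """Return (transaction, weight) pairs.

    Assumption: a transaction in a group of n members is kept with
    probability min(1, alpha / n**e) and gets weight 1 / probability.
    e = 0 reduces to uniform sampling; larger e favours sparse groups.
    """
    rng = random.Random(seed)
    sizes = Counter(group_key(t) for t in transactions)
    # alpha is chosen so the expected sample size is about sample_size.
    alpha = sample_size / sum(n ** (1 - e) for n in sizes.values())
    sample = []
    for t in transactions:
        p = min(1.0, alpha / sizes[group_key(t)] ** e)
        if rng.random() < p:
            sample.append((frozenset(t), 1.0 / p))
    return sample


def weighted_support(itemset, sample, total_weight):
    """Weighted fraction of sampled transactions that contain itemset."""
    return sum(w for t, w in sample if itemset <= t) / total_weight


def frequent_itemsets(sample, min_support):
    """Level-wise mining with F_{k-1} x F_1 joins and Apriori pruning."""
    if not sample:
        return {}
    total = sum(w for _, w in sample)
    item_weight = defaultdict(float)
    for t, w in sample:
        for item in t:
            item_weight[item] += w
    f1 = sorted(i for i, w in item_weight.items() if w / total >= min_support)
    result = {frozenset([i]): item_weight[i] / total for i in f1}
    level = [frozenset([i]) for i in f1]
    while level:
        prev = set(level)
        candidates = set()
        for s in level:
            for i in f1:
                if i in s:
                    continue
                c = s | {i}  # extend a frequent (k-1)-itemset by one frequent item
                # Apriori pruning: every (k-1)-subset must itself be frequent.
                if all((c - {x}) in prev for x in c):
                    candidates.add(c)
        level = []
        for c in candidates:
            sup = weighted_support(c, sample, total)
            if sup >= min_support:
                result[c] = sup
                level.append(c)
    return result
```

Because each sampled transaction carries the inverse of its selection probability as a weight, the weighted support estimates the support in the full data set, which is why the threshold does not have to be lowered for the sample. For simplicity the sketch makes two passes over the data (one to size the groups, one to sample), whereas the abstract describes a single scan.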

Keywords/Search Tags:

massive dataset mining, association rule, granular computing, density biased sampling

Related items

1	Analysis On Sampling Complexity Of Association Rule Mining
2	An Algorithm Based On Density And Grid For Mining And Clustering Association Rules
3	Based Sampling Of Distributed Association Rule Mining Algorithm
4	Updating Method And Implement Association Rules Based On Probabilistic Graphical Models
5	Research On Mining Algorithm Of Association Rule And Its Application For Biological Data
6	Association Rule Mining Expansion Of Research In The Area Of disaggregated Data
7	Studies And Applications Of Association Rule Mining Methods In Data Mining
8	Association Rule Mining On Cloud Computing Platform
9	The Research And Implementation On Association Rule Mining Algorithm Based On Spark
10	Research And Application For Association Rules Mining Based On Distributed Computing