
Studies On Algorithms Of Association Rule Mining In Data Mining

Posted on: 2012-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Hui
Full Text: PDF
GTID: 2178330332491517
Subject: Computer application technology
Abstract/Summary:
With the wide use of computers, scanners, and database technology, people have accumulated a great deal of historical data. These data look simple on the surface, but much valuable information lies behind them. The knowledge and rules hidden in these data are very useful for data prediction, business decisions, and resource management. With traditional statistical and analytical methods, however, this useful information either cannot be discovered or takes impractically long to find. Data mining was proposed in response.
As one of the main research topics in data mining, association rules are used to determine the relationships among a set of items and so uncover valuable information. Frequent itemset mining is the main task of association rule mining, and its efficiency is the central difficulty. This thesis introduces the relevant background on frequent itemset mining and analyzes several classic algorithms in detail. Because every frequent itemset is a subset of some maximum frequent itemset, the maximum frequent itemsets implicitly contain all frequent itemsets, so the thesis focuses on mining maximum frequent itemsets. FP-tree generation, pruning strategy, superset checking, search strategy, and dimension reduction are studied in depth.
The DFP-tree is an improvement on the traditional FP-tree. This thesis gives the definition and construction process of the DFP-tree and proposes DFP-Max, a maximum frequent itemset mining algorithm based on it. The algorithm uses prediction and pruning to reduce the number of conditional DFP-trees generated. By using digital set matching in place of itemset matching for testing, it both reduces the number of checks and avoids combining and joining intermediate results. Experiments show that DFP-Max is two to five times as efficient as comparable algorithms when the support threshold is relatively small.
The dimension of an itemset is the number of items it contains.
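The relationship between frequent itemsets and maximum frequent itemsets (often called maximal frequent itemsets elsewhere), together with the role of superset checking, can be illustrated with a small brute-force sketch. This is not DFP-Max or any algorithm from the thesis: the function names `frequent_itemsets` and `maximum_frequent_itemsets` and the sample transactions are hypothetical, and the enumeration is exponential, so it is only meant for tiny inputs.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count
    reaches min_support. Brute-force enumeration, for illustration only."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size means none of a larger size.
            break
    return frequent


def maximum_frequent_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    result = []
    # Visit larger itemsets first so every possible superset is already kept.
    for itemset in sorted(frequent, key=len, reverse=True):
        # Superset checking: skip itemsets covered by a kept itemset.
        if not any(itemset < kept for kept in result):
            result.append(itemset)
    return result


# Hypothetical sample data: five transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ad"),
    frozenset("ad"),
    frozenset("bd"),
]
frequent = frequent_itemsets(transactions, min_support=2)
maximum = maximum_frequent_itemsets(frequent)
```

With this sample data, nine itemsets are frequent, but two maximum frequent itemsets, {a, b, c} and {a, d}, are enough to recover all of them as subsets. That compression is why mining only the maximum frequent itemsets can be much cheaper than enumerating every frequent itemset.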
In the fifth part, guided by the idea of reducing itemset dimension and based on an analysis of the FPMax and DMFIA algorithms, the depth-first and breadth-first strategies are combined and the algorithm BDRFI (algorithm for mining maximum frequent itemsets Based on Dimensionality Reduction of Frequent Itemset) is proposed. To improve efficiency, the algorithm uses dimensionality reduction and the divide-and-conquer idea to address two problems: the excessive number of candidate maximum frequent itemsets generated by DMFIA, and the need of FPMax to mine frequent itemsets recursively. Experiments show that BDRFI is two to eight times as efficient as comparable algorithms.
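The idea of searching by decreasing itemset dimension can be sketched with a generic top-down, level-by-level search. This is not BDRFI, DMFIA, or FPMax, whose details are not given in the abstract. It is a minimal sketch of one way dimension reduction and superset checking can work together, and the function name `top_down_maximum_itemsets` and the sample data are hypothetical.

```python
from itertools import combinations


def top_down_maximum_itemsets(transactions, min_support):
    """Find maximum frequent itemsets by starting from the largest
    candidate and lowering the dimension one item at a time."""

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # First reduction: items that are infrequent on their own cannot
    # appear in any frequent itemset, so they are dropped up front.
    all_items = {item for t in transactions for item in t}
    frequent_items = frozenset(
        item for item in all_items if support(frozenset([item])) >= min_support
    )

    maximum = []
    level = {frequent_items} if frequent_items else set()
    while level:
        infrequent = []
        for candidate in level:
            if support(candidate) >= min_support:
                # Candidates covered by a larger result were pruned earlier,
                # so a frequent candidate here has no frequent superset.
                maximum.append(candidate)
            else:
                infrequent.append(candidate)
        # Lower the dimension: replace each infrequent candidate by its
        # subsets with one item removed.
        next_level = set()
        for candidate in infrequent:
            for subset in combinations(candidate, len(candidate) - 1):
                sub = frozenset(subset)
                # Superset checking: skip subsets of known results.
                if sub and not any(sub <= m for m in maximum):
                    next_level.add(sub)
        level = next_level
    return maximum


# Hypothetical sample data: five transactions over the items a, b, c, d.
sample = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ad"),
    frozenset("ad"),
    frozenset("bd"),
]
found = top_down_maximum_itemsets(sample, min_support=2)
```

On this sample the search tests {a, b, c, d} first, then its three-item subsets, and stops after the two-item level, because every remaining subset is already covered by {a, b, c} or {a, d}. The number of candidates in such a naive top-down search can still grow very quickly when the maximum frequent itemsets are short, which is the kind of candidate explosion the abstract attributes to DMFIA.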
Keywords/Search Tags:Data Mining, Association Rules, Maximum Frequent Itemsets, FP-tree (Frequent Pattern Tree), Superset Checking