
Research On The Technologies Of Association Rule Mining Algorithm

Posted on: 2012-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2218330338970830
Subject: Computer software theory
Abstract/Summary:
With the development of the information industry, the ways in which people obtain data and knowledge have become increasingly diverse. The volume of data keeps growing, and information of interest may be hidden within it. How to mine this useful information effectively has become an urgent problem, and the emergence of data mining technology has made it possible to address that problem. Data mining draws on a wide range of disciplines, including databases, mathematical statistics, machine learning, pattern recognition, and artificial intelligence. It can extract previously unknown information and knowledge from large, noisy, and irregular databases. It is widely used in many fields and has become one of the most active and extensive research subjects.

After R. Agrawal proposed the Apriori algorithm for association rule mining, many scholars proposed improved algorithms. This paper studies the association rule problem in detail, with particular attention to the FP-growth algorithm. FP-growth is among the most widely used of these algorithms; it compresses the transaction database into an FP-tree. Its biggest advantage over Apriori is that it does not need to generate candidate frequent itemsets and needs to scan the database only twice. However, some problems remain: it must generate a large number of FP-trees, and it cannot mine large databases efficiently. To address these shortcomings, the paper makes three improvements. First, it reduces the transaction database, which lowers the volume of data read in the second scan. Second, it adds an auxiliary table based on hash tables, which improves the storage structure of the header table and lowers the time complexity of header table lookups. Third, it creates an inverse FP-tree and updates its structure to save the space the FP-tree occupies. The paper then gives experimental results and a performance analysis to demonstrate the correctness and efficiency of the improved algorithm.

Finally, because the existing FP-growth algorithm cannot mine large databases effectively, the paper combines database compression techniques (sampling and partitioning) with the advantages of the improved algorithm to propose an improved fast mining model. The fast mining model can mine massive databases quickly while keeping the accuracy of the results as high as possible. This model is the central problem the paper investigates.
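
To make the two-scan construction described in the abstract concrete, the following is a minimal Python sketch of standard FP-tree building. It is not the thesis's improved algorithm: the names `FPNode`, `build_fp_tree`, and `item_support` are hypothetical, and a plain Python dict stands in for a hash-based header table. The database reduction step and the inverse FP-tree are not reproduced here, since the abstract does not give their details.

```python
from collections import Counter


class FPNode:
    """One FP-tree node: an item, its count, and parent/child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two passes over `transactions` (assumed in memory)."""
    # Scan 1: count the support of each item (once per transaction).
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Assumption: a dict plays the role of a hash-based header table,
    # mapping each frequent item to the tree nodes that carry it.
    header = {item: [] for item in frequent}

    # Scan 2: drop infrequent items, sort the rest by descending support
    # (ties broken by item name), and insert each transaction as a path.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def item_support(header, item):
    """Support of `item`, found through one hash lookup in the header table."""
    return sum(n.count for n in header.get(item, []))


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "d"], ["b", "c"]]
    root, header = build_fp_tree(db, min_support=2)
    for item in sorted(header):
        print(item, item_support(header, item))
```

In this sketch, transactions that share a prefix of frequent items share a path in the tree, which is where the compression comes from. The dict lookup in `item_support` takes expected constant time, which illustrates the kind of saving a hash-based header table can offer over a linear search.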
Keywords/Search Tags: Data Mining, Association rules, FP-tree, Improved algorithm based on FP-growth