Research On Improving Apriori Algorithm For Mining Association Rules

Posted on:2004-09-14

Degree:Master

Type:Thesis

Country:China

Candidate:S Wang

Full Text:PDF

GTID:2168360125463290

Subject:Computer applications

Abstract/Summary:

This thesis begins with the practical significance of association rules (AR). We discuss why research on AR is necessary and the influence AR has had on society and commerce. In the roughly ten years since it was put forward by Rakesh Agrawal and Ramakrishnan Srikant, AR has become one of the important branches of data mining.

To place the work in context, we discuss in depth the relationship among KDD (Knowledge Discovery in Databases), data mining, and association rules. These form the basis for the rest of the work.

The core of the thesis is an improvement of the classic frequent-set algorithm. After presenting the classic frequent-set algorithm (the Apriori algorithm) in detail, we focus on two improvement strategies and implement them with object-oriented programming in Java. First, we prove theoretically that reducing the candidate set (Ck) can make the algorithm highly efficient. Second, we use a hash tree to store the frequent items so that they can be counted quickly. We first prove theoretically how the hash tree can be applied to this problem, and then turn the abstract theory into a concrete object-oriented implementation, covering the structure of the hash tree, the insertion of leaves, and the traversal of the tree.

To test the improved algorithm, we use two databases as test beds. One is a database we built ourselves. The other is the anonymous web data from www.microsoft.com, which serves as real-world test data. After suitable preprocessing (for instance, deleting redundant data and adapting the interface between the test database and the algorithm program), the anonymous web data fully meets our needs.

On these test beds, we run many different cases against the improved algorithm. Besides the association rules themselves, we obtain a large amount of test data. For example, with the confidence fixed and the support increasing, we obtain a series of different frequent item sets, association rules, and run times. From these results we conclude that the new algorithm is stable and convergent. On the basis of this conclusion, we also compare the original algorithm with the new one and find that the new algorithm has more advantages.
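
As a rough illustration of the classic frequent-set approach the abstract refers to, the following Java sketch generates candidate sets level by level, prunes any candidate that has an infrequent subset (one simple way of reducing Ck), and computes the confidence of a rule. It is not the thesis's implementation: the class name AprioriSketch, its methods, and the sample transactions are hypothetical, and support is counted with a plain scan over the transactions rather than with the hash tree the thesis describes.

```java
import java.util.*;

// Illustrative sketch only; all names here are hypothetical, not taken from the thesis.
class AprioriSketch {

    // Returns every itemset whose support count is at least minSupport,
    // mapped to that count.
    static Map<Set<String>, Integer> frequentItemsets(List<Set<String>> transactions,
                                                      int minSupport) {
        Map<Set<String>, Integer> result = new HashMap<>();

        // Level 1: every distinct item is a candidate 1-itemset.
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> t : transactions) {
            for (String item : t) {
                candidates.add(Set.of(item));
            }
        }

        while (!candidates.isEmpty()) {
            // Count each candidate by scanning all transactions
            // (a plain scan stands in for the hash tree here).
            Map<Set<String>, Integer> level = new HashMap<>();
            for (Set<String> c : candidates) {
                int count = 0;
                for (Set<String> t : transactions) {
                    if (t.containsAll(c)) count++;
                }
                if (count >= minSupport) level.put(c, count);
            }
            result.putAll(level);
            candidates = nextCandidates(level.keySet());
        }
        return result;
    }

    // Joins frequent k-itemsets into (k+1)-candidates and keeps only those
    // whose k-subsets are all frequent.
    static Set<Set<String>> nextCandidates(Set<Set<String>> frequent) {
        Set<Set<String>> next = new HashSet<>();
        for (Set<String> a : frequent) {
            for (Set<String> b : frequent) {
                Set<String> union = new HashSet<>(a);
                union.addAll(b);
                if (union.size() == a.size() + 1 && allSubsetsFrequent(union, frequent)) {
                    next.add(union);
                }
            }
        }
        return next;
    }

    // Pruning step: a candidate with any infrequent subset cannot be frequent.
    static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> frequent) {
        for (String item : candidate) {
            Set<String> subset = new HashSet<>(candidate);
            subset.remove(item);
            if (!frequent.contains(subset)) return false;
        }
        return true;
    }

    // Confidence of the rule lhs => rhs: support(lhs and rhs together) / support(lhs).
    // Assumes both itemsets are present in the support map.
    static double confidence(Map<Set<String>, Integer> support,
                             Set<String> lhs, Set<String> rhs) {
        Set<String> both = new HashSet<>(lhs);
        both.addAll(rhs);
        return (double) support.get(both) / support.get(lhs);
    }

    public static void main(String[] args) {
        // Made-up transactions, for illustration only.
        List<Set<String>> db = List.of(
                Set.of("bread", "milk"),
                Set.of("bread", "butter"),
                Set.of("bread", "milk", "butter"),
                Set.of("milk"));
        Map<Set<String>, Integer> freq = frequentItemsets(db, 2);
        System.out.println("frequent itemsets: " + freq.size());
        System.out.println("confidence(butter => bread): "
                + confidence(freq, Set.of("butter"), Set.of("bread")));
    }
}
```

In this toy data the pair {milk, butter} appears in only one transaction, so the three-item candidate {bread, milk, butter} is pruned before any counting is done for it. Raising minSupport shrinks the set of frequent item sets, which matches the kind of experiment the abstract describes (fixed confidence, increasing support).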

Keywords/Search Tags:

Association Rules, Apriori Algorithm, Frequent Set, Candidate Set

Related items

1	Research On Data Mining Algorithms Based On Association Rules
2	Examination And Optimization On The Algorithms Of Mining Association Rules
3	Research And Improvement Based On Apriori Algorithm And Its Application In Wisdom Endowment
4	Research On Association Rule Algorithm In Data Mining
5	Association Rules Candidates To Support The Study Of The Frequency
6	Data Mining Technique Application Study On Logistics System
7	Research On Correlative Algorithms Of Association Rule Mining
8	Study On Association Rules Algorithm And Application For Data Mining
9	The Algorithm Research Of Association Rules Mining
10	Research And Implementation Of Web Log Mining Based On Association Rules Apriori Algorithm