
Classification Algorithm Based on Positive Association Rules

Posted on: 2008-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: R N Li
Full Text: PDF
GTID: 2208360215960751
Subject: Computer software and theory
Abstract/Summary:
Classification is a basic task in data mining, and association rule mining is an important area of data mining research. Classification based on association rule mining opens a new path in data classification. Conventional associative classification algorithms usually mine the complete set of association rules from the training database, then classify or predict the test database with a high-performance classification rule set chosen from that complete set.

Previous studies report that classification based on association rules offers high classification accuracy and strong flexibility. However, the classifier contains a huge number of classification rules, most of which do not help classification. Classification also sometimes overfits, since it relies on a single high-confidence rule. The key to associative classification is therefore the construction of the classifier, i.e. the classification rule set, and different measures are needed to evaluate the performance of that rule set.

The main topics of this paper are how to obtain more meaningful association rules and how to classify test datasets with them. We propose a new associative classification algorithm, CPCAR (Classification based on Positively Correlated Association Rules). The algorithm improves on FP-Growth by testing frequent itemsets for positive correlation as they are generated, so the final frequent itemsets are all positively correlated. The original classification rule set is derived from these positively correlated frequent itemsets. To improve the accuracy and efficiency of classification, the algorithm then selects predictive rules from the original classification rule set by confidence, producing classifiers composed of positively correlated association rules. To classify a test transaction, the algorithm first selects, in each classifier, all rules that can classify the transaction.
It then computes the sum of the weighted χ² of the selected predictive rules in each classifier and compares the class label of the classifier with the largest sum against the label of the test transaction. If the two class labels are the same, the classification is correct. The accuracy of CPCAR on a transaction database is the number of correctly classified test transactions divided by the total number of test transactions. Although the algorithm discards a large number of association rules when generating the predictive rule sets, our experiments on 14 databases from the UCI machine learning repository show that the final accuracy of CPCAR is close to that of C4.5 and CMAR, while the running time of the algorithm is clearly reduced.
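
The classification procedure described above can be illustrated with a minimal Python sketch. It is not the thesis implementation, and several parts are assumptions: the FP-Growth mining step is omitted and candidate antecedents are supplied by hand; positive correlation is tested as P(A, c) > P(A)·P(c); and each rule's weight is the plain Pearson χ² of its 2x2 contingency table, since the abstract does not give the exact weighting formula. All function names (`chi_square`, `build_classifiers`, `classify`, `accuracy`) and the toy data are hypothetical.

```python
from collections import defaultdict


def chi_square(n, n_a, n_c, n_ac):
    """Pearson chi-square of the 2x2 table for antecedent A versus class c.

    n: all transactions, n_a: those containing A,
    n_c: those labelled c, n_ac: those with both.
    """
    observed = [n_ac, n_a - n_ac, n_c - n_ac, n - n_a - n_c + n_ac]
    expected = [
        n_a * n_c / n,
        n_a * (n - n_c) / n,
        (n - n_a) * n_c / n,
        (n - n_a) * (n - n_c) / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def build_classifiers(train, antecedents, min_conf):
    """Return {class label: [(antecedent, chi-square weight), ...]}."""
    n = len(train)
    class_counts = defaultdict(int)
    for _, label in train:
        class_counts[label] += 1

    classifiers = defaultdict(list)
    for a in antecedents:
        covered = [label for items, label in train if a <= items]
        n_a = len(covered)
        if n_a == 0:
            continue
        for label, n_c in class_counts.items():
            n_ac = covered.count(label)
            # Assumed positive-correlation test: A and c co-occur more often
            # than independence would predict, i.e. P(A, c) > P(A) * P(c).
            positively_correlated = n_ac * n > n_a * n_c
            if positively_correlated and n_ac / n_a >= min_conf:
                classifiers[label].append((a, chi_square(n, n_a, n_c, n_ac)))
    return dict(classifiers)


def classify(classifiers, items):
    """Pick the class whose matching rules have the largest chi-square sum."""
    scores = {}
    for label, rules in classifiers.items():
        matching = [weight for a, weight in rules if a <= items]
        if matching:
            scores[label] = sum(matching)
    # None means no rule covers the transaction.
    return max(scores, key=scores.get) if scores else None


def accuracy(classifiers, test):
    """Correctly classified test transactions divided by all test transactions."""
    correct = sum(1 for items, label in test if classify(classifiers, items) == label)
    return correct / len(test)


# Toy data (hypothetical): each transaction is (set of items, class label).
train = [
    ({"a", "b"}, "yes"), ({"a"}, "yes"), ({"a", "c"}, "yes"),
    ({"b", "c"}, "no"), ({"c"}, "no"), ({"b"}, "no"),
]
antecedents = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]
classifiers = build_classifiers(train, antecedents, min_conf=0.6)
```

With this toy data, `classify(classifiers, {"a", "b"})` returns `"yes"` because the single rule for `a` carries more χ² weight than the weaker rule for `b`. A transaction that no rule covers returns `None`; how CPCAR handles that case is not stated in the abstract.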
Keywords/Search Tags:classification, frequent itemsets, association rules, positively