
DRAC: Directly Mining Non-redundant Rules For Associative Classification

Posted on: 2012-08-18  Degree: Master  Type: Thesis
Country: China  Candidate: J Z Song  Full Text: PDF
GTID: 2178330335470093  Subject: Computer software and theory
Abstract/Summary:
Classification and association rule mining are both active fields in data mining, with a wide range of real-world applications. The aim of classification is to build a classifier by analyzing a training dataset and then to use that classifier to predict the class of unlabeled objects. The main task of association rule mining is to discover interesting relationships among the objects in a database. If the consequent (right-hand side) of a rule is a class label, however, the rule can be used to build a classifier. Applying association rule mining to classification therefore opens up a new approach to classification. A large number of experiments show that classification based on association rules (associative classification) achieves higher accuracy and adapts better than traditional classification methods.

Typical associative classification has three steps: (1) mining class association rules (CARs); (2) pruning the rules and building the classifier; (3) predicting the class label of new cases. Unfortunately, the first step usually discovers a large number of rules, many of them redundant. Mining so many rules not only reduces efficiency but also makes pruning, storing and searching the rules very challenging. Worse still, when the database is dense or the minimum support threshold is small, the rule set can become too large for the computer to handle.

To solve this efficiency problem, this thesis proposes a new method, DRAC (Directly mining non-redundant Rules for Associative Classification). The method extends GrGrowth, an efficient algorithm for mining frequent free sets. During frequent free set mining, we add a confidence test and a redundancy check, so that the non-redundant rule set used to build the classifier is mined directly. In addition, we use multiple strong rules to classify each unlabeled object, which avoids the over-fitting caused by relying on a single rule. The experimental results show that our method is more efficient than the typical approach, CBA, without loss of accuracy.
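
The three steps described above can be illustrated with a minimal Python sketch. It is not the DRAC or GrGrowth algorithm: it enumerates candidate antecedents by brute force instead of mining frequent free sets, and the function names (mine_cars, drop_redundant, predict), the redundancy criterion (a rule is dropped when a more general rule for the same class has at least the same confidence), the top_k voting scheme and the toy data are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_cars(rows, labels, min_sup, min_conf, max_len=3):
    """Brute-force class association rules: (antecedent, label, support, confidence)."""
    n = len(rows)
    items = sorted({item for row in rows for item in row})
    rules = []
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            antecedent = frozenset(combo)
            # Class labels of all training rows that contain the antecedent.
            covered = [lab for row, lab in zip(rows, labels) if antecedent <= row]
            for label, hits in Counter(covered).items():
                sup, conf = hits / n, hits / len(covered)
                if sup >= min_sup and conf >= min_conf:
                    rules.append((antecedent, label, sup, conf))
    return rules


def drop_redundant(rules):
    """Keep a rule only if no strictly more general rule for the same class
    has confidence at least as high (assumed redundancy criterion)."""
    kept = []
    for ante, label, sup, conf in rules:
        dominated = any(
            other < ante and other_label == label and other_conf >= conf
            for other, other_label, _, other_conf in rules
        )
        if not dominated:
            kept.append((ante, label, sup, conf))
    return kept


def predict(rules, row, default, top_k=3):
    """Vote with the top_k matching rules (by confidence, then support)
    rather than trusting a single best rule."""
    matching = sorted(
        (r for r in rules if r[0] <= row), key=lambda r: (-r[3], -r[2])
    )
    if not matching:
        return default
    score = defaultdict(float)
    for _, label, _, conf in matching[:top_k]:
        score[label] += conf
    return max(score, key=score.get)


# Toy training data (hypothetical): each row is a set of items plus a class label.
rows = [frozenset(r) for r in (["a", "b"], ["a", "c"], ["a", "b"],
                               ["b", "c"], ["c"], ["c", "d"])]
labels = ["x", "x", "x", "y", "y", "y"]

rules = drop_redundant(mine_cars(rows, labels, min_sup=0.3, min_conf=0.7))
for ante, label, sup, conf in rules:
    print(sorted(ante), "->", label, f"sup={sup:.2f}", f"conf={conf:.2f}")
print(predict(rules, frozenset({"a", "c"}), default="y"))
```

In this sketch redundancy is removed only after all rules have been generated. The abstract's point is that DRAC performs the confidence test and redundancy check during mining, so the full redundant rule set is never materialized.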
Keywords/Search Tags: classification, association rule mining, associative classification, free set, non-redundant rule set