
The Research On Rule-based Classification Approach

Posted on: 2014-10-12 | Degree: Master | Type: Thesis
Country: China | Candidate: X J Wang | Full Text: PDF
GTID: 2268330425984257 | Subject: Computer application technology
Abstract/Summary:
Classification is one of the most important analysis methods in data mining, and rule-based classification is an important family of classification approaches. A rule-based classification method extracts classification rules according to given threshold values. Typical rule-based classification methods include the FOIL algorithm, associative classification, and so on. Traditional rule-based classification approaches can achieve high efficiency. However, the accuracy of some of them, such as the FOIL algorithm and decision trees, may not be high on some data sets. One of the reasons is that they usually generate a small set of classification rules, especially when the training data set is small. As a result, they may miss some high-quality rules. In order to extract more high-quality rules and improve classification accuracy, this paper studies how to improve the FOIL algorithm and how to combine the FOIL algorithm with associative classification. The research work in this paper is as follows.

First, this paper proposes a classification approach based on multiple excellent rules. This approach constructs a candidate set and a seed set, both of which consist of important literals. It connects the seed set with the candidate set to produce several high-quality rules at a time. Moreover, it combines several measures to update the seed set.

Second, this paper proposes a classification approach that integrates associative classification and the FOIL algorithm. This approach generates length-1 and length-2 classification rules. It uses the Apriori algorithm to mine frequent pairs of items, and it adopts the FOIL algorithm to generate classification rules based on those frequent pairs.

Finally, this paper proposes a new classification approach based on improved multiple excellent rules. This approach combines the classification approach based on multiple excellent rules with the FOIL algorithm. It uses support and confidence to generate length-1 rules, and it uses the candidate set and the seed set to generate length-2 rules. It combines several measures to update the new seed set, and it adopts the FOIL algorithm to generate classification rules based on the new seed set.
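
The abstract names two building blocks without showing them: the gain measure that FOIL uses to pick literals, and Apriori-style mining of frequent item pairs. The following is a minimal Python sketch of the textbook versions of both, not the thesis's own implementation. The function names (`foil_gain`, `frequent_pairs`), the transaction format, and the use of an absolute count for `min_support` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def foil_gain(p0, n0, p1, n1):
    """Standard FOIL gain for adding one literal to a rule.

    p0, n0: positive/negative examples covered before adding the literal.
    p1, n1: positive/negative examples covered after adding it.
    Assumes p0 > 0 (the rule covers at least one positive example).
    """
    if p1 == 0:
        return 0.0  # a literal that covers no positives gains nothing
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))


def frequent_pairs(transactions, min_support):
    """Apriori-style pass up to length 2.

    Step 1 keeps items that occur in at least min_support transactions.
    Step 2 counts only pairs built from those frequent items, since a
    pair cannot be frequent unless both of its items are.
    min_support is an absolute transaction count (an assumption here).
    """
    item_counts = Counter()
    for items in transactions:
        item_counts.update(set(items))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    pair_counts = Counter()
    for items in transactions:
        kept = sorted(set(items) & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {pair: c for pair, c in pair_counts.items() if c >= min_support}
```

In a combined scheme like the one described above, the pairs returned by `frequent_pairs` would act as starting points, and `foil_gain` would rank the literals considered for extending each rule. How the thesis actually wires these together, and which additional measures it uses to update the seed set, is not stated in the abstract.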
Keywords/Search Tags: Data Mining, Rule-based Classification, Associative Classification