Research On Classifier Performance Evaluation

Posted on: 2011-02-08  Degree: Master  Type: Thesis
Country: China  Candidate: T T Wu  Full Text: PDF
GTID: 2178360305459956  Subject: Computer Science and Technology
Abstract/Summary:
Data mining is the theory and practice of extracting knowledge from very large databases by nontrivial methods. Classification, an important topic in data mining, has long been studied in statistics, machine learning, neural networks, expert systems, and related fields. As an important part of the classification process, classifier performance evaluation plays a key role in guiding the selection of an appropriate classifier.

In this thesis, we first introduce the concepts and basic techniques of data mining and classification. We then summarize in detail the common criteria and methods for evaluating classifiers, and analyze how classifier performance evaluation is implemented on the WEKA platform, together with the mathematical meaning of its primary evaluation measures. Finally, we propose an error decomposition method based on restricted Bayesian classifiers. The method builds on the bias-variance decomposition for the 0-1 loss function and introduces the class probabilities predicted by the restricted Bayesian classifier into the decomposition. In this approach, the classification error is decomposed into two parts, bias and variance. The bias reflects the deviation between the average prediction of the learning algorithm and the true values, while the variance reflects how much the algorithm's predictions fluctuate across different training data sets.

To illustrate the method, an experiment was conducted with three restricted Bayesian classification algorithms on nine UCI data sets. The experimental results indicate that the TAN classifier is the best of the three. The error decomposition also makes the error composition of the three restricted Bayesian classifiers clearly visible and shows why the TAN classifier performs best.
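The bias and variance terms described above can be illustrated with a small sketch. It assumes a Kohavi-Wolpert style decomposition for 0-1 loss with noise-free labels, and it estimates the prediction distribution from hard predictions collected over several training runs. The abstract does not specify the thesis's exact formulation (which uses the classifiers' predicted probabilities), so the function name `zero_one_decomposition` and the input layout are hypothetical and for illustration only.

```python
from collections import Counter


def zero_one_decomposition(predictions, true_labels):
    """Estimate squared bias, variance, and 0-1 error per test point, averaged.

    predictions: list of runs; each run is a list of predicted labels,
                 one per test point (each run = a model trained on a
                 different training sample). Hypothetical layout.
    true_labels: list of true labels, assumed noise-free (an assumption).
    """
    n_runs = len(predictions)
    n_points = len(true_labels)
    bias2 = variance = error = 0.0

    for i, truth in enumerate(true_labels):
        # Empirical distribution of the predicted label at test point i.
        counts = Counter(run[i] for run in predictions)
        labels = set(counts) | {truth}
        p_hat = {y: counts.get(y, 0) / n_runs for y in labels}

        # Squared bias: distance between the true (one-hot) label
        # distribution and the average prediction distribution.
        bias2 += 0.5 * sum(
            ((1.0 if y == truth else 0.0) - p_hat[y]) ** 2 for y in labels
        )
        # Variance: spread of the predictions across training runs.
        variance += 0.5 * (1.0 - sum(p ** 2 for p in p_hat.values()))
        # Expected 0-1 error: probability of predicting a wrong label.
        error += 1.0 - p_hat[truth]

    return bias2 / n_points, variance / n_points, error / n_points


if __name__ == "__main__":
    # Three training runs, three test points (toy data, not from the thesis).
    runs = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
    truth = [0, 1, 0]
    b2, var, err = zero_one_decomposition(runs, truth)
    print(b2, var, err)
```

Under the noise-free assumption, the squared bias and variance returned by this sketch sum to the average 0-1 error, which is what allows the error of different classifiers to be compared term by term.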
Keywords/Search Tags: classify, classifier, performance evaluation, restricted Bayesian classifier, bias, variance