Classifier Container Based On Integration Evaluating Method

Posted on: 2006-09-10  Degree: Master  Type: Thesis
Country: China  Candidate: J F Zou  Full Text: PDF
GTID: 2168360152475692  Subject: Computer application technology
Abstract/Summary:
Classification is an important task in data mining, and at present it is used mostly in commerce. The aim of classification is to build a classification function or classification model (also called a classifier) that maps each record in a database to one of a set of classes. Many statistical and machine learning methods are used to classify texts automatically. Automatic text classification comprises three stages: first, processing the texts and converting them into numerical form; next, constructing and training the classifier; and finally, classifying new texts. Based on a survey of existing classification methods, and in particular on the observation that different classifiers discriminate different classes with different degrees of success, this thesis develops a decision-based integration evaluating method. The method exploits each classifier's discriminating power for every class, so that the classifiers draw on one another's strengths to offset their own weaknesses. Training follows the Hooke-Jeeves method, and the result is a classifier container. In pre-processing, the system first compares several feature (attribute) extraction methods and selects a suitable one, then selects a weight-calculation method that fits the system, and finally converts the texts into vectors. In training, four classifiers are first constructed and then tested and analyzed on the Fudan University corpus; a classifier container is then built according to integration evaluating theory. Training the container yields a weight matrix (called the power matrix in this thesis).
When the container is tested, every child classifier gives a result. The container computes a sum for each class from the power matrix and the given results, and selects the class with the maximum sum as the final class. The container is thus an optimized combination of different classifiers and gives a better result than the individual classifiers. The four classifiers used in this thesis are SVM text classification, multi-group discriminant text classification, Naive Bayes text classification, and simple vector distance text classification.
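The combination step described in the abstract can be illustrated with a minimal Python sketch. It assumes that each child classifier returns a single class index and that the power matrix holds one weight per (classifier, class) pair; the function name `combine`, the three-class setup, and the weight values are hypothetical and are not taken from the thesis. The Hooke-Jeeves training that would produce the weights is not shown.

```python
from typing import Sequence


def combine(votes: Sequence[int], weights: Sequence[Sequence[float]]) -> int:
    """Return the class with the largest weighted sum of votes.

    votes[k]      -- class index predicted by child classifier k
    weights[k][c] -- assumed weight of classifier k when it predicts class c
    """
    n_classes = len(weights[0])
    totals = [0.0] * n_classes
    for k, predicted in enumerate(votes):
        # Each classifier adds its class-specific weight to the class it chose.
        totals[predicted] += weights[k][predicted]
    # Ties are resolved in favour of the lowest class index.
    return max(range(n_classes), key=totals.__getitem__)


# Hypothetical weights for four child classifiers and three classes.
WEIGHTS = [
    [0.9, 0.4, 0.6],  # SVM
    [0.5, 0.7, 0.5],  # multi-group discriminant
    [0.6, 0.8, 0.3],  # Naive Bayes
    [0.3, 0.5, 0.7],  # simple vector distance
]

if __name__ == "__main__":
    # Two classifiers vote for class 2, but their weights for that class are
    # low, so the single high-weight vote for class 0 wins.
    print(combine([0, 2, 2, 1], WEIGHTS))
```

Under these assumed weights a plain majority vote would pick class 2, while the weighted sum picks class 0. That is the kind of per-class trust the power matrix is meant to capture.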
Keywords/Search Tags: Classification, Container, Integrated Evaluation