Design And Implementation Of A Super Multi-class Classification

Posted on:2014-01-03

Degree:Master

Type:Thesis

Country:China

Candidate:Y X Zhang

Full Text:PDF

GTID:2248330395496760

Subject:Computer software and theory

Abstract/Summary:

With the development of technology and the popularity of the Internet, people have access to large amounts of data, most of it in text form. The Internet has become an integral part of daily life. People get news, video, and other information from web pages, which programs categorize to help them. Because the data is complex, classifying web page data is a serious problem.

We have developed a super multi-class classification system to solve this problem. The application is written in C++ on the VS2010 development platform. We study the concepts and principles of statistical learning theory and review popular existing algorithms, including Boosting, Naive Bayes, k-nearest neighbor, and neural networks, to understand the basics of classification. Based on this overall understanding of classification algorithms, we propose a super multi-class classifier that improves the decomposition method.

First, we present the overall process and its principles in detail, using the architecture diagram and block diagram of the project. Section 2 of Chapter 3 describes the design process for the entire system, with a detailed explanation divided into training and web page classification. Section 3 introduces the principle and implementation of our proposed multi-class classification method. The method arranges the categories in a binary tree, splitting them at each branch, which reduces the amount of computation and increases classification accuracy.

We implement each module in accordance with the technical proposal. The first module gets information from the web page files. It extracts the title and keywords of the web page. We extract information including the web content text, URL, notes, program code, labels, layout code, and access information, and then store the data in an XML file. Based on the information in the XML file, the system preprocesses the page source and extracts words as classification features using the χ² (chi-square) statistic.
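The abstract does not give the feature-scoring formula, so the following is only a minimal sketch, assuming the textbook χ² statistic computed from a 2×2 term/category contingency table. The names `TermCategoryCounts` and `chi_square` are hypothetical and are not taken from the thesis.

```cpp
#include <cassert>

// Document counts for one (term, category) pair, as a 2x2 contingency table.
// All names here are illustrative assumptions, not the thesis's own code.
struct TermCategoryCounts {
    double in_with;      // A: documents in the category that contain the term
    double out_with;     // B: documents outside the category that contain the term
    double in_without;   // C: documents in the category that lack the term
    double out_without;  // D: documents outside the category that lack the term
};

// Chi-square score: N * (A*D - C*B)^2 / ((A+C) * (B+D) * (A+B) * (C+D)),
// where N = A + B + C + D. Returns 0 when a marginal total is zero, so the
// score stays defined for terms or categories with no documents.
double chi_square(const TermCategoryCounts& k) {
    assert(k.in_with >= 0 && k.out_with >= 0 &&
           k.in_without >= 0 && k.out_without >= 0);
    const double a = k.in_with, b = k.out_with;
    const double c = k.in_without, d = k.out_without;
    const double n = a + b + c + d;
    const double denom = (a + c) * (b + d) * (a + b) * (c + d);
    if (denom == 0.0) return 0.0;
    const double diff = a * d - c * b;
    return n * diff * diff / denom;
}
```

Under this formulation, a term that is spread evenly across categories scores zero, while a term concentrated in one category scores high. Keeping the top-scoring terms per category is one common way to reduce the feature set.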
The process handles feature-weighted probability estimation and feature evaluation. The classifier model for training is created with parameter settings chosen according to the characteristics of the sample files. The binary-tree-based multi-class support vector machine first divides all categories into two subclasses, then divides each subclass into two further subclasses, and so on, until a binary tree structure for multi-class classification is formed. In other words, we decompose the multi-class problem into a number of binary classification problems and train one support vector machine at each node that separates two groups of classes. The algorithm solves the problem effectively.

The final test of the super multi-class classifier uses 400 samples in eight categories, with 50 samples per category. The test results are measured against the accuracy of an existing classifier. We also introduce a linear method for dimensionality reduction. A comparison of the improved method with the traditional method shows that our method greatly improves performance and recognition rate.
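The binary-tree decomposition described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the class list is simply split in half at each node (the abstract does not say how the thesis orders or groups categories), and a midpoint hyperplane between the two group centroids stands in for a trained support vector machine. All type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Vec = std::vector<double>;

struct Sample {
    Vec x;      // feature vector
    int label;  // class id
};

struct Node {
    int label = -1;                     // valid only for leaves
    Vec w;                              // weights of the node's linear decision function
    double b = 0.0;                     // bias of the decision function
    std::unique_ptr<Node> left, right;  // both null for leaves
};

// Centroid of all samples whose label is in `classes` (at least one must match).
Vec centroid(const std::vector<Sample>& data, const std::vector<int>& classes) {
    Vec sum(data.front().x.size(), 0.0);
    std::size_t count = 0;
    for (const Sample& s : data) {
        if (std::find(classes.begin(), classes.end(), s.label) == classes.end()) continue;
        for (std::size_t i = 0; i < sum.size(); ++i) sum[i] += s.x[i];
        ++count;
    }
    assert(count > 0);
    for (double& v : sum) v /= static_cast<double>(count);
    return sum;
}

// Recursively split the class list in half and fit one binary separator per split.
std::unique_ptr<Node> build_tree(const std::vector<Sample>& data,
                                 const std::vector<int>& classes) {
    assert(!data.empty() && !classes.empty());
    std::unique_ptr<Node> node(new Node());
    if (classes.size() == 1) {  // a single class left: this node is a leaf
        node->label = classes.front();
        return node;
    }
    const auto mid = classes.begin() + classes.size() / 2;
    const std::vector<int> lhs(classes.begin(), mid), rhs(mid, classes.end());
    const Vec cl = centroid(data, lhs), cr = centroid(data, rhs);
    // Assumed stand-in for SVM training: hyperplane halfway between the centroids.
    node->w.resize(cl.size());
    for (std::size_t i = 0; i < cl.size(); ++i) {
        node->w[i] = cr[i] - cl[i];
        node->b -= node->w[i] * (cl[i] + cr[i]) / 2.0;
    }
    node->left = build_tree(data, lhs);
    node->right = build_tree(data, rhs);
    return node;
}

// Walk from the root to a leaf, evaluating one decision function per level.
int classify(const Node& root, const Vec& x) {
    const Node* n = &root;
    while (n->left) {
        double score = n->b;
        for (std::size_t i = 0; i < x.size(); ++i) score += n->w[i] * x[i];
        n = (score < 0.0) ? n->left.get() : n->right.get();
    }
    return n->label;
}
```

With a balanced split like this one, eight categories give seven internal nodes, and classifying a page evaluates only three decision functions on the path from root to leaf.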

Keywords/Search Tags:

SVM, classifier, training, feature extraction

Related items

1	Research On Feature Extraction Of Vehicle And Classifier Design
2	Design And Implementation Of A Super Multi-class Classification
3	Feature Extraction And Classification Algorithm And Their Application In Face Recognition
4	Research On Feature Extraction Of Tangut Script Recognition And Classifier Design
5	Study On Simulation-before-test Diagnostic Approach Based On Classifier
6	Research On The Technology Of Feature Extraction And Recognition Of Communication Signals
7	Research Of 3D ROI Segmentation, Feature Extraction And Classification Methods For Pulmonary CAD
8	Research On Discriminant Feature Extraction Of Human Face And Classifier Design
9	Study On Feature Extraction And Classifier Design Of Airplane Targets Based On Narrowband Radar
10	Sentiment Classification Of Micro-blogs Corpus Based On Automatic Annotation Training Set