The Study Of Support Vector Machine Multiclass Classification

Posted on: 2013-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: R Ding
Full Text: PDF
GTID: 2248330395486890
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The support vector machine (SVM) is a new machine learning method proposed by Vapnik and others that is well suited to small-sample problems. It is built on statistical learning theory and inherits the structural risk minimization principle and VC-dimension theory, which lets it improve the generalization ability of the learned model. Because of its excellent learning performance, SVM has become a research focus in machine learning and will continue to drive important developments in the field. At present, SVM is applied successfully in data mining, pattern recognition, and related areas. SVM was originally proposed for two-class classification, but most practical classification tasks involve three or more classes, so how to extend SVM to multiclass problems is an active research topic.

This thesis first discusses statistical learning theory, the theoretical foundation of SVM, and on that basis presents the principle of SVM classification. It then summarizes the main methods for SVM multiclass classification, including the "one-against-one", "one-against-rest", binary tree, and decision directed acyclic graph methods, compares the advantages and disadvantages of each, and discusses their classification performance.

The analysis of these methods shows that the binary tree method has the best classification performance, especially for large-scale classification problems. The key issue of the binary tree method is the structure of the tree. To address it, the thesis proposes a vector projection method that measures the separability of the different classes and places the most easily separated classes at the upper nodes of the tree, which reduces the error accumulation that would otherwise lower classification accuracy.
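The multiclass schemes named in the abstract differ in how many binary SVMs they need. The sketch below tabulates the standard counts for k classes. It is an illustration only: the figures are well-known properties of these decomposition schemes rather than results taken from the thesis, and the function name `classifier_counts` is hypothetical.

```python
def classifier_counts(k: int) -> dict:
    """Return (trained, evaluated per prediction) binary classifiers
    for each multiclass decomposition scheme with k classes.

    For the binary tree, the second number is the worst case
    (a fully skewed tree); a balanced tree evaluates fewer nodes.
    """
    if k < 2:
        raise ValueError("need at least two classes")
    pairs = k * (k - 1) // 2  # one classifier per unordered pair of classes
    return {
        "one-against-one": (pairs, pairs),   # all pairwise classifiers vote
        "one-against-rest": (k, k),          # one classifier per class
        "decision DAG": (pairs, k - 1),      # same training, one path through the graph
        "binary tree": (k - 1, k - 1),       # one classifier per internal node
    }
```

For example, `classifier_counts(10)` shows that a binary tree over 10 classes trains 9 classifiers where one-against-one trains 45, which is one reason tree-based schemes are attractive when the number of classes is large.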
Another problem with the binary tree is that some nodes must classify imbalanced datasets. The thesis uses an improved SMOTE sampling method to address this; the method does not change the distribution characteristics of the samples and reduces sample aliasing. Finally, each algorithm is verified on datasets from the UCI repository. Compared with the earlier algorithm and the "one-against-rest" method, the proposed algorithm improves classification accuracy, especially on large-scale datasets.
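For orientation, the following is a minimal sketch of the basic SMOTE interpolation step: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. This is the plain technique, not the improved variant the thesis proposes, whose details are not given in the abstract. The function name `smote_sketch` and its parameters are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class
    points (tuples of floats) by basic SMOTE interpolation.

    Assumes at least two minority points. This is an illustrative
    sketch, not the improved method described in the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(xi + gap * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region already occupied by the minority class.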
Keywords/Search Tags: SVM, Imbalanced dataset, Multiclass classification, Binary tree