
Research On Application Of SVM And AdaBoost

Posted on: 2012-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: J Song
Full Text: PDF
GTID: 2178330335959424
Subject: Operational Research and Cybernetics
Abstract/Summary:
Support Vector Machines (SVM), proposed by Vapnik in 1995, are mainly used to solve pattern recognition problems characterized by small samples, nonlinearity and high dimensionality. They can also be applied to function fitting and other machine learning problems. SVM is developed from the theory of VC dimension and the principle of structural risk minimization in statistical learning theory. With limited training samples, it aims for better generalization ability by seeking the best compromise between model complexity and learning ability.
AdaBoost is one of the best-known boosting algorithms and has been used in various fields of machine learning. Because it is so widely applied, many researchers have worked on improving the algorithm in different ways. The embedded multi-view Adaboost (EMV-Adaboost) algorithm blends multi-view learning thoroughly into AdaBoost and outputs the final hypothesis in a new form, as a combination of multiple learners.
In this thesis, the theory related to SVM and AdaBoost is studied carefully, and the task of Chinese chunk recognition is then implemented with SVM and with EMV-Adaboost. Firstly, the theory of SVM is introduced, including the optimal hyperplane, kernel functions, multi-class classification problems and the solution of SVM. The classification principle of SVM is also studied. Secondly, the theory of AdaBoost is introduced, proceeding from the Boosting algorithm and its analysis, through the AdaBoost algorithm, to the EMV-Adaboost algorithm. The steps and analysis of the EMV-Adaboost algorithm are studied in detail. Thirdly, the task of Chinese chunk recognition is introduced, including the definition of chunks and the method of marking them. By representing each Chinese character with a numerical vector, Chinese chunk recognition can be transformed into a classification problem.
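The abstract does not say how each character is turned into a numerical vector. As a minimal sketch of one common choice, the Python below one-hot encodes the characters in a small window around a position; the window width, the reserved index 0 for padding or unknown characters, and the names `build_vocab` and `window_features` are all assumptions made for illustration, not the thesis's method.

```python
from typing import Dict, List


def build_vocab(sentences: List[str]) -> Dict[str, int]:
    """Map each distinct character to an index; 0 is reserved for padding/unknown."""
    vocab: Dict[str, int] = {}
    for sentence in sentences:
        for ch in sentence:
            vocab.setdefault(ch, len(vocab) + 1)
    return vocab


def window_features(sentence: str, pos: int, vocab: Dict[str, int],
                    width: int = 1) -> List[int]:
    """Concatenate one-hot vectors for the characters at pos-width .. pos+width."""
    size = len(vocab) + 1  # +1 for the padding/unknown slot
    features: List[int] = []
    for offset in range(-width, width + 1):
        j = pos + offset
        # Positions outside the sentence, and unseen characters, map to index 0.
        idx = vocab.get(sentence[j], 0) if 0 <= j < len(sentence) else 0
        one_hot = [0] * size
        one_hot[idx] = 1
        features.extend(one_hot)
    return features


vocab = build_vocab(["我爱北京"])
vector = window_features("我爱北京", 0, vocab)
```

Each position then yields one fixed-length vector, which can be paired with that character's chunk label and passed to any classifier.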
Finally, Chinese chunk recognition is implemented separately with SVM and with EMV-Adaboost, and the PKU corpus is used to verify the two methods. The results show that the F-value of Chinese chunking with SVM is 72.87%, and that the F-value of Chinese chunking with EMV-Adaboost reaches 72.87% and 84.06%. The results of this thesis can be applied to other natural language processing fields, such as translation systems, text classification and information retrieval.
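The abstract does not define the F-value or the tagging scheme behind it. Assuming it is the usual balanced F-measure (the harmonic mean of chunk-level precision and recall) and that chunks are marked with BIO-style tags, it could be computed roughly as in the Python sketch below; the tag names and the helpers `extract_chunks` and `f_value` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Chunk = Tuple[int, int, str]  # (start index, end index exclusive, chunk type)


def extract_chunks(tags: List[str]) -> Set[Chunk]:
    """Collect (start, end, type) spans from a BIO tag sequence."""
    chunks: Set[Chunk] = set()
    start: Optional[int] = None
    kind: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing chunk
        continues = tag.startswith("I-") and kind == tag[2:]
        if not continues:
            if kind is not None:
                chunks.add((start, i, kind))
                start, kind = None, None
            # A "B-" tag, or an "I-" tag with nothing to continue, opens a chunk.
            if tag.startswith("B-") or tag.startswith("I-"):
                start, kind = i, tag[2:]
    return chunks


def f_value(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return chunk-level (precision, recall, F) with F = 2PR / (P + R)."""
    gold_chunks, pred_chunks = extract_chunks(gold), extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)  # exact span and type match
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


gold_tags = ["B-NP", "I-NP", "O", "B-VP", "B-NP"]
pred_tags = ["B-NP", "O", "O", "B-VP", "B-NP"]
precision, recall, f = f_value(gold_tags, pred_tags)
```

Here a predicted chunk counts as correct only when both its boundaries and its type match the gold annotation, so a chunk with one wrong boundary lowers precision and recall together.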
Keywords/Search Tags: Support Vector Machines, AdaBoost Algorithm, Chunking