The Research On Support Vector Machine Networks For Text Classification

Posted on: 2008-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: X G Meng
Full Text: PDF
GTID: 2178360212493742
Subject: Computer system architecture
Abstract/Summary:
Support vector machine (SVM), based on Statistical Learning Theory, is a new approach and research field in machine learning. Its advantages include a firm mathematical foundation, rigorous theoretical analysis, a complete theory, global optimization, and good adaptability and generalization. By applying Structural Risk Minimization and combining techniques from statistical learning, machine learning and neural networks, SVM improves generalization effectively while also minimizing the empirical risk. Compared with conventional machine learning methods, it also has good potential application value and development prospects.

In this thesis, several typical support vector machine algorithms are summarized and a new framework that adapts the SVM is presented. The performance and applications of the algorithms are studied in depth. The research is carried out in the following aspects:

1. We give a general introduction to the concepts, methods, categories and applications of document classification.
2. The solution methods of the support vector machine are studied systematically: the quadratic programming method, the chunking method, the decomposition method, the sequential minimal optimization method, the iterative Lagrangian support vector machine method based on the Lagrange function, and the Newton method based on the smoothing technique. These methods either solve the convex quadratic program directly, solve it after converting the large-scale problem into many sub-problems, or apply sophisticated optimization techniques after converting the constrained optimization problem into an unconstrained one. Analyzing these methods lays the theoretical foundation for presenting new support vector machine algorithms.
3. The support vector machine was originally designed for two-class classification, and many researchers have been working on extensions to multiclass classification.
This thesis covers several such methods, including "one-against-all", "one-against-one", DAGSVM, and a multi-class SVM classification method based on a binary tree. Their advantages, disadvantages and performance are compared, and a new framework that adapts the SVM is presented.
4. Finally, combining text classification with the support vector machine method, a text classification system is designed and implemented. We use common indicators such as precision, recall and F value to evaluate the results of the system. Experimental results show that the overall average of the system's indicators is high and that the system classifies well.
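As an illustration of the indicators named in item 4, the following minimal Python sketch computes per-class precision, recall and F value, together with their overall average. It is not taken from the thesis: the balanced F1 form of the F value, macro averaging across classes, the toy labels, and the function names per_class_scores and macro_average are all assumptions made for illustration.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but wrongly
            fn[truth] += 1  # missed a document of the true class

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # Assumption: "F value" means the balanced F1 score.
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical toy data: true and predicted categories of six documents.
y_true = ["sports", "sports", "finance", "finance", "politics", "politics"]
y_pred = ["sports", "finance", "finance", "finance", "politics", "sports"]

scores = per_class_scores(y_true, y_pred)
macro = macro_average(scores)

for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
print("macro average: precision=%.3f recall=%.3f F=%.3f" % macro)
```

With these toy labels, the finance class scores precision 2/3, recall 1 and F value 0.8. Macro averaging weights every category equally; a system evaluated on unbalanced categories might instead report a micro average, which pools the counts over all documents.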
Keywords/Search Tags:text categorization, support vector machine, multiclass