
Research On Some Variants Of Support Vector Machine

Posted on: 2010-04-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Du
Full Text: PDF
GTID: 1118360302491055
Subject: Applied Mathematics

Abstract/Summary:
Statistical learning theory is a theoretical framework of machine learning for small samples. Over the past decades it has developed into a relatively comprehensive theoretical system. The support vector machine (SVM) is a machine learning algorithm built on this theory. Following the structural risk minimization (SRM) principle, it obtains the globally optimal linear decision function in a high-dimensional feature space via a kernel function; it thus avoids the curse of dimensionality and has good generalization ability. Owing to its good performance in pattern recognition, function approximation, and density estimation, SVM has attracted great attention from researchers, has developed rapidly in theory, computation, and applications, and has become a hot topic in machine learning.

To improve the training speed and/or generalization ability of the traditional SVM, this dissertation focuses on the theory and application of several SVM variants. Its contents are as follows.

1. The current status of SVM research is reviewed, followed by a brief introduction to the fundamentals of SVM.

2. Several SVM variants based on the kernel minimum square error (MSE) criterion are studied. First, a geometric description of the Least Squares SVM (LSSVM) classifier is given. Second, the Proximal SVM (PSVM) model for classification is extended to regression, yielding the Proximal Support Vector Regression Machine, together with a fast computing method based on Cholesky decomposition; the equivalence of the classification and regression models is also proved. Combining the advantages of LSSVM and PSVM, a new model called the Direct Support Vector Machine (DSVM) is proposed. The new model applies to both classification and regression problems, yet is much simpler, trains faster, and generalizes better.
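The least-squares machinery shared by LSSVM/PSVM-style models can be sketched as a regularized kernel system solved by Cholesky decomposition, as mentioned above. This is a minimal generic illustration, not the dissertation's exact DSVM formulation; the RBF kernel, the regularization parameter `lam`, and the toy data are assumptions of the sketch:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and B
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit(X, y, lam=1e-2, gamma=1.0):
    # Solve (K + lam*I) alpha = y; the ridge term makes the matrix symmetric
    # positive definite, so a Cholesky factorization applies and is cheaper
    # than a general-purpose solve
    K = rbf_kernel(X, X, gamma)
    c, low = cho_factor(K + lam * np.eye(len(X)))
    return cho_solve((c, low), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    # decision value f(x) = sum_i alpha_i k(x_i, x)
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# toy regression problem: recover y = sin(x) from 40 samples
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, (40, 1))
y = np.sin(X).ravel()
alpha = fit(X, y)
pred = predict(X, alpha, X)
```

Because the system matrix is already positive definite after the ridge term, the Cholesky route needs no pivoting, which is the source of the speedup the text alludes to.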
Compared with LSSVM, it strengthens the convexity of the problem and guarantees a global optimum; compared with PSVM, it removes the discrepancy between the linear and nonlinear cases and achieves higher testing speed. Comprehensive numerical experiments demonstrate the feasibility and effectiveness of these results.

3. Fuzzy SVM (FSVM) is studied. First, the equivalence of FSVM and SVM with multiple penalty factors is proved theoretically, which provides the foundation for treating the fuzzy memberships as model hyperparameters to be set adaptively. Second, for pre-setting the fuzzy memberships, two design methods are given, based respectively on the geometric distribution of the separating hyperplane and the data samples, and on the classification behavior of SVM. Numerical experiments show the effectiveness of the two methods and compare them with other available approaches.

4. The performance of fuzzy SVM with the L2 loss function (L2-FSVM) is evaluated. First, the L2-FSVM model is given and transformed equivalently into a hard-margin SVM with a new kernel function, in which the fuzzy memberships are restated as kernel parameters. Second, four methods for estimating the generalization error of the hard-margin SVM are extended to L2-FSVM. Finally, through comparative analysis and extensive numerical experiments, the best estimator of the generalization error of L2-FSVM is identified, which can serve as a criterion for model selection.

5. A model selection method for the dual-penalty-factor SVM with the L2 loss function, and its application, are studied. Because of the class imbalance in digital mammography data, an SVM with the L2 loss function and different penalty factors for the two classes is used. Building on the preceding results, a method is presented that determines these hyperparameters automatically by minimizing a generalization error bound.
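The "different penalty factors" idea for imbalanced data can be sketched with scikit-learn's `class_weight` mechanism, which rescales C per class. This is only a fixed-weight stand-in, not the dissertation's bound-minimization procedure; the toy data, the chosen weights, and the RBF settings are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC

# imbalanced two-class toy data standing in for lesion vs. background patches
rng = np.random.default_rng(2)
X_neg = rng.normal(0.0, 1.0, (200, 2))   # 200 majority (background) samples
X_pos = rng.normal(2.5, 1.0, (20, 2))    # 20 minority (lesion) samples
X = np.vstack([X_neg, X_pos])
y = np.array([0] * 200 + [1] * 20)

# two penalty factors: the minority class is penalized 10x harder, i.e.
# C- = 1.0 and C+ = 10.0 in the dual-penalty-factor formulation
clf = SVC(kernel="rbf", C=1.0, class_weight={0: 1.0, 1: 10.0})
clf.fit(X, y)
recall_pos = clf.score(X_pos, np.ones(20))  # recall on the minority class
```

In the dissertation the two factors are not fixed by hand but tuned automatically by minimizing a generalization error bound.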
Experiments on the detection of masses and microcalcifications in digital mammograms demonstrate the effectiveness of the proposed method; it outperforms other ways of setting the hyperparameters in terms of generalization ability.
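The FSVM/multiple-penalty-factor equivalence discussed above can be illustrated with scikit-learn's per-sample weighting, which rescales C sample by sample so that the effective penalty is C_i = s_i * C. The centroid-distance membership used here is one simple geometry-based heuristic and an assumption of this sketch, not the dissertation's exact design:

```python
import numpy as np
from sklearn.svm import SVC

# two-class toy data
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 1.0, (30, 2)), rng.normal(1.0, 1.0, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

# heuristic fuzzy membership: points far from their class centroid (likely
# outliers) receive smaller memberships, hence smaller effective penalties
centroids = np.array([X[y == k].mean(axis=0) for k in (0, 1)])
d = np.linalg.norm(X - centroids[y], axis=1)
s = 0.1 + 0.9 * (1.0 - d / d.max())   # memberships kept in (0.1, 1.0]

clf = SVC(kernel="rbf", C=10.0)
clf.fit(X, y, sample_weight=s)        # effective per-sample penalty C_i = s_i * C
```

Down-weighting suspected outliers this way is exactly the mechanism by which FSVM reduces their influence on the separating hyperplane.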
Keywords/Search Tags:Statistical learning theory, Support vector machine, Kernel function, Minimum square error, Cholesky decomposition, Fuzzy membership function, Model selection, BFGS quasi-Newton method, Mass detection, Microcalcifications detection