
The Key Technologies For Practical Applications Of Support Vector Machines

Posted on: 2006-11-27    Degree: Doctor    Type: Dissertation
Country: China    Candidate: C H Zheng    Full Text: PDF
GTID: 1118360182460102    Subject: Circuits and Systems
Abstract/Summary:
Data-based machine learning has long been an important and extensively studied topic in intelligent systems. The problem is very general and covers central topics of artificial intelligence, in particular classification, regression analysis, and density estimation. The main task of machine learning is to discover rules from a given collection of data samples and to give good predictions on unknown data; its most important theoretical foundation is statistics. Classical statistics, however, was developed for large samples and relies on various kinds of a priori knowledge, and methods such as neural networks are constructed on those assumptions. In most practical problems the collected data samples are limited and prior knowledge is hard to obtain, so algorithms that are excellent in theory may fail to give satisfactory results in practice.

Statistical learning theory (SLT) studies machine learning from small data samples without relying on prior knowledge about the problem to be solved; it is built on the structural risk minimization (SRM) principle rather than on empirical risk minimization alone. The support vector machine (SVM) is a typical representative of SLT and is constructed on this principle. Recently proposed as an effective learning machine for small data samples, the SVM has achieved notable applications and is still being developed and enriched.

To facilitate the practical applications of SVMs, several key technologies are developed in detail in this dissertation. The work has four parts: the fuzzy logic technique is introduced into conventional SVMs to improve either the learning speed or the generalization performance; a genetic-algorithm-based automatic model selection method is proposed to address a central open question in SVM design; systematic and credible evaluations of several commonly used, easy-to-compute generalization performance estimates are carried out, using the developed exponential-coded genetic automatic model selection over a large range of model parameters; and a new, simple, and highly efficient generalization performance estimate is proposed. The detailed contributions are as follows.

(1) In most available SVMs, the sparse support vectors (SVs) that characterize the machine are obtained from the entire training set. The computation therefore involves not only the SVs but also the non-SVs, which adds unnecessary work and slows down training. To speed up training of the traditional SVM, a simple fuzzy support vector machine (FSVM) is proposed. The FSVM uses a proximal SVM (PSVM) together with the fuzzy logic technique to pre-extract the potential SVs from the original training set; the final SVs are then obtained by a traditional SVM trained on the new, smaller set composed of the pre-extracted candidates. The method improves the training speed significantly without degrading the generalization ability. Experiments on synthetic and benchmark data sets demonstrate the high performance of the proposed FSVM in comparison with the traditional standard SVM; a minimal sketch of this two-stage idea is given below.
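Because the abstract does not spell out the pre-extraction step, the following minimal sketch only illustrates the two-stage training idea under stated assumptions: scikit-learn is used, a fast linear SVM margin band stands in for the proximal SVM plus fuzzy membership pre-extraction, and the 30% retention threshold is a hypothetical choice rather than the dissertation's setting.

```python
# Two-stage training sketch: pre-extract likely support vectors with a cheap
# model, then train the kernel SVM only on that reduced candidate set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Stage 1: a fast linear model scores every sample; points close to its
# hyperplane (the most plausible support-vector candidates) are kept.
stage1 = LinearSVC(dual=False).fit(X, y)
margin = np.abs(stage1.decision_function(X))
keep = margin < np.quantile(margin, 0.3)   # hypothetical 30% retention band

# Stage 2: the standard kernel SVM is trained only on the candidate set.
final_svm = SVC(kernel="rbf", gamma="scale").fit(X[keep], y[keep])
print("training set reduced to", keep.sum(), "samples;",
      "support vectors:", final_svm.n_support_.sum())
```

The point of the design is that the expensive kernel SVM only ever sees the samples most likely to become support vectors, which is where the reported speed-up comes from.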
(2) Based on a detailed analysis of the refused and missed classifications in the available multi-class SVMs, an explicit and easy-to-compute fuzzy membership function is proposed, and decision functions for the refused and missed classifications are formulated. A simple, high-performance fuzzy multi-class SVM (FMSVM) is developed from the proposed fuzzy membership and fuzzy decision functions. Better recognition rates on small benchmark data sets, handwritten digit data, and deformed high-resolution range profiles of radar demonstrate the performance of the proposed FMSVM.

(3) SVMs often require an expensive design phase to choose adequate model parameters that attain high classification accuracy. Choosing optimal model parameters is one of the central open problems in SVM design and is at present usually done by trial and error. In this dissertation, a real-coded genetic algorithm (RGA) is proposed to determine the model parameters automatically, aiming to expedite the model selection process while keeping optimal generalization performance. Compared with the commonly used trial-and-error methods, the proposed RGA-based automatic model selection is simpler and easier to implement, and the generalization of the RGA-selected SVMs is much improved. Experiments on binary remote sensing images show that the approach performs automatic model selection with low error while providing significant savings in time.

(4) Choosing optimal model parameters for SVMs is an important step in SVM design and is usually done by minimizing an estimate of the generalization error or some other related performance measure. First, the real-coded genetic algorithm model selection is extended to an exponential-coded genetic algorithm (EGA) for automatic model selection, so that it can be applied to large data sets and large ranges of model parameters. Second, using the developed EGA-based automatic model selection, several commonly used, easy-to-compute model selection criteria, including the single validation estimate, the radius-margin bound, the support vector count, the testing error probability bound, the approximate span bound, and the risk bound, are compared and evaluated over a wide range of the parameter space. Results on benchmark data sets show that none of these criteria yields a uniformly favorable result for all SVMs; for SVMs with the L1 soft-margin formulation, however, the support vector count serves as a good model selection criterion and gives a very good prediction of the SVM generalization error.

(5) Based on the fact that SVMs are constructed by trading off the empirical risk against the VC confidence, an efficient generalization error estimate is proposed that explicitly incorporates the empirical risk into the conventional support vector count. To speed up the computation, only the first-category support vectors, those associated with small Lagrange multipliers, are counted in the proposed estimate. For a given generalization performance criterion, the optimal model parameters are obtained with the aforementioned EGA-based automatic model selection. Comparative results on benchmark data sets demonstrate the better performance and higher efficiency of the proposed estimate compared with the conventional radius-margin bound and support vector count. A minimal sketch of this kind of GA-driven model selection, with an SV-count-style criterion as the fitness, follows.
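The sketch below is a self-contained illustration of the GA-driven search outlined in items (3)-(5), not the dissertation's exact algorithm: a real-coded GA searches (log2 C, log2 gamma) for an RBF-kernel SVM, and the fitness is a criterion that adds the empirical risk to the fraction of unbounded support vectors. scikit-learn, the population and mutation settings, and the equal weighting of the two criterion terms are illustrative assumptions.

```python
# Real-coded GA over SVM hyperparameters, minimizing an SV-count-style
# criterion (empirical risk + fraction of support vectors with alpha_i < C).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
n = len(y)

LOW = np.array([-5.0, -15.0])    # lower bounds on (log2 C, log2 gamma)
HIGH = np.array([15.0, 3.0])     # upper bounds

def criterion(chrom):
    """Smaller is better: training error plus fraction of unbounded SVs."""
    C, gamma = 2.0 ** chrom
    model = SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)
    empirical_risk = 1.0 - model.score(X, y)
    alphas = np.abs(model.dual_coef_).ravel()
    unbounded = np.sum(alphas < C - 1e-8)    # count only SVs with alpha_i < C
    return empirical_risk + unbounded / n

pop = rng.uniform(LOW, HIGH, size=(20, 2))   # initial real-coded population
for _ in range(15):
    scores = np.array([criterion(c) for c in pop])
    parents = pop[np.argsort(scores)[:10]]   # truncation selection (best half)
    idx = rng.integers(0, 10, size=(10, 2))
    weight = rng.uniform(size=(10, 1))
    # Arithmetic crossover followed by Gaussian mutation, clipped to the box.
    children = weight * parents[idx[:, 0]] + (1 - weight) * parents[idx[:, 1]]
    children += rng.normal(scale=0.3, size=children.shape)
    pop = np.clip(np.vstack([parents, children]), LOW, HIGH)

best = min(pop, key=criterion)
print("selected log2 C, log2 gamma:", best)
```

Swapping the fitness for a cross-validation estimate, the radius-margin bound, or the plain support vector count reproduces the kind of criterion comparison described in item (4).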
Keywords/Search Tags: Statistical learning theory, support vector machines, pattern recognition, model selection, automatic model selection, generalization error estimate, fuzzy logic, genetic algorithm, exponential-coded genetic algorithm