
The Research On SVM Kernel Selection Based On The Characteristics Of Data Distribution

Posted on: 2008-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: J L Guo
Full Text: PDF
GTID: 2178360242969432
Subject: Systems Engineering
Abstract/Summary:
Support vector machines (SVMs) are a class of learning machines that has received wide attention in recent years. Grounded in statistical learning theory (SLT), the SVM has several merits, including a concise mathematical form, standard fast training algorithms, and excellent generalization performance, so it has been widely applied to data mining problems such as pattern recognition, function estimation, and time series prediction. Current hot topics in SVM research include model selection and fast learning algorithms. Because the SVM is a kernel-based learning machine, the choice of kernel parameters strongly affects its generalization ability and therefore its performance. How to select the kernel parameters is thus an important issue in SVM research. This thesis systematically investigates the selection of the kernel function and its associated parameters for SVMs. The main contributions are as follows:
(1) It analyzes the existing kernel functions and kernel selection methods for SVMs.
(2) It develops four algorithms for determining the data distribution. Starting from the dataset and the geometric meaning of the classification problem, four determination algorithms are proposed, for circular, annular (ring-shaped), spherical, and cylindrical distributions. These provide the basis for the subsequent kernel selection approach based on data distribution.
(3) It proposes a way to select the kernel function and its parameters based on the data distribution. Most existing kernel selection methods do not consider the characteristics of the data distribution and do not make full use of the information in the dataset. Selecting the kernel function and its parameters according to the data distribution can improve the generalization ability of the SVM.
(4) It provides a way to select the optimal kernel function once the data distribution has been determined. The coordinates-transform kernel, polynomial kernel, and Gaussian kernel are tested on artificial and real datasets.
The research in this thesis addresses one of the key problems in SVM research. The results have both important theoretical significance and direct application value for real-world problems.
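For reference, the polynomial and Gaussian kernels named in the abstract have standard textbook forms, sketched below in plain Python. This is only an illustrative sketch: the parameter names (`degree`, `coef0`, `sigma`) and the helper `gram_matrix` are assumptions and are not taken from the thesis. The coordinates-transform kernel is not defined in the abstract, so it is left out.

```python
import math


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Textbook polynomial kernel: K(x, y) = (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def gaussian_kernel(x, y, sigma=1.0):
    """Textbook Gaussian (RBF) kernel: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, kernel):
    """Hypothetical helper: pairwise kernel values for a list of points."""
    return [[kernel(p, q) for q in points] for p in points]


if __name__ == "__main__":
    # Three 2-D sample points chosen only for illustration.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
    for row in gram_matrix(pts, gaussian_kernel):
        print(row)
```

The kernel parameters (`degree`, `coef0`, `sigma`) are the kind of quantities whose selection the abstract describes as affecting generalization ability.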
Keywords/Search Tags:Statistical learning theory, Support vector machine, Kernel selection, Data distribution, Coordinates-transform kernel, Classification