
Construction Method For Training Data Set In Classification Algorithm Of Support Vector Machines

Posted on: 2010-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: X Yu
Full Text: PDF
GTID: 2178360278466678
Subject: Computer application technology
Abstract/Summary:
Support vector machine (SVM) is a machine learning method proposed by Vapnik and other scholars. Grounded in statistical learning theory and optimization theory, it is an implementation of the structural risk minimization principle from statistical learning theory. In practical applications, however, the data to be processed is often massive or imbalanced, so improving SVM's ability to handle complicated data and broadening its range of application has become an active research topic. The work in this thesis addresses this problem and can be summarized as follows.

First, for classification problems in which the training samples are imbalanced and some classes contain very few samples, a novel method for constructing virtual samples based on the Gaussian distribution is introduced. Because the method is grounded in the theory of the Gaussian distribution, the correctness of the virtual samples can be assured on the one hand, while on the other hand the method makes full use of prior knowledge and can be applied to many kinds of classification problems. The method is incorporated into the SVM algorithm and evaluated experimentally on the Iris and KDD Cup 99 data sets. The results show that it makes full use of prior knowledge and constructs a sufficient number of correct virtual samples, so classification precision is improved effectively.

Second, to address the slow running of SVM algorithms when processing massive data, a novel method for pre-extracting support vectors based on improved vector projection is introduced, building on a deep investigation of the characteristics of support vectors. In the linearly separable case, the method uses the best projection vector, obtained by the Fisher linear discriminant algorithm, instead of the central vector.
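The Gaussian virtual-sample idea above can be sketched roughly as follows. This is a minimal illustration, not the thesis's exact procedure: it assumes the virtual samples are drawn from a Gaussian fitted to the minority class, and the function name and regularization constant are invented for the sketch.

```python
import numpy as np

def make_virtual_samples(X_minority, n_new, rng=None):
    """Fit a Gaussian to the minority-class samples and draw
    n_new virtual samples from it (assumed interpretation of
    the Gaussian-distribution construction)."""
    rng = np.random.default_rng(rng)
    mu = X_minority.mean(axis=0)
    # Regularize the covariance so sampling works even when the
    # class has very few samples.
    cov = np.cov(X_minority, rowvar=False) + 1e-6 * np.eye(X_minority.shape[1])
    return rng.multivariate_normal(mu, cov, size=n_new)

# Usage: augment a 5-sample minority class with 45 virtual samples.
rng = np.random.default_rng(0)
X_min = rng.normal(loc=[1.0, 2.0], scale=0.3, size=(5, 2))
X_virtual = make_virtual_samples(X_min, n_new=45, rng=1)
print(X_virtual.shape)  # (45, 2)
```

The augmented set (real plus virtual samples) would then be fed to an ordinary SVM trainer; because the virtual samples follow the estimated class distribution, they encode the prior knowledge of the minority class rather than arbitrary noise.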
In the nonlinearly separable case, the method uses the true central vector instead of the approximate central vector in the feature space. By selecting a more rational projection vector, the method can use fewer margin vectors in place of the original vectors during training while keeping the classification effect good. It greatly reduces the number of training samples and speeds up training. Experimental results also demonstrate the validity and feasibility of the method.
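The projection-based pre-extraction can be sketched as follows for the linearly separable case. This is a rough sketch under stated assumptions: the Fisher direction is computed from the regularized within-class scatter, and samples whose projections lie closest to the opposite class are kept as candidate support vectors; the function name and the `keep_ratio` parameter are inventions for illustration, not the thesis's notation.

```python
import numpy as np

def preextract_candidates(X1, X2, keep_ratio=0.3):
    """Pre-select likely support vectors by projecting each class
    onto the Fisher linear discriminant direction and keeping the
    samples nearest the other class."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter, regularized so it is invertible.
    Sw = np.cov(X1, rowvar=False) + np.cov(X2, rowvar=False)
    Sw += 1e-6 * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, mu1 - mu2)  # Fisher discriminant direction
    p1, p2 = X1 @ w, X2 @ w
    k1 = max(1, int(keep_ratio * len(X1)))
    k2 = max(1, int(keep_ratio * len(X2)))
    # Class 1 projects to larger values along w, so its margin
    # samples have the smallest projections; vice versa for class 2.
    idx1 = np.argsort(p1)[:k1]
    idx2 = np.argsort(p2)[::-1][:k2]
    return X1[idx1], X2[idx2]

# Usage: two well-separated Gaussian clouds of 100 points each.
rng = np.random.default_rng(0)
Xa = rng.normal([2.0, 2.0], 0.5, size=(100, 2))
Xb = rng.normal([-2.0, -2.0], 0.5, size=(100, 2))
Ca, Cb = preextract_candidates(Xa, Xb)
print(len(Ca), len(Cb))  # 30 30
```

Only the retained candidates would then be passed to SVM training, which is where the speed-up comes from: training cost grows quickly with sample count, while the discarded interior points are unlikely to be support vectors.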
Keywords/Search Tags: statistical learning theory, support vector machine, Gaussian distribution, vector projection, virtual samples