Research On SVM Classification Algorithm Based On Random Subspace

Posted on: 2017-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Wei
Full Text: PDF
GTID: 2358330482491344
Subject: Computer software and theory
Abstract/Summary:
Support vector machine (SVM) is a machine learning method built on a solid theoretical foundation, and it handles small-sample, nonlinear, high-dimensional and local-minimum problems well. As a promising classification technique, SVM has been widely used in data classification research. However, early SVM-based classification methods run into many problems when classifying very large data sets, and unbalanced data sets in particular, which seriously affects the computational efficiency and accuracy of the classification algorithm. The random subspace method addresses the challenge of feature selection: its idea is to choose valuable features from a large number of candidates, in order to reduce the feature dimension of the data set or to balance the distribution of features, and it contributes greatly to data preprocessing. This thesis therefore studies SVM classification algorithms based on random subspaces. The main work falls into the following two parts:

1. SVM algorithm based on the random feature subspace and the weighted kernel function
Combining the ideas of the random subspace and the kernel function, the thesis proposes an SVM algorithm based on the random feature subspace and the weighted kernel function. First, the ReliefF algorithm is used to calculate feature weights. Then, following the random feature subspace method, features are selected according to these weights. Finally, the selected features and their weights are used to construct the weighted kernel function, which reduces the computational complexity of the weighted kernel function. In the classification of balanced data sets, the method solves, to a certain extent, the low efficiency and low accuracy of the traditional SVM algorithm.

2. SVM-based imbalanced data classification algorithm
Combining re-sampling with stratified sampling, the thesis proposes an SVM-based imbalanced data classification algorithm. Building on the SVM, the algorithm first uses stratified sampling to select positive and negative samples so as to balance the underlying distribution of the sample features, and then uses re-sampling to balance the number of samples. The imbalance of the data set is thus addressed both in the underlying distribution of the sample features and in the sample sizes. In the classification of imbalanced data sets, the method avoids the common practice of considering only the imbalance in sample size while ignoring the imbalance in the underlying distribution of the features, which reduces the influence of data set imbalance on the SVM classifier.
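The three steps of the first part (feature weights, weight-guided random subspace, weighted kernel) can be illustrated with a minimal sketch. Everything concrete here is an assumption, since the abstract does not give the formulas: the weights are taken as already computed and non-negative (ReliefF itself is not implemented), features are drawn without replacement with probability proportional to their weight, and the kernel is an RBF kernel whose squared differences are scaled by the weights. The names `sample_subspace`, `weighted_rbf` and `gram_matrix` are hypothetical.

```python
import math
import random


def sample_subspace(weights, k, rng):
    """Draw k distinct feature indices, each with probability proportional to its weight.

    Assumes the weights are non-negative and that at least k of them are positive.
    """
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pool = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=pool, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


def weighted_rbf(x, z, features, weights, gamma=1.0):
    """RBF kernel over the selected features only (assumed form of the weighted kernel).

    Each squared difference is scaled by the weight of its feature.
    """
    d2 = sum(weights[j] * (x[j] - z[j]) ** 2 for j in features)
    return math.exp(-gamma * d2)


def gram_matrix(X, features, weights, gamma=1.0):
    """Kernel matrix for all pairs of rows in X."""
    return [[weighted_rbf(a, b, features, weights, gamma) for b in X] for a in X]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.5, 0.1, 0.9, 0.3]  # stand-in for ReliefF output
    X = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 0.0, 3.0]]
    features = sample_subspace(weights, 2, rng)
    print(features, gram_matrix(X, features, weights))
```

Because the kernel sums over the selected features only, its cost per pair of samples grows with the subspace size rather than the full dimension, which is one way to read the abstract's claim of reduced complexity. A matrix built this way could be handed to an SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.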
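The second part, stratified sampling followed by re-sampling, can be sketched in the same spirit. This is one possible reading, not the thesis's procedure: the majority class is reduced by drawing from each stratum in proportion to its size, and the minority class is then grown by random oversampling with replacement. The stratum function, the target size `n_keep`, and the names `stratified_sample`, `oversample` and `balance` are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(samples, stratum_of, n, rng):
    """Draw roughly n samples, taking from each stratum in proportion to its size."""
    strata = defaultdict(list)
    for s in samples:
        strata[stratum_of(s)].append(s)
    picked = []
    for members in strata.values():
        quota = max(1, round(n * len(members) / len(samples)))
        picked.extend(rng.sample(members, min(quota, len(members))))
    return picked


def oversample(samples, target, rng):
    """Random oversampling: append copies drawn with replacement up to target size."""
    extra = [rng.choice(samples) for _ in range(max(0, target - len(samples)))]
    return list(samples) + extra


def balance(minority, majority, stratum_of, n_keep, rng):
    """Stratified reduction of the majority class, then oversampling of the minority.

    Assumes n_keep is at least len(minority); otherwise the classes stay unequal.
    """
    majority_kept = stratified_sample(majority, stratum_of, n_keep, rng)
    minority_grown = oversample(minority, len(majority_kept), rng)
    return minority_grown, majority_kept


if __name__ == "__main__":
    rng = random.Random(0)
    majority = [(i % 3, i) for i in range(90)]  # first field acts as the stratum
    minority = [(0, i) for i in range(10)]
    small, large = balance(minority, majority, lambda s: s[0], 30, rng)
    print(len(small), len(large))
```

Sampling per stratum keeps the reduced majority class close to its original feature distribution, so the balancing step changes the class sizes without discarding a whole region of the feature space.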
Keywords/Search Tags: Support vector machine (SVM), Random subspace, Kernel function, Stratified sampling, Imbalanced data sets