
Application of Support Vector Machine Classification With a Small Number of Labeled Samples

Posted on: 2012-07-10 | Degree: Master | Type: Thesis
Country: China | Candidate: P Cheng | Full Text: PDF
GTID: 2208330335471173 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of science and technology and the sharp growth in storage capacity, collecting a large number of unlabeled samples is no longer difficult, but obtaining a large number of labeled samples remains particularly hard. How to use only a small number of labeled samples together with a large number of unlabeled samples to improve learning performance and classification accuracy has become an active research topic in machine learning.

The support vector machine (SVM) is a machine learning method that followed the artificial neural network (ANN); its theoretical basis is statistical learning theory and the structural risk minimization principle. Compared with traditional machine learning algorithms, it replaces the explicit nonlinear mapping from sample space to feature space with a kernel function, which raises the dimensionality of the data implicitly and better addresses the problem of a small training error accompanied by a large test error. Consequently, SVM has emerged in recent years as a new technique in machine learning and is widely used for classification and regression problems.

The goal of SVM is to find the optimal hyperplane that separates the classes in feature space, and the key is to maximize the classification margin. The final result depends on the support vectors rather than the non-support vectors, and the computational complexity depends on the number of support vectors rather than on the dimension of the sample space. Because a few support vectors determine the final classification result, the method both identifies the key labeled samples and discards many redundant ones, which makes it simple and robust. If the potential support vectors are selected in advance, the number of labeled samples needed can be reduced. This speeds up the quadratic programming step of SVM training and, more importantly, shrinks the set of samples that must be labeled.

Based on these considerations, this thesis presents two SVM classification algorithms for a small number of labeled samples: an active support vector machine based on similarity fusion, and an algorithm based on support vector clustering and classifying.

(1) Active support vector machine based on similarity fusion. Active learning selects the samples most helpful for improving the classifier and uses them to train it, which effectively reduces both the number of training samples required and the cost of labeling them. The proposed method makes full use of both unlabeled and labeled samples and combines them with SVM to realize active learning. Experiments show that, compared with the general active SVM (ASVM), it effectively reduces the number of labeled samples and suppresses the influence of isolated samples while preserving the correctness of the classifier, and it achieves higher classification accuracy for the same number of labeled samples.

(2) Algorithm based on support vector clustering and classifying. To further improve the prediction accuracy of support vector clustering and reduce the time and space complexity of cluster assignment, a semi-supervised learning algorithm based on support vector clustering and classifying (SVC-C) is proposed. SVC-C replaces the cluster assignment stage with support vector machine classification and classifies using a small number of labeled samples and a large number of unlabeled samples, which greatly reduces the manual labeling workload and lowers the complexity of solving the quadratic program in SVM classification. Experiments show that the method has clear advantages over both support vector clustering and support vector machine classification.
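
The general active-SVM loop that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's similarity-fusion method: the query rule is plain margin sampling (ask for the label of the unlabeled sample closest to the current hyperplane), a linear SVM trained by batch subgradient descent stands in for a kernel SVM solved by quadratic programming, and the data set, the function names (`decision`, `train_linear_svm`, `active_learn`) and all parameter values are hypothetical.

```python
import random

def decision(w, b, x):
    """Signed score w.x + b; its magnitude is proportional to the distance from the hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, iters=200):
    """Batch subgradient descent on the L2-regularised hinge loss (labels in {-1, +1})."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(iters):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:  # sample lies inside or beyond the margin
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def active_learn(X, oracle, labeled, n_queries):
    """Repeatedly retrain, then query the unlabeled sample closest to the hyperplane."""
    labeled = list(labeled)
    for _ in range(n_queries):
        w, b = train_linear_svm([X[i] for i in labeled], [oracle(i) for i in labeled])
        pool = [i for i in range(len(X)) if i not in labeled]
        query = min(pool, key=lambda i: abs(decision(w, b, X[i])))
        labeled.append(query)  # the oracle (a human annotator) will label this sample
    w, b = train_linear_svm([X[i] for i in labeled], [oracle(i) for i in labeled])
    return w, b, labeled

# Synthetic two-class data, assumed purely for illustration.
rng = random.Random(0)
X = [[rng.gauss(c, 0.5), rng.gauss(c, 0.5)] for c in (-2.0, 2.0) for _ in range(50)]
y_true = [-1] * 50 + [1] * 50

# Start from one labeled sample per class and spend a budget of 8 label queries.
w, b, labeled = active_learn(X, lambda i: y_true[i], labeled=[0, 50], n_queries=8)
accuracy = sum((1 if decision(w, b, x) >= 0 else -1) == t for x, t in zip(X, y_true)) / len(X)
print(f"labels used: {len(labeled)} of {len(X)}, accuracy: {accuracy:.2f}")
```

Margin sampling tends to pick samples that are likely to become support vectors, which is why a small labeling budget can be enough. It is also sensitive to isolated samples near the boundary, the weakness the abstract says the similarity-fusion variant is meant to suppress.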
Keywords/Search Tags: Labeled samples, Machine learning, Active learning, Support vector machine classification, Support vector clustering, Similarity fusion