
Research on Classification Methods of Semi-Supervised Support Vector Machines

Posted on: 2015-12-06  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Chen  Full Text: PDF
GTID: 2208330434951415  Subject: Computer application technology
Abstract/Summary:
The support vector machine (SVM), proposed by Vapnik, is a machine learning method based on statistical learning theory and the principle of structural risk minimization. It addresses several difficulties of earlier machine learning methods, such as model selection, nonlinearity, the curse of dimensionality, local minima, and overfitting. As a result, it has been widely applied to classification and regression problems in recent years. However, the SVM is a traditional supervised classification method, and labeled samples are costly and inefficient to obtain. Semi-supervised learning uses a large number of unlabeled samples alongside the labeled ones, so researchers combined the idea of semi-supervised learning with the SVM and proposed the semi-supervised support vector machine. The semi-supervised SVM is still a relatively new theory in machine learning; it is immature and imperfect in many respects and needs further study and improvement. Starting from two shortcomings, namely its high time and space complexity and its inability to handle large-scale data classification problems (such as image classification) effectively, this thesis carries out two studies to exploit the potential and advantages of the semi-supervised SVM more fully. The main work is as follows:
(1) A semi-supervised SVM image classification method based on FCM pre-selected samples.
LapSVM-based semi-supervised classification randomly selects a certain number of unlabeled samples, adds them to the training set, and then trains the classifier. The more unlabeled samples are added, the stronger the generalization ability of the classifier, but the memory and CPU time needed for training also grow sharply with the number of samples. When classifying large-scale data, the problem is therefore how to select as few unlabeled samples as possible while preserving classifier performance and reducing the time and space complexity of the algorithm. To this end, a semi-supervised SVM image classification method based on FCM (fuzzy c-means) pre-selected samples is proposed. The method clusters the unlabeled samples with the FCM algorithm and, according to the clustering result, adds to the training set those unlabeled samples that lie near the optimal separating hyperplane. These samples may be support vectors carrying information useful for classification, and they make up only a small part of the unlabeled samples, so the training set shrinks and the time and space complexity of the algorithm falls. Simulation results show that the method exploits the discriminative information inherent in the unlabeled samples and maintains classification accuracy while effectively reducing time and space complexity.
(2) A semi-supervised classification method based on the Mean Map cluster kernel and the least squares support vector machine.
To use the information contained in unlabeled samples to improve classification accuracy when few labeled samples are available, a semi-supervised classification method based on the Mean Map cluster kernel and the least squares support vector machine (LS-SVM) is proposed. The method modifies the kernel function according to the cluster assumption, that is, samples in the same cluster are more likely to share the same class label. It clusters all samples with the k-means algorithm, constructs the Mean Map cluster kernel from the clustering result, and combines this kernel with an existing kernel function (the RBF kernel in this thesis) to form a new kernel function. The new kernel thus captures both the similarity between clusters (through the Mean Map cluster kernel) and the similarity between samples (through the existing kernel), which brings the information in the unlabeled samples into the classifier. Using the LS-SVM turns the quadratic programming problem of the standard SVM into a system of linear equations, which reduces the time complexity of the algorithm. Experiments show that the proposed algorithm makes full use of the information in unlabeled samples and effectively improves classifier performance.
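The pre-selection step of method (1) can be sketched in Python with NumPy. The abstract does not state the exact selection rule, so the sketch assumes one possible proxy: an unlabeled sample whose two largest FCM memberships are nearly equal lies between clusters and is treated as close to the separating hyperplane. The function names, the `fraction` parameter, and the synthetic two-blob data are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means. Returns (centers, U), U has shape (n_samples, n_clusters)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m                               # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                 # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def preselect_boundary_samples(U, fraction=0.2):
    """Assumed rule: keep the samples whose top two memberships are closest."""
    top2 = np.sort(U, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]             # small margin = ambiguous sample
    k = max(1, int(round(fraction * U.shape[0])))
    return np.argsort(margin)[:k]

# Synthetic stand-in for unlabeled image features (assumption for illustration)
rng = np.random.default_rng(0)
X_unlabeled = np.vstack([rng.normal(-2.0, 1.0, (100, 2)),
                         rng.normal(2.0, 1.0, (100, 2))])
centers, U = fuzzy_c_means(X_unlabeled, n_clusters=2)
selected = preselect_boundary_samples(U, fraction=0.1)
print(len(selected), "of", len(X_unlabeled), "unlabeled samples kept")
```

Only the selected samples would then be added to the training set, which is where the savings in memory and CPU time would come from.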
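The kernel construction and the LS-SVM step of method (2) can be sketched as follows. The abstract gives no formulas, so the sketch makes several assumptions: the cluster kernel of two samples is taken to be the RBF inner product between the feature-space means of their k-means clusters, the two kernels are combined by an equal-weight sum, and the LS-SVM is solved in its usual classifier form as one linear system. All function names, the weights, `gamma`, `C`, and the synthetic data are hypothetical choices for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Basic Lloyd iterations; returns a cluster index for every sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):              # keep the old center if a cluster empties
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def mean_map_cluster_kernel(K, labels, k):
    """K_c[i, j] = base-kernel inner product of the cluster means of samples i and j."""
    Z = np.zeros((len(labels), k))
    Z[np.arange(len(labels)), labels] = 1.0      # one-hot cluster indicators
    sizes = np.maximum(Z.sum(axis=0), 1.0)       # guard against empty clusters
    Zn = Z / sizes
    M = Zn.T @ K @ Zn                            # (k, k) similarities between cluster means
    return Z @ M @ Z.T

def lssvm_fit(K, y, C=10.0):
    """Solve the LS-SVM linear system instead of a quadratic program."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(n))))
    return sol[0], sol[1:]                       # bias b, multipliers alpha

# Synthetic data: 200 samples, of which only six labels are assumed known
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.concatenate([-np.ones(100), np.ones(100)])
labeled = np.array([0, 1, 2, 100, 101, 102])

clusters = kmeans_labels(X, k=4)                 # clustering uses all samples
K_rbf = rbf_kernel(X, X)
K_cluster = mean_map_cluster_kernel(K_rbf, clusters, k=4)
K = 0.5 * K_rbf + 0.5 * K_cluster                # assumed equal-weight combination

b, alpha = lssvm_fit(K[np.ix_(labeled, labeled)], y[labeled])
scores = K[:, labeled] @ (alpha * y[labeled]) + b
accuracy = np.mean(np.sign(scores) == y)
print(f"accuracy on all samples: {accuracy:.2f}")
```

The unlabeled samples enter only through the clustering and therefore through `K_cluster`; training itself touches just the rows and columns of the labeled samples, so the linear system stays small.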
Keywords/Search Tags: support vector machines, semi-supervised learning, least squares support vector machine, pre-selected samples, fuzzy c-means clustering, image classification