
Research on Semi-supervised Support Vector Machine Learning Algorithms

Posted on: 2011-08-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhao
GTID: 1118330332459895
Subject: Computer application technology
Abstract/Summary:
The support vector machine (SVM) is a machine learning method aimed at small-sample problems, developed by Vapnik and others on the basis of statistical learning theory. SVMs have recently been widely studied and applied because of their strong generalization ability and their convenience in handling high-dimensional data. Although traditional classification methods based on supervised learning can solve many practical problems effectively, they require large amounts of unlabeled data to be labeled in order to obtain enough training samples, which makes them costly and inefficient. Classification methods based on semi-supervised learning were therefore proposed to meet practical needs. These methods classify automatically (or semi-automatically) using a mixed set of labeled and unlabeled data, improving efficiency while expanding the application scope of the algorithm. However, the semi-supervised support vector machine (S3VM) is a new theory in the field of machine learning that is still immature and imperfect in many respects and needs further study and improvement. This thesis studies S3VM algorithms from three perspectives: two-class learning, benchmark learning, and multi-class learning, with the aim of bringing out the strengths and potential of S3VMs.

First, a semi-supervised learning algorithm based on the least squares support vector machine (SLS-SVM) is proposed to address the high computational cost and complexity of S3VM algorithms. SLS-SVM takes LS-SVM as its learning model and combines it with the idea of semi-supervised learning, exploiting the fast training speed and high efficiency of LS-SVM. An area labeling principle is used to label the unlabeled samples iteratively, and in each iteration SLS-SVM is trained on a set containing both the labeled and the semi-labeled data.
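
The iterative scheme described for SLS-SVM can be illustrated with a minimal self-training sketch. It is not the thesis's algorithm: the abstract does not define the area labeling principle, so the rule used here (pseudo-label an unlabeled point when the absolute decision value reaches `threshold`) is an assumption, as are the RBF kernel, the parameters `gamma` and `sigma`, and all function names. The LS-SVM is fitted in its standard linear-system form, solved with plain Gaussian elimination.

```python
import math

def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points (kernel choice is an assumption)."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma=10.0):
    """Fit an LS-SVM on labels in {-1, +1} by solving
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    pts = list(X)  # copy so later changes to X do not affect the model
    n = len(pts)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(pts[i], pts[j]) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + [float(v) for v in y])
    b, alpha = sol[0], sol[1:]
    return lambda x: sum(a * rbf(x, p) for a, p in zip(alpha, pts)) + b

def self_train(X_lab, y_lab, X_unl, threshold=0.5, max_iter=10):
    """Hypothetical labeling rule: pseudo-label points with |f(x)| >= threshold,
    then retrain on labeled plus pseudo-labeled data."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unl)
    f = lssvm_fit(X, y)
    for _ in range(max_iter):
        scored = [(x, f(x)) for x in pool]
        confident = [(x, s) for x, s in scored if abs(s) >= threshold]
        if not confident:
            break  # nothing left that the current model is confident about
        for x, s in confident:
            X.append(x)
            y.append(1 if s > 0 else -1)
        pool = [x for x, s in scored if abs(s) < threshold]
        f = lssvm_fit(X, y)
    return f, X, y

# Toy data: one labeled point per class plus a few unlabeled points.
X_lab = [(0.0, 0.0), (4.0, 4.0)]
y_lab = [-1, 1]
X_unl = [(0.5, 0.2), (0.1, 0.6), (3.8, 4.1), (4.3, 3.7), (2.0, 2.0)]
f, X_all, y_all = self_train(X_lab, y_lab, X_unl)
```

Because each LS-SVM fit is a single linear solve rather than a quadratic program, retraining inside the loop stays cheap, which is the property the abstract attributes to LS-SVM.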
Experiments on artificial and real datasets show that the training accuracy of SLS-SVM is higher than that of the standard SVM training algorithm, which to a certain extent reflects the advantage of using unlabeled training samples in semi-supervised learning.

Second, an improved branch-and-bound learning algorithm for semi-supervised support vector machines (IBBS3VM) is proposed, motivated by the fact that local optimization causes different S3VMs to reach markedly different solutions on the same data set. IBBS3VM redefines the lower bound of a node: the value of a pseudo-dual function is used as the lower bound, which avoids the heavy computation of 0-1 quadratic programming and reduces the time complexity of computing each node's lower bound. In addition, branch nodes are chosen based only on the credibility of the unlabeled samples, without repeatedly training support vector machines, which increases the training speed of the algorithm. Simulation analysis shows that IBBS3VM trains faster than the BBS3VM algorithm and achieves higher precision and stronger robustness than the other semi-supervised support vector machines. A parallel branch-and-bound semi-supervised support vector machine algorithm (PBBS3VM) is then presented to extend the scope of IBBS3VM. Simulation results show that PBBS3VM achieves good speedup and improves learning efficiency.

Finally, a semi-supervised support vector data description multi-classification algorithm (S3VDD-MC) is presented to address learning with few labeled data, as well as the implementation difficulty and poor results of semi-supervised multi-classification. It makes full use of the distribution information of non-target samples.
The S3VDD-MC algorithm defines a degree of membership for non-target samples in order to assign them accepted or refused labels. On this basis, several hyperspheres are constructed, and a k-class classification problem is transformed into k SVDD problems. The simulation results verify the effectiveness of the algorithm.
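
The idea of describing each of k classes by its own hypersphere can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not S3VDD-MC itself: a true SVDD obtains each sphere by solving an optimization problem, whereas this sketch substitutes the class centroid and the largest distance to it. The `membership` formula, the `accept` threshold, the treatment of unlabeled points as the samples to accept or refuse, and all function names are hypothetical.

```python
import math

def fit_sphere(points):
    """Crude stand-in for SVDD: centroid as center, max distance as radius."""
    dim = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(center, p) for p in points)
    return center, max(radius, 1e-9)  # guard against a zero radius

def membership(x, sphere):
    """Hypothetical membership degree: 1 at the center, 0.5 on the sphere
    boundary, 0 at twice the radius and beyond."""
    center, radius = sphere
    return max(0.0, 1.0 - math.dist(x, center) / (2.0 * radius))

def train(labeled, unlabeled, accept=0.5):
    """labeled maps class name -> list of points. Each unlabeled point is
    accepted into its best-matching class if its membership reaches
    `accept`, otherwise refused; the spheres are then refitted."""
    spheres = {c: fit_sphere(pts) for c, pts in labeled.items()}
    grown = {c: list(pts) for c, pts in labeled.items()}
    for x in unlabeled:
        best = max(spheres, key=lambda c: membership(x, spheres[c]))
        if membership(x, spheres[best]) >= accept:
            grown[best].append(x)
    return {c: fit_sphere(pts) for c, pts in grown.items()}

def predict(x, spheres):
    """Assign x to the class whose sphere is nearest relative to its radius."""
    return min(spheres, key=lambda c: math.dist(x, spheres[c][0]) / spheres[c][1])

# Toy three-class example with one unlabeled point far from every class.
labeled = {
    "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "b": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
    "c": [(0.0, 5.0), (1.0, 5.0), (0.0, 6.0)],
}
unlabeled = [(0.5, 0.5), (5.5, 5.5), (0.5, 5.5), (3.0, 3.0)]
spheres = train(labeled, unlabeled)
```

With k classes the sketch fits k independent spheres, mirroring the reduction of a k-class problem to k one-class descriptions; the point `(3.0, 3.0)` has zero membership in every sphere and is refused.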
Keywords/Search Tags: Semi-supervised Learning, Statistical Learning Theory, Support Vector Machines, Branch-and-Bound, Support Vector Data Description, Multi-classification