Research On Semi-supervised Learning Classification Algorithm

Posted on: 2018-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Chen
GTID: 2348330533958792
Subject: Agricultural Electrification and Automation
Abstract/Summary:
Machine learning has become an important way for computers to acquire knowledge and an important indicator of artificial intelligence. Traditional machine learning techniques require a large number of labeled samples for training. In many practical applications, however, labeled samples are difficult to obtain in large numbers, while unlabeled samples are much easier to collect. Semi-supervised learning, which needs only a small number of labeled samples, has therefore attracted great attention in pattern recognition and machine learning. This thesis focuses on clustering and classification in semi-supervised learning. The main work is as follows.

First, based on the co-training theory of semi-supervised learning, a support vector machine (SVM) classification algorithm combined with co-training is proposed. The algorithm learns from the labeled samples with two different SVM classifiers and then uses each of them to predict the unlabeled samples. A mutual verification step keeps only the predictions with high confidence, and the classifiers are updated on the expanded labeled set. This method simplifies the learning process while maintaining recognition accuracy. Experiments on UCI datasets, combined with the DAG-SVMs multi-class strategy, show that the algorithm achieves high classification accuracy when few labeled samples are available. Finally, the algorithm is applied to the classification of pupylation sites in prokaryotic proteins, where it performs well.

Second, semi-supervised learning cannot effectively correct the learner when the initial labeled sample set is too small. To address this problem, a self-training SVM classification algorithm based on cluster analysis is proposed. The algorithm uses semi-supervised fuzzy c-means clustering to extract the structure of the whole sample set, and then uses a self-training SVM to classify the samples. A secondary screening step reduces the probability of error. Considering the special characteristics of time series, a supervised reconstruction algorithm based on the structural learning principle is also proposed to perform dimensionality reduction and feature extraction on the original time series. Finally, the effectiveness of the algorithm is verified on UCR datasets, and the algorithm is applied to edge effect detection in chemical cytotoxicity assessment experiments, with good results.
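The co-training loop with mutual verification described in the abstract can be illustrated with a small sketch. This is not the thesis implementation. To keep the example dependency-free, two nearest-centroid classifiers with different distance metrics stand in for the two SVMs, and the function names (`fit_centroids`, `predict`, `co_train`), the confidence measure, and the `threshold` value are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def fit_centroids(X, y):
    """Stand-in learner (assumption, not an SVM): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(model, x, dist):
    """Return (label, confidence). Confidence is the relative margin between
    the two nearest centroids, so the model needs at least two classes."""
    ranked = sorted((dist(x, c), label) for label, c in model.items())
    (d1, label), (d2, _) = ranked[0], ranked[1]
    return label, (d2 - d1) / (d1 + d2 + 1e-12)


def co_train(X_lab, y_lab, X_unl, threshold=0.3, max_rounds=10):
    """Grow the labeled set with unlabeled points that both classifiers
    label identically and confidently (mutual verification)."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_rounds):
        model = fit_centroids(X_lab, y_lab)
        accepted, remaining = [], []
        for x in pool:
            label_a, conf_a = predict(model, x, euclidean)
            label_b, conf_b = predict(model, x, manhattan)
            if label_a == label_b and min(conf_a, conf_b) >= threshold:
                accepted.append((x, label_a))
            else:
                remaining.append(x)
        if not accepted:  # nothing passed verification; stop early
            break
        for x, label in accepted:
            X_lab.append(x)
            y_lab.append(label)
        pool = remaining
    return fit_centroids(X_lab, y_lab), pool


# Toy data: one labeled point per class, five unlabeled points.
X_lab = [(0.0, 0.0), (4.0, 4.0)]
y_lab = [0, 1]
X_unl = [(0.5, 0.2), (0.1, 0.9), (3.8, 4.1), (4.4, 3.6), (2.0, 2.0)]

model, leftover = co_train(X_lab, y_lab, X_unl)
print(leftover)  # → [(2.0, 2.0)]
```

The ambiguous point halfway between the two classes never passes the confidence filter, so it stays unlabeled instead of contaminating the training set. With real SVMs, the two classifiers would typically differ in kernel or feature view rather than in distance metric.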
Keywords/Search Tags:Semi-supervised learning, Data driven, Self-training, Co-training