Font Size: a A A

The Research On Collaboration-training Algorithm And Its Application

Posted on:2014-01-17Degree:MasterType:Thesis
Country:ChinaCandidate:J L XuFull Text:PDF
GTID:2248330398958293Subject:Management Science and Engineering
Abstract/Summary:PDF Full Text Request
Since the computer was invented, people always wanted to know if they can learn bythemselves. How to make a computer simulation human’s learning behavior, to acquire newknowledge and skills, optimize the existing knowledge structure and improve its performanceconstantly, has become one of the problems people most concerned and researched on.Machine learning is an important branch areas of artificial intelligence, it’s results have greatrealistic meaning. It is widely used in expert systems, automated reasoning, natural languageunderstanding, pattern recognition, computer vision, intelligent robots and other fields.The classic machine learning algorithms can be divided into three kinds, supervisedlearning, unsupervised learning and semi-supervised learning. In the supervised learning, wethrough learning the characteristics of given data set and corresponding results to establishpredictive models, it is used to predict the results of unknown data. But when the trainingexample is insufficient, classifiers which are trained are very lack of generalization ability. Inthe unsupervised learning, we only have a large number of data features, while nocorresponding results. We can only use the given sample information clustering the sampleson the overall. This leads to the unsupervised learning difficult to achieve a higher accuracyrate. But in reality, we are difficult to obtain enough number of labeled samples. In this case,using the conventional machine learning strategy is difficult to obtain a classifier with enoughgeneralization ability and classification accuracy at the same time.Semi-supervised learning is a machine learning strategy which proposed in recentyears.It makes up the defects of supervised learning and unsupervised learning.According towork mode of the semi-supervised learning algorithm,the semi-supervised learning algorithmcan be divided into three kinds: expectation-maximization algorithm,method based on graphand co-training.These algorithms use labeled samples and unlabeled samplessimultaneously,they can improve learning performance using unlabeled samples.This papermainly introduces the co-training algorithm in a semi-supervised learning,and use it.Co-training algorithm has the characteristics of easy to understand, stable and fastconvergence.It suitable to solve the classification problems of multi-dimensional data. Theco-training algorithm is a classification algorithm with higher artistic value.In this paper, the standard collaborative training algorithm is introduced firstly, andimproved in some details. And then, a semi-supervised neural network algorithm is proposed.It uses a few labeled to train initial neural network classifiers and makes use of the largenumber unlabeled data to promote the classifier iteratively. Experiments on UCI show thatusing the tri-training can improve the classification accuracy of neural network algorithmobviously. At the same time, through using different training functions of neural network, thealgorithm reinforced the differences between classifiers. It further improve the performance ofco-training algorithm,and solve the problem of difference is small between differentclassifiers in co-training algorithm.The simulation experiments show that the improvedalgorithm retained the excellent characteristics of the co-training,and the final classificationeffect is better.When using SVM to solve the classification problems, the number of trainsamples determines the dimension of computation directly.When solving large-scale dataclassification problems,SVM have a high time-consuming and a low classificationinefficiency.A coordinate support vector machine algorithm based on data division proposedin this paper. This algorithm divide the large-scale data into several small data sets,thosesmall data sets are mutually redundant.Classifiers are obtained by SVM training on eachsmall data set, the final classification result is obtained by using the strategy of co-training.Atthe same time, through using different kernel functions of SVM, the algorithm reinforced thedifferences between classifiers.On one hand, it enhances the efficiency of SVMalgorithm,on the other hand, it improves the performance of co-training algorithm.Thecontrast test prove that the collaborative SVM algorithm based on data division has a higherclassification efficiency and higher classification accuracy.
Keywords/Search Tags:Semi-supervised Learning, Collaboration-training Algorithm, Neural network, Support vector machines, Tri-training Algorithm
PDF Full Text Request
Related items