Research On Semi-supervised Multi-label Learning Algorithm Based On Tri-training Algorithm

Posted on: 2015-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Liu
GTID: 2308330461483885
Subject: Computer application technology
Abstract/Summary:
Multi-label learning is an important research direction in machine learning because it captures the multiple semantic meanings that an ambiguous object can carry. Its task is to predict the set of labels that belongs to an unlabeled sample. In recent years, researchers have proposed many multi-label learning methods and strategies and applied them to web document classification, image scene classification, bioinformatics and other practical fields. However, traditional multi-label learning is still supervised learning, which requires a sufficient number of labeled samples, and in practice a training set with enough labeled samples is difficult to obtain. To address this deficiency, this thesis adopts the Co-training mechanism and studies how to accomplish the multi-label learning task with it. Co-training is an important paradigm of semi-supervised learning: it uses a few labeled samples together with abundant unlabeled samples to improve learning performance. The main work is summarized in the following three aspects.

(1) Under the "first-order" strategy, the multi-label learning problem is decomposed into multiple independent binary (single-label) classification problems, ignoring the correlation between labels. Combining this decomposition with the Tri-training algorithm yields a semi-supervised first-order multi-label learning algorithm. For each label, the Tri-training process is applied to the labeled samples to obtain three corresponding classifiers. For a new test sample, the three classifiers of each label vote on whether that label applies, and the per-label votes together form the predicted label set.
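The first-order procedure can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several simplifications are assumptions made here: the base learner is a hand-written 1-nearest-neighbour classifier, the three classifiers get their initial diversity from deterministic index skipping rather than the bootstrap sampling Tri-training normally uses, the error-rate checks of Tri-training are omitted, and the data are toy values. All function and class names are hypothetical.

```python
class NearestNeighbor:
    """Tiny binary base learner (an assumption): 1-nearest-neighbour."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_one(self, x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


def tri_train_binary(X_lab, y_lab, X_unl, rounds=2):
    """Simplified Tri-training for one label; returns three classifiers."""
    models = []
    for i in range(3):
        # Assumption: classifier i skips labeled samples with index % 3 == i
        # (a deterministic stand-in for bootstrap sampling).
        keep = [n for n in range(len(X_lab)) if n % 3 != i]
        models.append(NearestNeighbor().fit([X_lab[n] for n in keep],
                                            [y_lab[n] for n in keep]))
    for _ in range(rounds):
        updated = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            X_aug, y_aug = list(X_lab), list(y_lab)
            for x in X_unl:
                vote_j = models[j].predict_one(x)
                # The other two classifiers teach classifier i
                # only on samples where they agree.
                if vote_j == models[k].predict_one(x):
                    X_aug.append(x)
                    y_aug.append(vote_j)
            updated.append(NearestNeighbor().fit(X_aug, y_aug))
        models = updated
    return models


def fit_first_order(X_lab, Y_lab, X_unl, rounds=2):
    """One independent Tri-training run per label (label correlation ignored)."""
    n_labels = len(Y_lab[0])
    return [tri_train_binary(X_lab, [row[q] for row in Y_lab], X_unl, rounds)
            for q in range(n_labels)]


def predict_first_order(label_models, x):
    """Majority vote of the three classifiers, label by label."""
    return [1 if sum(m.predict_one(x) for m in trio) >= 2 else 0
            for trio in label_models]


# Toy data (assumption): label 0 is on when x[0] is high, label 1 when x[1] is high.
X_lab = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
         (0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
Y_lab = [[0, 0], [1, 0], [0, 1], [1, 1],
         [0, 0], [1, 0], [0, 1], [1, 1]]
X_unl = [(0.2, 0.1), (0.8, 0.2), (0.2, 0.8), (0.8, 0.9)]

label_models = fit_first_order(X_lab, Y_lab, X_unl)
print(predict_first_order(label_models, (0.85, 0.15)))
```

Each label gets its own trio of classifiers, so the cost grows linearly with the number of labels, and no information is shared between labels. That is the trade-off of the first-order strategy.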
Experimental results on UCI datasets, web document classification datasets and natural scene classification datasets show that the proposed first-order algorithm achieves better classification results.

(2) Under the "second-order" strategy, which takes the correlation between any two labels into account, the multi-label learning problem is transformed into a label ranking problem. Combining this transformation with the Tri-training algorithm yields a semi-supervised second-order multi-label learning algorithm. In the learning stage, the algorithm adds a virtual label to the original label set and uses Tri-training to train the classifiers for each pair of labels. In the prediction stage, these classifiers vote on a new sample; the labels are ranked by their votes, and the vote count of the virtual label serves as the threshold that determines the final label set. Experimental results on UCI datasets, web document classification datasets and natural scene classification datasets show that the proposed second-order algorithm also achieves better classification results.

(3) A semi-supervised multi-label learning system based on the Tri-training algorithm is developed with MATLAB, the C# programming language and a SQL Server database, with an attractive interface, friendly human-computer interaction and easy updating as design goals. The system integrates several classic multi-label learning algorithms, including ML-kNN, Rank-SVM, LEAD and TRAM, together with the proposed first-order and second-order algorithms based on Tri-training.
In practical use, the system offers a concise interface and a convenient environment in which users can pursue theoretical innovation and experimental comparison.

By incorporating the Co-training mechanism, this thesis provides a reference for how to carry out semi-supervised learning effectively in the multi-label setting and explores the overall performance of multi-label classification algorithms under different strategies.
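The virtual-label thresholding used by the second-order algorithm in item (2) can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's code. The way the pairwise training sets are built is assumed, since the abstract does not specify it: a sample is used for a pair of real labels only when exactly one of them is relevant, and every sample is used for a pair involving the virtual label. Each pairwise learner is a single 1-nearest-neighbour rule, where the thesis would use Tri-training. Ties with the virtual label count as irrelevant. All names and data are hypothetical.

```python
from itertools import combinations


def nearest_label(X, y, x):
    """1-nearest-neighbour rule, standing in for a trained pairwise classifier."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X]
    return y[dists.index(min(dists))]


def build_pairwise_sets(X, Y):
    """Map each label pair (j, k) to (samples, winners).

    Index n_labels denotes the virtual label, which is assumed to rank below
    every relevant label and above every irrelevant one.
    """
    n_labels = len(Y[0])
    virtual = n_labels
    pairs = {}
    for j, k in combinations(range(n_labels + 1), 2):
        samples, winners = [], []
        for x, row in zip(X, Y):
            if k == virtual:
                # Real label versus virtual label: every sample is informative.
                samples.append(x)
                winners.append(j if row[j] == 1 else virtual)
            elif row[j] != row[k]:
                # Two real labels: keep samples where exactly one is relevant.
                samples.append(x)
                winners.append(j if row[j] == 1 else k)
        pairs[(j, k)] = (samples, winners)
    return pairs


def count_votes(pairs, x, n_labels):
    """Let every pairwise classifier cast one vote for its preferred label."""
    votes = [0] * (n_labels + 1)
    for samples, winners in pairs.values():
        if samples:  # skip pairs that received no training data
            votes[nearest_label(samples, winners, x)] += 1
    return votes


def predict_second_order(pairs, x, n_labels):
    """A label is predicted relevant when it beats the virtual label's votes."""
    votes = count_votes(pairs, x, n_labels)
    threshold = votes[n_labels]
    return [1 if votes[q] > threshold else 0 for q in range(n_labels)]


# Toy data (assumption): three labels on four corner points.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
Y = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]]

pairs = build_pairwise_sets(X, Y)
print(count_votes(pairs, (0.9, 0.1), 3))
print(predict_second_order(pairs, (0.9, 0.1), 3))
```

Because the threshold is itself learned as one more label, the number of predicted labels can differ from sample to sample without a separately tuned cut-off.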
Keywords: Multi-label learning, Semi-supervised learning, Co-training, Tri-training algorithm