Classification is one of the important tasks in machine learning. Its goal is to train a classifier on a labeled dataset and then use the learned classifier to predict the labels of new, unlabeled examples. In traditional classification problems, each example is represented by a single instance, and that instance belongs to exactly one label. In many real-world applications, however, one example is associated with multiple labels, which is referred to as the ambiguity problem. Simply applying the traditional supervised single-label learning framework cannot fundamentally solve such problems. To reveal all the semantic information implied in ambiguous objects, one straightforward approach is to assign an appropriate subset of labels to each object. The multi-label learning framework was proposed to handle this ambiguity.

However, most multi-label algorithms are supervised. To the best of our knowledge, supervised multi-label algorithms require abundant labeled examples, and obtaining large-scale labeled examples is both time-consuming and laborious. By contrast, unlabeled examples are plentiful in the real world and much easier to collect. Hence, semi-supervised learning, which studies how to improve a learner's performance using a limited number of labeled examples together with plenty of unlabeled examples, is one of the core research fields in machine learning and pattern recognition.

This thesis focuses mainly on multi-label image classification algorithms. Its main contributions are as follows. Firstly, a multi-label learning method based on co-training, inspired by the idea of semi-supervised regression, is proposed. Experimental results on the Scene and Yeast datasets show the feasibility and effectiveness of the proposed method.
Secondly, active learning is studied as a special form of semi-supervised learning. An online multi-label image classification algorithm based on active learning is adopted to address two issues in multi-label image classification: insufficient training examples and low retraining efficiency. Tests on multiple datasets verify the efficiency of the algorithm. Thirdly, ML-SSAIC (Multi-label Semi-Supervised Active Image Classification), which combines semi-supervised co-training with active learning that queries informative and representative examples, is introduced to handle multi-label tasks. The classifier built by ML-SSAIC can use a limited number of valuable labeled examples together with abundant unlabeled examples to construct a classification model effectively. Empirical studies on multiple real-world multi-label learning datasets validate the superiority of ML-SSAIC on five commonly used evaluation metrics. ML-SSAIC uses as few valuable examples as possible to train a classifier at lower cost but with better performance, making full use of the available examples.