Research On Multi-instance Multi-label Active Learning Algorithms

Posted on: 2017-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: J L Li
GTID: 2308330485969623
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet and the spread of digital products, the text and image information on web pages is growing explosively, and its structural complexity and volume continue to increase. Such data often carries not one semantic meaning but several, and is therefore ambiguous. This ambiguity makes it difficult for a single-semantic learning framework to obtain good results. The multi-instance multi-label learning framework was proposed to address this ambiguity, and many real-world applications can be modeled with it. In recent years, multi-instance multi-label learning has become a new hotspot in machine learning research.

In a massive amount of data, only a small portion comes with labels and descriptions. To deal with the large amount of unlabeled data, active learning learns a classification model from a small number of labeled samples and a large number of unlabeled samples. Following a selection strategy, active learning iteratively selects the most valuable unlabeled samples. Each selected sample is then labeled and added to the training set, which effectively reduces the cost of labeling training samples. The classifier can thus achieve higher classification accuracy, and its performance can be improved, with fewer training samples.

In this thesis, active learning is applied to multi-instance multi-label data for the first time, and an active learning framework for multi-instance multi-label data is proposed. First, the thesis introduces the related multi-instance multi-label learning algorithms and active learning strategies. Considering the characteristics of multi-instance multi-label data, we decompose (degenerate) the multi-instance multi-label problem into a number of multi-instance single-label problems. For each of these problems, we put forward an evaluation criterion based on the labeled and unlabeled samples. According to the characteristics of multi-instance single-label learning, an active learning strategy is proposed for multi-instance single-label data. Specifically, we propose the label minimum classification distance and the label average classification distance, and design four different active learning strategies to select the most valuable unlabeled data. Finally, we apply these models to natural scene image classification and text classification. Experimental results show that the proposed multi-instance multi-label active learning algorithms obtain significantly better performance than the random selection method.
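The abstract does not define the two distance criteria, so the following Python sketch is only one plausible reading of the idea, not the thesis's actual method. It assumes a hypothetical linear scorer per label (one multi-instance single-label problem each), takes a bag's score for a label as the maximum over its instances, and treats the absolute bag score as that label's "classification distance" to the decision boundary. All function names, the toy models, and the toy pool are illustrative assumptions.

```python
from statistics import mean

def instance_score(weights, bias, instance):
    # Signed score of one instance under a hypothetical linear per-label model.
    return sum(w * x for w, x in zip(weights, instance)) + bias

def bag_score(model, bag):
    # Assumption: a bag is scored by its highest-scoring instance.
    weights, bias = model
    return max(instance_score(weights, bias, inst) for inst in bag)

def label_distances(models, bag):
    # Assumption: "classification distance" = |bag score| for each label.
    return [abs(bag_score(model, bag)) for model in models]

def min_label_distance(models, bag):
    # Smallest distance over all labels (most uncertain single label).
    return min(label_distances(models, bag))

def avg_label_distance(models, bag):
    # Mean distance over all labels (overall uncertainty of the bag).
    return mean(label_distances(models, bag))

def select_query(models, unlabeled_bags, criterion):
    # Return the index of the unlabeled bag with the smallest criterion value.
    return min(range(len(unlabeled_bags)),
               key=lambda i: criterion(models, unlabeled_bags[i]))

# Toy setup: two labels, each with a (weights, bias) pair; three unlabeled bags.
models = [([1.0, 0.0], 0.0), ([0.0, 1.0], -1.0)]
pool = [
    [[2.0, 3.0], [1.5, 2.5]],     # far from both boundaries
    [[0.1, 1.2], [-0.3, 0.2]],    # fairly close to both boundaries
    [[0.05, 0.0], [-0.5, -0.2]],  # very close to one boundary, far from the other
]

query_by_min = select_query(models, pool, min_label_distance)
query_by_avg = select_query(models, pool, avg_label_distance)
```

In this toy pool the two criteria pick different bags: the minimum-distance criterion favors the bag that is highly uncertain for a single label, while the average-distance criterion favors the bag that is moderately uncertain across all labels. In a full active learning loop, the selected bag would be labeled, moved to the training set, and the per-label models retrained before the next selection.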
Keywords/Search Tags: Active Learning, Multi-instance Multi-label Learning, Natural Scene Image Classification, Text Classification