
Active Learning For Image Classification

Posted on: 2016-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y J Gu
Full Text: PDF
GTID: 1108330482969739
Subject: Pattern Recognition and Intelligent Systems

Abstract/Summary:
Image classification is a research highlight in computer vision and pattern recognition. It plays a significant role in intelligent transportation, security monitoring, robot navigation, and other applications. To achieve good performance in image classification, a large number of labeled images is needed to train a robust model that can predict the labels of unlabeled images. In real-world applications, however, labeled images are scarce while unlabeled images are easy to obtain, and labeling images by hand is laborious. To reduce the cost of manual labeling, Active Learning (AL) was adopted for image classification.

The main idea of active learning is as follows: given a large number of unlabeled samples, select a small number of the most informative and representative samples according to some strategy and submit them for manual labeling. With these labeled samples, a model can be trained that predicts the remaining unlabeled samples as precisely as possible. The key technique in active learning is how to select the most informative samples, i.e., those that improve the model the most.

This dissertation focuses on active learning for image classification. Several active learning algorithms have been proposed, and experimental results demonstrate their effectiveness. The main work and innovations of the dissertation include the following aspects.

Firstly, based on Optimal Experimental Design (OED) and the assumption of neighbor reconstruction between samples, a novel active learning algorithm called Neighborhood Preserving D-Optimal Design (NPDOD) was proposed. Classical OED algorithms are based on least-square errors over the labeled samples only, while the unlabeled samples are ignored. Inspired by Locally Linear Embedding (LLE), it is assumed that each sample's label can also be reconstructed from its neighbors' labels. NPDOD simultaneously minimizes the least-square error on the labeled samples and the neighbor reconstruction error over all samples.
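The dissertation's exact objective is not reproduced here, but the flavor of an NPDOD-style selection can be illustrated with a minimal sketch: build an LLE-style neighbor reconstruction matrix, fold it into a regularized information matrix, and greedily pick the samples that most increase its log-determinant (the D-optimality criterion). The function names, the uniform neighbor weights, and the regularization weights `lam` and `mu` are illustrative assumptions, not the thesis's formulation.

```python
import numpy as np

def lle_reconstruction_matrix(X, k=5):
    """Build M = (I - W)^T (I - W), where W reconstructs each sample
    from its k nearest neighbors (LLE-style). Simplifying assumption:
    uniform neighbor weights 1/k instead of solved LLE weights."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                      # exclude the sample itself
        nbrs = np.argsort(dist)[:k]
        W[i, nbrs] = 1.0 / k
    I = np.eye(n)
    return (I - W).T @ (I - W)

def npdod_select(X, m, lam=0.1, mu=1e-3, k=5):
    """Greedy D-optimal-style selection: pick m samples that most
    increase the log-determinant of the regularized information
    matrix, which couples neighbor reconstruction over ALL samples
    into the design."""
    n, d = X.shape
    M = lle_reconstruction_matrix(X, k)
    A = mu * np.eye(d) + lam * X.T @ M @ X    # d x d, positive definite
    selected = []
    for _ in range(m):
        _, base = np.linalg.slogdet(A)
        best, best_gain = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            _, val = np.linalg.slogdet(A + np.outer(X[i], X[i]))
            if val - base > best_gain:
                best_gain, best = val - base, i
        selected.append(best)
        A += np.outer(X[best], X[best])       # commit the chosen sample
    return selected
```

The selected samples would then be labeled and used to fit the regression model; greedy log-det maximization is a standard surrogate because evaluating every size-m subset exactly is combinatorial.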
The samples that minimize the variance of the resulting model are expected to be the most informative, so they are selected for labeling and training.

Secondly, a dynamic-programming-based multi-criteria active learning algorithm was proposed. Traditional active learning considers only one criterion in sample selection, such as uncertainty or density, while the redundancy between samples is neglected. For the case without initial labeled samples, a sampling strategy with Maximum Density and Minimum Redundancy (MDMR) was proposed. For the case with some initial labeled samples, a novel Active Learning algorithm combining samples' Uncertainty and Diversity (AL.UD) was proposed, in which the samples with large uncertainty and diversity are selected for labeling. Both methods combine two criteria, and the sample selection problem is transformed into a dynamic programming problem.

Besides, a multi-criteria active learning algorithm based on Quadratic Programming (QP) and Submodular Functions (SF) was proposed. The method evaluates samples' uncertainty, density, and redundancy during sample selection, and a sampling model was formulated that can be approximately solved by either QP or SF methods. In the QP method, the Augmented Lagrange Multiplier (ALM) method was adopted to find the solution more efficiently. In the submodular-function-based method, a greedy algorithm was used; a classical theorem on monotone submodular functions guarantees the approximation quality of the greedy algorithm.

Lastly, a semi-supervised active learning algorithm was proposed. Active learning usually uses only the labeled samples to train classifiers, while the unlabeled samples are neglected. In semi-supervised learning, the labeled samples are given in advance and may not be informative. This dissertation therefore proposed to combine active learning and semi-supervised learning.
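As an illustration of the submodular-function route described above (not the dissertation's exact model), the sketch below greedily maximizes a monotone submodular score that combines a modular uncertainty term with a facility-location coverage term; the coverage term rewards dense regions and penalizes redundancy, since a sample similar to already-selected ones adds little new coverage. For such objectives the greedy rule carries the classical (1 - 1/e) approximation guarantee. The weights `alpha`, `beta` and the function name are hypothetical.

```python
import numpy as np

def greedy_multicriteria(sim, uncertainty, m, alpha=1.0, beta=1.0):
    """Greedy maximization of
        F(S) = alpha * sum_{i in S} uncertainty[i]
             + beta  * sum_j max_{i in S} sim[i, j],
    a monotone submodular objective when sim and uncertainty are
    non-negative. sim is an n x n similarity matrix."""
    n = sim.shape[0]
    coverage = np.zeros(n)   # current max similarity to the selected set
    selected = []
    for _ in range(m):
        # Marginal gain of each candidate: its uncertainty plus the
        # extra coverage it adds beyond what is already covered.
        gains = alpha * uncertainty + beta * np.maximum(sim - coverage, 0).sum(axis=1)
        gains[selected] = -np.inf           # forbid re-selection
        i = int(np.argmax(gains))
        selected.append(i)
        coverage = np.maximum(coverage, sim[i])
    return selected
```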
Based on Learning with Local and Global Consistency (LLGC), an active learning algorithm that minimizes the expected classification risk on the unlabeled samples was proposed.
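To make the last idea concrete, here is a minimal sketch of LLGC label propagation in its closed form, F* = (1 - alpha) (I - alpha S)^{-1} Y with S the symmetrically normalized affinity matrix, followed by a simple least-confidence query rule standing in for the expected-risk criterion. The dissertation's risk criterion is more elaborate; `select_by_risk` and the parameter choices are illustrative simplifications.

```python
import numpy as np

def llgc_propagate(W, Y, alpha=0.9):
    """LLGC closed form: F* = (1 - alpha) * (I - alpha * S)^{-1} Y,
    where S = D^{-1/2} W D^{-1/2} and Y holds one-hot labels
    (zero rows for unlabeled samples)."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    n = W.shape[0]
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y)

def select_by_risk(F, unlabeled):
    """Query the unlabeled sample whose propagated label distribution
    is least confident -- a simple stand-in for minimizing the
    expected classification risk."""
    P = F[unlabeled]
    P = P / np.maximum(P.sum(axis=1, keepdims=True), 1e-12)
    confidence = P.max(axis=1)
    return unlabeled[int(np.argmin(confidence))]
```

The queried sample is labeled, added to Y, and propagation is rerun, so the active learner and the semi-supervised classifier share the same graph.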
Keywords/Search Tags: active learning, image classification, sampling strategy, optimal experimental design, uncertainty sampling, semi-supervised learning