Research On Deep Learning Based Fine-grained Image Analysis

Posted on: 2019-01-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Wei
GTID: 1318330545475614
Subject: Computer Science and Technology
Abstract/Summary:
In traditional computer vision research, the target objects in image analysis usually belong to coarse-grained categories such as "dog", "car" and "bird". However, in many real-world applications, the target objects belong to fine-grained categories that share one common coarse-grained category. For example, some images belong to "Husky" and others to "Alaska", and so on, yet all of these images come from the same coarse-grained category, i.e., "dog". Fine-grained image analysis is the research direction that focuses on this kind of task, and it is an active topic in computer vision and pattern recognition. Its goals include localizing the fine-grained objects in images, recognizing their fine-grained categories, and retrieving fine-grained objects. Fine-grained image analysis is valuable in diverse applications such as biological research and biodiversity protection. It is challenging, however, because of the small inter-class variations caused by highly similar subordinate categories and the large intra-class variations in pose, scale and rotation. This dissertation studies several important issues in fine-grained image analysis, and the main results are summarized as follows.

1. A simple but effective approach, SCDA, is proposed for the fine-grained image retrieval task. Most existing content-based image retrieval methods focus on either landmark images or generic images and do not address fine-grained images. We propose SCDA, the first deep learning based fine-grained image retrieval method. Using a pre-trained deep convolutional neural network, SCDA first localizes the main object in a fine-grained image, a step that discards the noisy background and keeps the useful deep descriptors. The selected descriptors are then aggregated and reduced in dimensionality into a short feature vector using the best practices we found. Experiments confirm the effectiveness of SCDA for fine-grained image retrieval and object localization.

2. A simple but effective approach, DDT, is proposed, which accurately locates the common fine-grained object in a set of unlabeled images. To further improve the accuracy of unsupervised fine-grained object localization, we show that it is better to use the information contained in a whole image set to perform object co-localization. This differs significantly from SCDA, which uses only the information from a single image. DDT uses a pre-trained classification model to extract deep convolutional descriptors, then evaluates the correlations among the descriptors to obtain category-consistent regions (i.e., to localize the common objects). Empirical studies validate the effectiveness of DDT. On benchmark image co-localization datasets, DDT consistently outperforms existing state-of-the-art methods by a large margin. Moreover, DDT generalizes well to unseen categories and is robust to noisy data.

3. An effective approach, Mask-CNN, is proposed for the fine-grained image recognition task. Most existing fine-grained image recognition methods directly use the deep convolutional descriptors and encode them into a single representation, without evaluating the usefulness of the obtained object/part descriptors. Based on part annotations, Mask-CNN uses a fully convolutional network both to locate the discriminative parts (e.g., head and torso) and, more importantly, to generate weighted object/part masks for selecting useful and meaningful convolutional descriptors. A three-stream Mask-CNN model is then built to aggregate the selected object-level and part-level descriptors simultaneously. Experiments validate the advantages of Mask-CNN in both effectiveness and efficiency.

4. A few-shot approach, PCM, is proposed, tailored to few-shot fine-grained image recognition. Most existing deep learning based fine-grained image recognition methods depend on large amounts of data (i.e., labeled fine-grained images). When data is limited, it is hard to obtain satisfactory recognition accuracy, and training sometimes fails altogether. PCM consists of a bilinear feature learning module and a classifier mapping module: the former encodes the discriminative information of an exemplar image into a feature vector, while the latter maps the intermediate feature into the decision boundary of the novel category. We learn the exemplar-to-classifier mapping on an auxiliary dataset in a meta-learning fashion. Experimental results validate the effectiveness of PCM on the challenging few-shot fine-grained image recognition task.
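
The SCDA step described in item 1 (keep the useful deep descriptors, then aggregate them into a short vector) can be illustrated with a minimal sketch. The abstract does not state the selection rule or the aggregation scheme, so the sketch makes its own assumptions: a position is kept when its channel-summed activation exceeds the mean over the map, and the kept descriptors are combined by average pooling and max pooling. The name `select_and_aggregate` and the `FeatureMap` alias are hypothetical, no network is run (the input stands in for the output of a pre-trained convolutional layer), and the dimensionality reduction step is omitted.

```python
from typing import List

# Stand-in for a convolutional feature map, indexed as [row][col][channel].
FeatureMap = List[List[List[float]]]


def select_and_aggregate(fmap: FeatureMap) -> List[float]:
    """Keep descriptors at strongly activated positions, then pool them."""
    # Activation map: sum each spatial position's descriptor over channels.
    act = [[sum(desc) for desc in row] for row in fmap]
    n_pos = sum(len(row) for row in act)
    mean_act = sum(sum(row) for row in act) / n_pos

    # Assumed selection rule: keep positions whose activation exceeds the mean.
    selected = [
        desc
        for row, act_row in zip(fmap, act)
        for desc, a in zip(row, act_row)
        if a > mean_act
    ]
    if not selected:
        # Degenerate (e.g. constant) map: fall back to every position.
        selected = [desc for row in fmap for desc in row]

    # Assumed aggregation: concatenate average-pooled and max-pooled channels.
    channels = list(zip(*selected))  # one tuple of values per channel
    avg_pool = [sum(c) / len(c) for c in channels]
    max_pool = [max(c) for c in channels]
    return avg_pool + max_pool


# Toy 2x2 map with 2 channels: the top row is weak "background",
# the bottom row is the strongly activated "object".
toy_map = [
    [[0.0, 0.0], [0.1, 0.1]],
    [[1.0, 3.0], [3.0, 1.0]],
]
feature = select_and_aggregate(toy_map)
```

In the toy map only the two bottom-row descriptors pass the threshold, so the resulting vector has twice as many entries as the map has channels. In a retrieval setting such vectors would typically be compared by a similarity measure such as cosine similarity, which is likewise an assumption here.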
Keywords/Search Tags: deep learning, convolutional neural networks, fine-grained images, image retrieval, image recognition, few-shot learning