Research On Fine-Grained Image Analysis Based On Machine Learning

Posted on: 2023-05-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 1528306845951589
Subject: Software engineering
Abstract/Summary:
The target objects of traditional image analysis belong to coarse-grained categories (e.g., bird, car, aircraft, and flower). In real life, however, target objects usually belong to fine-grained sub-categories of these super-categories (e.g., "Boeing 737-200" and "Boeing 737-500" are sub-categories of "Aircraft"). Fine-grained image analysis is an active topic in computer vision and pattern recognition, and its goal is to identify, detect, and segment fine-grained objects in images. It is more challenging than traditional image analysis for three reasons: large intra-class variations combined with small inter-class variations, the lack of fine-grained datasets covering complex scenes, and the difficulty of obtaining fine-grained segmentation datasets. This dissertation focuses on the recognition and segmentation tasks in fine-grained image analysis. The research contents are summarized as follows.

(1) Fine-grained image recognition in simple scenes based on data characteristics. Objects in most existing fine-grained datasets (e.g., Swedish, ICL, Orchid, CUB-200-2011, and Stanford Cars) tend to occupy a large portion of the image and appear against relatively clear backgrounds. Such datasets are called fine-grained image datasets with simple scenes. Traditional fine-grained image recognition methods usually verify their performance on the Swedish and ICL leaf datasets. Because these two datasets are dominated by pure white backgrounds, such methods often use hand-designed shape feature descriptors to represent the leaves. However, these descriptors usually struggle to capture local details and global information of leaves at the same time, which makes it difficult to distinguish leaf species with small inter-class variations. To address this, a global shape feature descriptor and a margin feature descriptor are designed to capture the global information and the local details of leaves, respectively. Experimental results show that the proposed method achieves better classification performance than existing traditional leaf recognition methods. In addition, the hierarchical structure of category labels is a major feature of fine-grained data, yet most existing methods train with labels of a single granularity level and therefore ignore a hierarchy that could help differentiate visual objects. The hierarchical structure encodes rich contextual information into the network structure to enhance the discriminability of features, and it captures multi-granularity semantic information to enrich the features. To this end, a multi-task learning framework, named Hierarchical Bilinear Convolutional Neural Network, is developed by integrating a CNN with multi-task learning over the hierarchical visual concept structures.

(2) Fine-grained image recognition in complex scenes based on location guidance. Most existing fine-grained image recognition methods are evaluated on fine-grained datasets with simple scenes, and their designs usually do not consider complex scenes. We introduce a new benchmark dataset for fine-grained image recognition in complex scenes, named AIBD-Cars, and develop an approach that enables fine-grained image recognition in such scenes. Specifically, an approach named Automatic Learning Method (ALM) is proposed to localize objects. The object is then cropped according to the localization result to reduce the negative influence of the background. Finally, the cropped object image is fed to a fine-grained classification network for recognition. Experimental results show that ALM achieves better detection accuracy than existing weakly-supervised object detection methods, and that its detection performance is comparable to that of fully-supervised object detection methods on multiple datasets. The proposed method also achieves better performance than existing fine-grained classification methods on the AIBD-Cars dataset.

(3) Automatic generation of pixel-level labels for fine-grained semantic segmentation. Most existing semantic segmentation methods focus on generic images and rarely consider how to segment fine-grained image categories effectively. The main reason is the lack of fine-grained semantic segmentation datasets. Manually labeling large-scale training images at the pixel level is extremely labor-intensive, and it is also very difficult for humans to provide consistent labeling quality. To this end, we study a new approach that automatically generates object masks with detailed pixel-level structures and boundaries, enabling semantic image segmentation of thousands of real-world targets without manual labeling. Specifically, we first collect images containing the target category from public datasets or the Google Image Search engine. These images are used to pre-train our proposed guided filter network. The pre-trained guided filter network then carries out further iterative learning on unlabeled target data. Under the same experimental conditions, the proposed method achieves better segmentation performance than existing weakly-supervised, semi-supervised, and domain adaptation methods, and the generated pixel-level labels are comparable to human-annotated labels.

(4) Class guided channel weighting network for fine-grained semantic segmentation. Based on the automatically generated pixel-level labels, a new approach, named Class Guided Channel Weighting Network, is developed to enable fine-grained semantic segmentation. Fine-grained semantic segmentation mainly suffers from large intra-class variations and small inter-class variations. For the large intra-class variations, we propose a Class Guided Weighting module, which learns image-level fine-grained category probabilities by exploiting second-order feature statistics and uses them as global information to guide semantic segmentation. For the high similarity between different sub-categories, we build a Channel Relationship Attention module to amplify the distinction between features. Furthermore, a Detail Enhanced Guided Filter module is proposed to refine the boundaries of object masks by using an edge contour cue extracted from the enhanced original image. Experimental results show that the proposed method achieves state-of-the-art results on six fine-grained image segmentation datasets.
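
The idea of deriving image-level category probabilities from second-order feature statistics can be illustrated with a minimal NumPy sketch. It assumes the common bilinear pooling recipe (a channel Gram matrix, followed by a signed square root and L2 normalization) and a plain linear softmax classifier. The abstract does not specify the dissertation's actual formulation, so this is only an illustration, and the names `second_order_pool` and `class_probabilities` are hypothetical.

```python
import numpy as np


def second_order_pool(features: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Pool a (C, H, W) feature map into a C*C second-order descriptor.

    Assumed recipe (not taken from the dissertation): Gram matrix over
    spatial positions, signed square root, then L2 normalization.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)            # one column per spatial position
    gram = x @ x.T / (h * w)                  # (C, C) channel co-activation statistics
    vec = gram.reshape(-1)                    # flatten to a C*C vector
    vec = np.sign(vec) * np.sqrt(np.abs(vec)) # signed square root
    return vec / (np.linalg.norm(vec) + eps)  # L2 normalization


def class_probabilities(descriptor: np.ndarray,
                        weights: np.ndarray,
                        bias: np.ndarray) -> np.ndarray:
    """Map a descriptor to category probabilities with a linear layer and softmax."""
    logits = weights @ descriptor + bias
    logits = logits - logits.max()            # subtract the max for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()


# Toy usage with random values standing in for CNN features and learned weights.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 4))      # hypothetical 8-channel feature map
descriptor = second_order_pool(feature_map)       # length 8 * 8 = 64
probs = class_probabilities(descriptor,
                            rng.standard_normal((3, 64)),  # 3 hypothetical sub-categories
                            np.zeros(3))
```

In a segmentation network, a probability vector like `probs` could serve as the global, image-level signal that reweights per-pixel predictions; how the Class Guided Weighting module actually does this is not described in the abstract.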
Keywords/Search Tags: Convolutional neural networks, Fine-grained image recognition, Object detection, Fine-grained semantic segmentation, Guided filter