Research On Deep Learning Based Fine-grained Visual Representations

Posted on: 2022-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Guo
Full Text: PDF
GTID: 2518306773467934
Subject: Automation Technology
Abstract/Summary:
In traditional image analysis research, the objects to be distinguished usually differ obviously, for example a person versus a car or a bird. Such differences are easily identified by an ordinary person. In the real world, however, we often face the challenge of distinguishing fine-grained categories within a coarse-grained class, such as Caspian Tern and Forster's Tern among birds. In general, fine-grained categories have small inter-class variation and large intra-class variation due to viewpoints, backgrounds, and other objective factors. The task is therefore challenging and requires expert knowledge. Fine-grained visual representation learning aims to use machine learning algorithms to learn discriminative features between similar categories for fine-grained visual recognition, detection, retrieval, and related tasks. By analyzing the characteristics of fine-grained visual tasks, this thesis explores fine-grained visual representation learning based on deep neural networks. The main work includes:

· A simple approach designed for fine-grained visual recognition is proposed. Previous fine-grained visual recognition models have not been considered from the perspective of data augmentation and data diversity. We propose a data augmentation method called Attentive Cutout, based on inverse transform sampling. By simulating the region response distribution of a randomly selected channel of the feature map, a region with high response values is sampled as an attention prior; this region is cut out and the result is used as new training data. The method drives the model to attend not only to the most important part but also to other parts, which improves the performance of the base model. Experimental results on four benchmarks show that the method is simple and competitive.

· An effective method is proposed that progressively samples discriminative parts for fine-grained visual recognition. Previous methods usually introduce extra trainable parameters and computational cost because they rely on a region proposal subnetwork or a region search to locate discriminative parts. To alleviate this drawback, a whole-to-details multi-stream method for fine-grained visual recognition is proposed. First, the object is located by aggregating feature activation maps; then, discriminative parts are extracted iteratively from the feature map by a progressive mechanism; finally, a three-stream framework is constructed by combining these with the original images. The computational cost of object and part localization is low, and no additional trainable parameters are introduced. Compared with similar methods, experimental results show that the proposed method effectively extracts discriminative parts and significantly improves efficiency and performance.
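The sampling step behind Attentive Cutout can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names, the `patch_frac` parameter, the fixed rectangular patch, and the choice to erase the sampled region with zeros are all assumptions made for illustration. The sketch picks a random channel of a feature map, treats its non-negative responses as an unnormalized distribution over spatial locations, draws one location by inverse transform sampling, and cuts out the corresponding image patch.

```python
import numpy as np


def sample_location(response, rng):
    """Draw one (row, col) with probability proportional to the response,
    using inverse transform sampling on the flattened map."""
    weights = np.clip(response, 0.0, None).ravel().astype(np.float64)
    total = weights.sum()
    if total <= 0.0:
        # Degenerate map with no positive response: fall back to uniform.
        weights = np.ones_like(weights)
        total = weights.sum()
    cdf = np.cumsum(weights) / total
    u = rng.random()  # uniform draw in [0, 1)
    # Invert the CDF: first index whose cumulative mass exceeds u.
    idx = int(np.searchsorted(cdf, u, side="right"))
    idx = min(idx, cdf.size - 1)  # guard against floating-point round-off
    return divmod(idx, response.shape[1])


def attentive_cutout(image, feature_map, patch_frac=0.25, rng=None):
    """Erase a patch of `image` (H, W, C) centred on a location sampled from
    a random channel of `feature_map` (C', H', W').

    Assumption: the patch is a rectangle covering `patch_frac` of each image
    side and is filled with zeros; the thesis may define the region differently.
    """
    rng = np.random.default_rng() if rng is None else rng
    channels, fh, fw = feature_map.shape
    channel = int(rng.integers(channels))
    row, col = sample_location(feature_map[channel], rng)

    height, width = image.shape[:2]
    # Centre of the sampled feature cell, mapped to image coordinates.
    cy = int((row + 0.5) * height / fh)
    cx = int((col + 0.5) * width / fw)
    ph = max(1, int(height * patch_frac))
    pw = max(1, int(width * patch_frac))
    # Clamp the patch so it stays inside the image.
    top = min(max(cy - ph // 2, 0), height - ph)
    left = min(max(cx - pw // 2, 0), width - pw)

    out = image.copy()
    out[top:top + ph, left:left + pw] = 0
    return out, (top, left, ph, pw)
```

In a training loop, `feature_map` would come from a forward pass of the backbone on the same image, and the returned image would be added to the batch with the original label. Because locations are sampled rather than taken as the argmax, high-response regions are erased most often but not always, which is one way to obtain the data diversity the abstract describes.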
Keywords/Search Tags:deep learning, fine-grained visual recognition, representation learning, attention mechanism