The Research On Fine-grained Image Recognition Based On Two-stream Deep Reinforcement Learning

Posted on: 2021-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z Tang
Full Text: PDF
GTID: 2518306107952769
Subject: Electronics and Communications Engineering
Abstract/Summary:
Fine-grained image recognition is an important and challenging research area in computer vision. Whereas coarse-grained classification separates categories such as "bird" and "dog", fine-grained recognition makes finer distinctions, such as between "parrot" and "magpie". It has broad applications in both industry and academia. Fine-grained images are characterized by very small differences between categories; in general, only small local regions can be used to tell categories apart. Many existing methods learn to find discriminative regions and then perform recognition by cropping and enlarging these regions. Although encouraging performance has been achieved, some problems remain. First, these methods locate suitable regions in the texture domain only, without evaluating image information in the shape domain, so a large amount of valuable information is ignored. Second, the number of discriminative regions is specified in advance and does not adapt to the image content, which limits the effectiveness and flexibility of the model. These problems reduce recognition accuracy. To address them, this thesis proposes a “Two-stream Deep Reinforcement Learning (TDRL)” framework for fine-grained image recognition. The main contributions are summarized as follows:

(1) A method that combines information from both the shape and texture domains for fine-grained image recognition is proposed. Combining the two domains allows the model to make more reasonable predictions. To train a model for the shape domain, image transformation is used to erase the texture information of the original images, generating a new dataset that mainly contains shape information and corresponds to the original dataset. Using the two paired datasets, the model learns texture and shape information jointly.

(2) A method is proposed to find multiple optimal discriminative regions using reinforcement learning, which is well suited to modeling discrete-time sequential decision-making processes. In this DRL (Deep Reinforcement Learning) process, the Action, State, Reward, and Policy are defined. Deep reinforcement learning is then used to locate the discriminative regions in the image.

Finally, the two methods are integrated. The resulting framework uses reinforcement learning to find the most suitable discriminative regions in the shape and texture domains separately, and then combines the two domains to obtain the prediction. On the CUB-200-2011 dataset, the method reaches a recognition accuracy of 87.95%, which is 2.33% higher than the baseline model (ResNet50).
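
The abstract names the ingredients of the region search (State, Action, Reward, Policy) and a final fusion of the two streams, but does not define them. The following is only a minimal sketch of how such a sequential decision process could look, not the thesis's implementation. Every name in it is hypothetical: the state is assumed to be a square box `(x, y, side)`, the actions are assumed to be moves, a shrink, and a stop, the reward is assumed to be the sign of the change in a classifier score, and a simple greedy rule stands in for the learned policy. The `fuse` helper assumes late fusion by weighted averaging of class probabilities, which the abstract does not confirm.

```python
# Illustrative sketch only: all names and design choices are assumptions.
ACTIONS = ("left", "right", "up", "down", "shrink", "stop")


def iou(a, b):
    """Intersection over union of two square boxes given as (x, y, side)."""
    ax, ay, a_side = a
    bx, by, b_side = b
    w = min(ax + a_side, bx + b_side) - max(ax, bx)
    h = min(ay + a_side, by + b_side) - max(ay, by)
    inter = max(w, 0) * max(h, 0)
    return inter / (a_side * a_side + b_side * b_side - inter)


def step(box, action, image_size, stride=8):
    """State transition: move or shrink the box, clamped to the image."""
    x, y, side = box
    if action == "left":
        x -= stride
    elif action == "right":
        x += stride
    elif action == "up":
        y -= stride
    elif action == "down":
        y += stride
    elif action == "shrink" and side > stride:
        side -= stride          # shrink around the box centre
        x += stride // 2
        y += stride // 2
    x = min(max(x, 0), image_size - side)
    y = min(max(y, 0), image_size - side)
    return (x, y, side)


def reward(score_fn, old_box, new_box):
    """Assumed reward: +1 if the score improved, -1 otherwise."""
    return 1 if score_fn(new_box) > score_fn(old_box) else -1


def greedy_rollout(score_fn, image_size, max_steps=50):
    """Greedy stand-in for a learned policy: improve if possible, else stop."""
    box = (0, 0, image_size)  # initial state: the whole image
    for _ in range(max_steps):
        candidates = [step(box, a, image_size) for a in ACTIONS if a != "stop"]
        best = max(candidates, key=score_fn)
        if reward(score_fn, box, best) < 0:
            break  # no action helps, which plays the role of "stop"
        box = best
    return box


def fuse(texture_probs, shape_probs, weight=0.5):
    """Assumed late fusion: weighted average of per-class probabilities."""
    return [weight * t + (1 - weight) * s
            for t, s in zip(texture_probs, shape_probs)]
```

In this sketch `score_fn` would be the classifier's confidence on the cropped region; for a quick experiment it can be replaced by overlap with a known box, for example `greedy_rollout(lambda b: iou(b, (32, 40, 24)), 64)`. Because the episode ends whenever no action improves the score, its length varies from image to image, which is one way to avoid fixing the search in advance. A greedy rule can stall in local optima, which is part of the motivation for learning the policy instead.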
Keywords/Search Tags: fine-grained image recognition, discriminative regions, convolutional neural network, deep reinforcement learning