Question-Guided Attention Reasoning Mechanism For Visual Question Answering

Posted on:2021-05-15

Degree:Master

Type:Thesis

Country:China

Candidate:X Wan

Full Text:PDF

GTID:2428330623468547

Subject:Engineering

Abstract/Summary:

The combined impact of new computing techniques and increasingly large datasets is transforming the research direction of many fields. Deep learning techniques are now widely used in both natural language processing (NLP) and computer vision (CV). On some single-modal tasks, deep learning models even exceed human performance. As a result, multi-modal tasks such as visual question answering (VQA) have attracted the attention of many researchers. Given an image and a question about that image, a VQA model needs to understand and fuse the information from the two modalities and then generate an answer. Existing approaches improve a model's reasoning ability by stacking attention mechanisms, without considering the guiding role of the question in the answering process. We therefore propose a question-guided visual reasoning cell, which uses a memory to store the image information needed. First, a command generation module generates a command from the question. Second, a visual attention mechanism extracts the visual regions related to the command. Third, the cell's memory is updated with the extracted regions. Experimental results on the VQA2.0 dataset show that our model outperforms several fusion-based techniques for VQA.

Although a visual attention mechanism focuses on salient regions of the image, it is insufficient for understanding the relationships between objects, which is often required to answer complicated questions. In this thesis, we use a question-guided graph attention network to capture contextual information between the objects in an image. Each node, which represents an object, is updated through iterative message passing conditioned on the command produced by the command generation module. Our approach outperforms attention-mechanism methods on the VQA2.0 and GQA datasets, and it also outperforms several state-of-the-art techniques.
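The abstract does not give the equations behind the reasoning cell or the graph attention network, so the following is only a minimal sketch of the three steps (command generation, command-guided visual attention, memory update) and of one round of command-conditioned message passing. It assumes plain dot-product attention, a fixed blending gate of 0.5, a fully connected object graph, and random toy feature vectors; all function names are hypothetical and are not taken from the thesis.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a single vector."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]


def generate_command(word_vectors, memory):
    """Step 1 (assumed form): attend over question words given the memory."""
    weights = softmax([dot(w, memory) for w in word_vectors])
    return weighted_sum(weights, word_vectors)


def attend_regions(command, region_vectors):
    """Step 2 (assumed form): attend over image regions given the command."""
    weights = softmax([dot(r, command) for r in region_vectors])
    return weighted_sum(weights, region_vectors), weights


def update_memory(memory, attended, gate=0.5):
    """Step 3 (assumed form): blend the attended regions into the memory."""
    return [(1 - gate) * m + gate * a for m, a in zip(memory, attended)]


def reasoning_cell(word_vectors, region_vectors, memory, steps=2):
    """Run the three steps repeatedly; return final memory and last attention."""
    for _ in range(steps):
        command = generate_command(word_vectors, memory)
        attended, weights = attend_regions(command, region_vectors)
        memory = update_memory(memory, attended)
    return memory, weights


def message_passing(node_vectors, command, rounds=1):
    """Update each object node from all other nodes, weighted by the command."""
    for _ in range(rounds):
        updated = []
        for i, node in enumerate(node_vectors):
            others = [n for j, n in enumerate(node_vectors) if j != i]
            # Edge score: node-neighbour affinity modulated by the command.
            scores = [sum(c * a * b for c, a, b in zip(command, node, n))
                      for n in others]
            message = weighted_sum(softmax(scores), others)
            updated.append([0.5 * a + 0.5 * m for a, m in zip(node, message)])
        node_vectors = updated
    return node_vectors


# Toy data: 5 question-word vectors and 6 image-region vectors of dimension 4.
random.seed(0)
dim = 4
words = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
regions = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]

memory, weights = reasoning_cell(words, regions, [0.0] * dim)
nodes = message_passing(regions, generate_command(words, memory))
```

Here `memory` is the cell state after two reasoning steps, `weights` is the final attention distribution over the six regions, and `nodes` holds the object features after one round of message passing. A trained model would replace the raw dot products with learned projections and the fixed gate with a learned one.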

Keywords/Search Tags:

Visual Question Answering, Attention Mechanism, Graph Neural Networks

Related items

1	Visual Question Answering Of Sport Scenes Based On Graph Neural Networks
2	Research On Visual Question Answering Based On Visual Attention
3	Visual Question Answering Based On Deep Reasoning
4	Attention Mechanism And High-level Semantics For Visual Question Answering
5	Research On Collaborative Attention Model And Deep Correlated Networks For Visual Question Answer
6	Research On Visual Question Answering Based On Deep Neural Network And Attention Mechanism
7	Research On Visual Question Answering Based On Deep Neural Network
8	Research On Visual Question Answering Based On Text Semantic Understanding
9	Research On Visual Question Answering Method Based On Attention Mechanism
10	Visual Question Answering Based On Interpretation And Attention Mechanism