Visual Question Answering (VQA) is a challenging frontier research direction: the task takes an image and a natural-language question about that image as input, infers the semantics of the question and the content of the image, and generates a natural-language answer as output. Existing VQA models are not accurate enough at judging the relationships among the words in the question, among the objects in the image, and between the question and the image, and their reasoning ability also needs improvement. To address these problems, a visual question answering reasoning model based on graph neural networks is proposed. First, to address weak word relevance, a question encoder is constructed: consecutive words are fed into a BERT network based on the self-attention mechanism to form question features, so that word relevance within the question can be attended to. Second, to address the weak compactness of image content, a network based on Faster R-CNN is adopted to overcome region discontinuity: object features are extracted from the image, and the object regions with high relevance and high scores are selected according to the posed question. Finally, to improve the connection between image and question and the model's reasoning ability, the filtered features are taken as the initial nodes of a graph; combining the question and background information, the nodes are updated through a graph attention network, so that the updated nodes learn the relationships between objects more completely and accurately. The final node representations pass through a multi-layer perceptron and an activation function to predict the answer. On the CLEVR and GQA datasets, the model outperforms the compared models on most evaluation metrics: its overall accuracy is 0.10% and 2.59% higher, respectively, than that of the second-best model in the experimental comparison, and 46.70 and 27.72 percentage points higher than that of the baseline model, which demonstrates the effectiveness of the proposed method.
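The question-guided node update described above can be sketched minimally as follows. This is an illustrative NumPy sketch, not the paper's implementation: the dimensions, the random features, the bilinear score matrix `W`, and the single-head, fully connected attention are all simplifying assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax over a score vector
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: 4 selected object regions, feature dimension 8
N, D = 4, 8
nodes = rng.standard_normal((N, D))    # initial graph nodes (filtered region features)
question = rng.standard_normal(D)      # pooled question feature (e.g. from BERT)
W = rng.standard_normal((D, D)) * 0.1  # assumed bilinear compatibility weights

# One graph-attention update: every node attends to all nodes, with
# attention scores conditioned on the question feature.
updated = np.empty_like(nodes)
for i in range(N):
    # compatibility of node i with each node j, modulated by the question
    scores = np.array([(nodes[i] * question) @ W @ nodes[j] for j in range(N)])
    alpha = softmax(scores)            # attention weights over neighbours
    updated[i] = alpha @ nodes         # question-guided feature aggregation
```

In a full model this update would be stacked over several layers and followed by the multi-layer perceptron and activation function that produce the answer distribution.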