Visual Question Answering Of Sport Scenes Based On Graph Neural Networks

Posted on:2021-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:J L Wei

GTID:2428330611982785

Subject:Control theory and control engineering

Abstract/Summary:

The rapid development of computer vision and natural language processing has greatly advanced research on downstream cross-modal tasks such as visual question answering (VQA). The goal of VQA is to predict an answer given an image and a corresponding natural language question. Compared with static images, dynamic images, represented here by sports scenes, carry deep semantic information such as action, state, and trend, and are therefore of great research value. Current research mainly exploits the visual information in the image but overlooks how much the relationships between words in a question matter for predicting the answer correctly. This thesis therefore proposes capturing the relationships between objects in an image and the relationships between words in a question simultaneously. First, dynamic images represented by sports scenes are constructed to explore this deep semantic information. A two-channel self-attention VQA model is then built using the attention mechanism; this baseline model is used to verify the importance of the relationships between words in a question for predicting the answer correctly. Next, graph neural networks are used to capture the relationships between objects in an image and between words in a question. Three VQA models are designed: a dual-channel graph attention network, a dual-channel graph convolutional network, and a dual-channel attention-weighted graph convolutional network. Extensive experiments, including comparison experiments, ablation experiments, and visualization analysis, were conducted. The results show that capturing the relationships between objects and between words simultaneously helps improve the performance of a VQA model, which verifies the effectiveness of the proposed method.
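The dual-channel idea in the abstract (one graph over image objects, one graph over question words, each refined by attention over neighbors) can be illustrated with a minimal pure-Python sketch. This is not the thesis's model: the scaled dot-product scoring, the mean pooling, the fusion by concatenation, and all function names and toy values (`attention_aggregate`, `dual_channel_encode`, the feature vectors and edges) are assumptions made only for illustration, and a real graph attention network would add learned weight matrices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, edges):
    """One round of attention-weighted neighbor aggregation.

    features: list of node feature vectors (equal-length lists of floats).
    edges: set of undirected (i, j) index pairs; self-loops are added here.
    Scores are scaled dot products, a simplification with no learned weights.
    """
    n, d = len(features), len(features[0])
    neighbors = [{i} for i in range(n)]  # every node attends to itself
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    updated = []
    for i in range(n):
        nbrs = sorted(neighbors[i])
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(d)
            for j in nbrs
        ]
        weights = softmax(scores)
        # New feature of node i: convex combination of its neighbors' features.
        updated.append(
            [sum(w * features[j][k] for w, j in zip(weights, nbrs)) for k in range(d)]
        )
    return updated


def mean_pool(features):
    """Average node features into a single graph-level vector."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def dual_channel_encode(object_feats, object_edges, word_feats, word_edges):
    """Encode the object graph and the word graph separately, then concatenate."""
    visual = mean_pool(attention_aggregate(object_feats, object_edges))
    textual = mean_pool(attention_aggregate(word_feats, word_edges))
    return visual + textual  # list concatenation of the two channel vectors


# Toy inputs (hypothetical): three detected objects, two question words.
object_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
object_edges = {(0, 2), (1, 2)}
word_feats = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
word_edges = {(0, 1)}

fused = dual_channel_encode(object_feats, object_edges, word_feats, word_edges)
print(len(fused))
```

In a full VQA model the fused vector would feed an answer classifier; here it only shows how the two relationship channels stay separate until fusion.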

Keywords/Search Tags:

visual question answering, attention mechanism, graph attention network, graph convolutional network

Related items

1	Video Question Answering Based On Attention Mechanism And Graph Convolutional Network
2	Question-Guided Attention Reasoning Mechanism For Visual Question Answering
3	Research On Visual Question Answering Based On Visual Attention
4	Attention Mechanism And High-level Semantics For Visual Question Answering
5	Research On Visual Question Answering Based On Deep Learning
6	Research On Question Answering Over Knowledge Bases With External Texts Based On Graph Attention Network
7	Research On Visual Question Answering Based On Deep Neural Network And Attention Mechanism
8	Deep Convolutional Network And Regional Attention Network For Visual Question Answering
9	Research On Situational Reasoning Question Answer Method Based On Deep Learning
10	Research And Implementation Of VQA Based On Priori Attention Mechanism