Research On Visual Question Answering Based On Counterfactual Samples Synthesizing

Posted on:2022-08-28

Degree:Master

Type:Thesis

Country:China

Candidate:C Wang

GTID:2518306530955649

Subject:Master of Engineering

Abstract/Summary:

Although Visual Question Answering (VQA) has made remarkable progress in the past few years, today's VQA models tend to capture superficial linguistic correlations in the training set and fail to generalize to test sets with different question-answer distributions. To reduce this language bias, some recent work introduces an auxiliary question-only model to regularize the training of the target VQA model. Counterfactual Samples Synthesizing (CSS), a model-agnostic training scheme, generates a large number of counterfactual training samples by masking critical objects in images or critical words in questions and assigning different ground-truth answers. After training with the complementary samples (i.e., the original samples and the generated samples), the VQA model is forced to focus on all critical objects and words, which significantly improves its visual explainability and question sensitivity. This thesis applies CSS to four common attention models: stacked attention, parallel co-attention, deep modular co-attention, and MUTAN attention. The accuracy of each model trained with counterfactual samples is compared with that of the same model trained without them, in order to assess the effect of CSS on the final accuracy of different attention models. In addition to CSS, a regularization term is added to further improve the final accuracy. The results show that CSS improves the performance of all four attention models on the VQA-CP v2 dataset to varying degrees, especially in the Yes/No and Number categories.
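
The question-side masking step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name `synthesize_q_counterfactual`, the `[MASK]` token, the per-word importance scores, and the simplified answer assignment (the original answers are given a score of zero) are all assumptions made for illustration. A full pipeline would derive the importance scores from the VQA model itself and would also mask image objects.

```python
from typing import Dict, List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token for a masked word


def synthesize_q_counterfactual(
    tokens: List[str],
    importance: List[float],
    answers: Dict[str, float],
    top_k: int = 1,
) -> Tuple[List[str], Dict[str, float]]:
    """Build one counterfactual question sample (illustrative only).

    tokens:     question words
    importance: one score per word (assumed to be given, e.g. from gradients)
    answers:    original soft ground-truth scores, keyed by answer string
    top_k:      number of critical words to mask
    """
    if len(tokens) != len(importance):
        raise ValueError("tokens and importance must have the same length")

    # Rank word positions from most to least important.
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
    critical = set(ranked[:top_k])

    # Replace the critical words with the mask token.
    masked = [MASK if i in critical else tok for i, tok in enumerate(tokens)]

    # Simplified answer assignment (an assumption): with the critical words
    # hidden, the original answers are no longer treated as correct.
    cf_answers = {ans: 0.0 for ans in answers}
    return masked, cf_answers


# Example: the most important word ("banana") is masked.
masked_q, cf_target = synthesize_q_counterfactual(
    ["what", "color", "is", "the", "banana"],
    [0.05, 0.30, 0.05, 0.05, 0.55],
    {"yellow": 1.0},
)
print(masked_q)   # ['what', 'color', 'is', 'the', '[MASK]']
print(cf_target)  # {'yellow': 0.0}
```

Training on both the original sample and the masked sample penalizes a model that still answers "yellow" when the critical word is hidden, which is the intuition behind the question sensitivity mentioned above.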

Keywords/Search Tags:

Computer Vision, Visual Question Answering, Convolutional Neural Network, Attention Mechanism, Counterfactual Samples Synthesizing

Related items

1	Research On Visual Question Answering Based On Deep Neural Network And Attention Mechanism
2	Research On Visual Question Answering Method Based On Attention Mechanism
3	Deep Convolutional Network And Regional Attention Network For Visual Question Answering
4	Research On Visual Question Answering Based On Visual Attention
5	Research On Visual Question Answering Method With Visual Content Understanding And Text Information Analysis
6	Research On Visual Question Answering Models Based On Top-down Attention
7	Visual Question Answering Of Sport Scenes Based On Graph Neural Networks
8	Research On Visual Question Answering System Based On Relational Reasoning Network
9	Question-Guided Attention Reasoning Mechanism For Visual Question Answering
10	Research On Visual Question Answering Based On Deep Neural Network