Research On Visual Question Answering Based On Counterfactual Samples Synthesizing

Posted on: 2022-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
GTID: 2518306530955649
Subject: Master of Engineering
Abstract/Summary:
Although Visual Question Answering (VQA) has made remarkable progress in the past few years, today's VQA models tend to capture superficial linguistic correlations in the training set and fail to generalize to test sets with different question-answer distributions. To reduce this language bias, some recent work introduces an auxiliary question-only model to regularize the training of the target VQA model. Counterfactual Samples Synthesizing (CSS) is a model-agnostic training scheme that generates a large number of counterfactual training samples by masking key objects in images or key words in questions and assigning different ground-truth answers. After training on the complementary samples (i.e., both the original and the generated samples), the VQA model is forced to focus on all key objects and words, which significantly improves its visual interpretability and question sensitivity. This thesis adds the counterfactual sample model to four commonly used attention models: the stacked attention mechanism, the parallel co-attention mechanism, the deep modular co-attention mechanism, and MUTAN attention. The accuracy of each attention model with the counterfactual sample model is compared against its accuracy without it, in order to judge the influence of the counterfactual sample model on the final accuracy of the different attention models. In addition to the counterfactual sample model, a regularization term is added to further improve the final accuracy. It is concluded that the counterfactual model improves the performance of all four attention models on the VQA-CP v2 dataset to varying degrees, especially in the Yes/No and Number categories.
Keywords/Search Tags: Computer Vision, Visual Question Answering, Convolutional Neural Network, Attention Mechanism, Counterfactual Samples Synthesizing
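
The question-side step described in the abstract (mask key words, then assign different answers) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function name, the `[MASK]` token, the per-word `importance` scores, and the `excluded_answers` set are all assumptions made for illustration, and the abstract does not say how key words are scored or how the new answers are chosen.

```python
from typing import Dict, List, Sequence, Set, Tuple

MASK = "[MASK]"  # assumed placeholder token; not specified in the abstract


def synthesize_question_counterfactual(
    tokens: Sequence[str],
    importance: Sequence[float],
    gt_answers: Dict[str, float],
    excluded_answers: Set[str],
    top_k: int = 1,
) -> Tuple[List[str], Dict[str, float]]:
    """Build one counterfactual question sample (hypothetical helper).

    tokens:            words of the original question
    importance:        assumed per-word contribution scores (higher = more key)
    gt_answers:        original ground-truth answers with soft scores
    excluded_answers:  answers assumed to be no longer supported once the
                       key words are masked
    """
    if len(tokens) != len(importance):
        raise ValueError("tokens and importance must have the same length")

    # Pick the indices of the top_k highest-scoring (key) words.
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)
    critical = set(ranked[:top_k])

    # Mask the key words to form the counterfactual question.
    masked = [MASK if i in critical else tok for i, tok in enumerate(tokens)]

    # Assign different ground-truth answers: drop the excluded ones.
    cf_answers = {a: s for a, s in gt_answers.items() if a not in excluded_answers}
    return masked, cf_answers


# Toy example with made-up scores.
question = ["what", "color", "is", "the", "banana"]
scores = [0.10, 0.70, 0.05, 0.05, 0.90]
masked_q, cf_ans = synthesize_question_counterfactual(
    question, scores, {"yellow": 1.0, "green": 0.3}, {"yellow"}, top_k=2
)
print(masked_q)  # ['what', '[MASK]', 'is', 'the', '[MASK]']
print(cf_ans)    # {'green': 0.3}
```

In training, the original sample and the synthesized sample would both be fed to the VQA model, so that it cannot answer correctly from the remaining question words alone.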