Visual Question Answering Model Based On Answer Type Prediction

Posted on:2021-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y P Jia

Full Text:PDF

GTID:2428330611998202

Subject:Software engineering

Abstract/Summary:


In recent years, question answering systems, one of the hallmark applications of artificial intelligence, have attracted wide attention from both industry and academia. Applications such as personal assistants and intelligent customer service not only improve user stickiness but also help enterprises reduce labor costs, laying a solid foundation for research on question answering systems. With the continuous development of computer technology, people are no longer satisfied with voice and text as the only carriers of communication, and multimodal question answering has become a new research hotspot. As a typical multimodal question answering task, visual question answering has received wide attention from researchers at home and abroad. The main goal of the task is to correctly answer questions about a given image.

Although current visual question answering models achieve good performance, some questions still receive an answer of the wrong kind: for example, the question asks about a color but the answer is a number. Such mismatches seriously reduce the reliability of visual question answering models. This thesis takes answer type prediction as its starting point. The answer type is first predicted from the question; the resulting category information is then integrated into the visual question answering model, so as to reduce mismatches between question type and answer type and to improve the reliability and accuracy of the model. The main research work is as follows.

(1) Construction of an answer type prediction model. The question-answer pairs in visual question answering datasets fall into clearly different types, but no corresponding labels are provided, so the dataset is labeled first. A long short-term memory (LSTM) network and other deep learning techniques are then used to build the answer type prediction model, which extracts text features from the question and classifies them to obtain the answer type.

(2) A visual question answering system based on deep learning. Convolutional neural networks, recurrent neural networks, attention mechanisms, and other deep learning techniques are used to build a visual question answering model. The model adopts an overall seq2seq architecture: computer vision techniques such as ResNet and object detection extract image information, LSTM and other networks extract question text features, and several multimodal fusion techniques fuse the image and question features to produce the answer. The model achieves good results in the final comparative experiments.

(3) A visual question answering model based on answer type prediction. Building on the two parts above, the answer type prediction model is integrated with the visual question answering model. The answer type information is injected into the answer generation process by modifying the attention mechanism and adding a third-modality fusion step, so as to guide answer generation and reduce mismatches between answer type and question type. The accuracy and overall performance of the final model are improved, which is in line with the expected results of this study.
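The idea of letting a predicted answer type steer the final answer can be illustrated with a small sketch. The code below is not the thesis's method (which modifies the attention mechanism and adds a fusion step inside the network); it is a simpler stand-in that reweights a VQA model's answer scores by an answer-type classifier's output. The toy vocabulary, the function name `fuse_type_prior`, and all scores are hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy vocabulary: each candidate answer is tagged with an answer type.
ANSWER_TYPES = {"red": "color", "blue": "color", "2": "number", "3": "number", "yes": "yes/no"}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_type_prior(answer_logits, type_probs, floor=1e-6):
    """Reweight answer probabilities by a predicted answer-type distribution.

    answer_logits: dict answer -> raw score from a VQA model (assumed given).
    type_probs:    dict type -> probability from a type classifier (assumed given).
    Returns a dict answer -> renormalized probability.
    """
    answers = list(answer_logits)
    probs = softmax([answer_logits[a] for a in answers])
    # Multiply each answer's probability by the probability of its type;
    # the floor keeps every answer reachable if the type classifier is wrong.
    weighted = [
        p * max(type_probs.get(ANSWER_TYPES[a], 0.0), floor)
        for a, p in zip(answers, probs)
    ]
    total = sum(weighted)
    return {a: w / total for a, w in zip(answers, weighted)}


# Question: "What color is the bus?" The raw scores slightly prefer the number "2".
logits = {"red": 1.9, "blue": 0.5, "2": 2.0, "3": 0.1, "yes": 0.0}
type_probs = {"color": 0.9, "number": 0.05, "yes/no": 0.05}

fused = fuse_type_prior(logits, type_probs)
best_before = max(logits, key=logits.get)
best_after = max(fused, key=fused.get)
print(best_before, best_after)
```

In this toy case the unadjusted scores pick a number for a color question, while the type-weighted scores pick a color. A late reweighting like this is only the simplest way to use type information; injecting it into attention, as the abstract describes, lets it also influence which image regions the model attends to.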

Keywords/Search Tags:

Visual question answering, Multimodal, Answer type, Object detection, Modal fusion


Related items

1	Research On Visual Question Answering Method Based On Answer Mask
2	Research On Visual Question Answering Based On Modal Interaction
3	Multi-modal Information Fusion In Visual Question Answering
4	Research On Visual Question Answering Based On Knowledge Graph And Answer Space Optimization
5	Research And Application Of Multi-domain Visual Question Answering System Based On Image Comprehension
6	Research On Visual Question Answer Algorithm Based On Attention Mechanism
7	Research On Question-type Sensitive Answer Summarization In Community Question Answering
8	Research On Multimodal Fusion For Visual Question Answering
9	Research On Intelligent Question Answering Technology Of Ship Navigation Based On External Knowledge
10	Research On Visual Question Answering Based On Deep Neural Network