
Research On Generative Question Answering System Based On Multimodal Information Fusion

Posted on: 2021-02-13  Degree: Master  Type: Thesis
Country: China  Candidate: W X Liao  Full Text: PDF
GTID: 2428330611467593  Subject: Computer technology
Abstract/Summary:
The artificial intelligence boom driven by deep learning has inspired researchers to apply deep learning to question answering. As an important mode of human-computer interaction, a question answering system enables machines to communicate with people in natural language. Information in the real world usually spans multiple modalities, such as video, audio, and text, yet most previous research on question answering targets a single modality: structured data, text, or images. A question answering system based on a single modality struggles to integrate information from diverse sources and easily deviates from realistic question answering scenarios, both in understanding questions and in generating natural language. Building a multimodal question answering model that can process and correlate features across modalities therefore aids the interpretation of, and reasoning over, multimodal information.

The main challenge of a multimodal question answering system lies in modeling the interaction between each modality's features and the question. Because of the semantic gap between the different modalities and the question, a generic Seq2seq model is ill-suited to generating natural-language replies. This thesis proposes a Multimodal Attention mechanism based Question Answering system (Mm_Att_QA). Mm_Att_QA comprises three modules: an Encoder, a scene-description module, and a Decoder.

(1) Encoder module: extracts and encodes features of the video, the audio, the historical interaction records, and the current question. Video features are extracted with a transferred I3D model; audio features with a transferred VGGish model; the historical interaction records and the current question are encoded with word2vec embeddings and a bidirectional LSTM.

(2) Scene-description module: to weaken the impact of the semantic gap between video, audio, and text on the performance of the question answering system, this module generates a textual scene description from the audio and video features via supervised learning, which helps fuse the video and audio features when generating the final reply.

(3) Decoder module: fuses the various modality features through a multimodal attention mechanism, conditioned on the current question, to generate a reply. To locate question-relevant features, each modality's features are first associated with the current question; then, when generating each word of the reply, the multimodal attention mechanism selects the most strongly associated features within each modality. To balance the scene-description task against the reply-generation task, this thesis proposes a composite loss function.

Experimental results show that the proposed generative question answering system based on multimodal information fusion outperforms the benchmark models on multiple evaluation metrics; the factors influencing the model are also discussed and analyzed in detail.
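The decoder's fusion step can be illustrated as attention applied first within each modality (over time steps) and then across modalities. The sketch below is a minimal NumPy illustration under assumptions, not the thesis's exact formulation: the dot-product scoring, the two-level weighting scheme, and all function names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, feats):
    """Dot-product attention within one modality: weight each time step of
    `feats` (T, d) by its similarity to `query` (d,), return the weighted
    sum as a context vector (d,)."""
    scores = feats @ query           # (T,) similarity of each step to query
    weights = softmax(scores)        # attention distribution over steps
    return weights @ feats           # (d,) context vector

def multimodal_attention(query, modality_feats):
    """Two-level fusion (an assumed scheme): attention within each modality,
    then a second attention over the per-modality context vectors."""
    contexts = np.stack([attend(query, f) for f in modality_feats])  # (M, d)
    modality_weights = softmax(contexts @ query)                     # (M,)
    return modality_weights @ contexts                               # (d,)

# Toy features standing in for the encoder outputs described above.
rng = np.random.default_rng(0)
d = 8
query = rng.standard_normal(d)           # decoder state for the current word
video = rng.standard_normal((12, d))     # e.g. I3D clip features
audio = rng.standard_normal((20, d))     # e.g. VGGish frame features
history = rng.standard_normal((5, d))    # encoded interaction history
fused = multimodal_attention(query, [video, audio, history])
print(fused.shape)  # (8,)
```

In a full model the fused context would be concatenated with the previous word embedding and fed to the decoder LSTM at each generation step.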
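The abstract does not give the exact form of the composite loss; a common formulation for such multi-task training is a weighted sum of the two tasks' cross-entropies. The weight `lam` and the function names below are assumptions for illustration, not the author's notation.

```python
import numpy as np

def cross_entropy(probs, targets):
    """Mean negative log-likelihood of each step's target token.
    `probs` is (N, V) predicted distributions, `targets` is (N,) indices."""
    picked = probs[np.arange(len(targets)), targets]
    return float(-np.mean(np.log(picked)))

def composite_loss(reply_probs, reply_targets,
                   scene_probs, scene_targets, lam=0.5):
    """Reply-generation loss plus a weighted auxiliary scene-description
    loss; `lam` balances the two tasks (its value is an assumption here)."""
    return (cross_entropy(reply_probs, reply_targets)
            + lam * cross_entropy(scene_probs, scene_targets))

# Toy check with uniform predictions over a 4-word vocabulary.
uniform = np.full((3, 4), 0.25)
targets = np.array([0, 1, 2])
loss = composite_loss(uniform, targets, uniform, targets, lam=0.5)
print(round(loss, 4))  # 1.5 * ln(4) ≈ 2.0794
```

With uniform predictions each term equals ln(4), so the composite value is (1 + lam) · ln(4), which makes the balancing role of `lam` easy to see.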
Keywords/Search Tags: multimodality, attention mechanism, question answering system, natural language generation