Research On Visual Question Answering Technology Based On Knowledge Base

Posted on:2021-04-16

Degree:Master

Type:Thesis

Country:China

Candidate:X B Chen

Full Text:PDF

GTID:2428330623967893

Subject:Control Science and Engineering

Abstract/Summary:

Visual Question Answering (VQA) is an artificial intelligence task that, given a picture and a related natural language question, outputs an answer to the question. Compared with other tasks, VQA is closer to General Artificial Intelligence (GAI). Therefore, research on VQA models has high value and promising application scenarios. According to whether a knowledge base is introduced, existing models are divided into joint embedding models and knowledge base-based models. Both types of models perform well on VQA tasks. However, mainstream joint embedding models suffer from data set dependence, small network capacity, and insufficient text representation ability. On the other hand, by introducing an external knowledge base, knowledge base-based models overcome the network capacity limitation of joint embedding models and can answer inference questions involving common sense or external knowledge. However, they require knowledge base query statements to be constructed manually, which greatly limits their generalization ability. This paper improves the text representation method of the joint embedding model and the generality of the knowledge base-based model. The main contributions are as follows:

1) Introduce dynamic word embeddings to improve the text representation method of the joint embedding model. The joint embedding model currently still uses static word embeddings. Considering that static word vectors cannot effectively represent polysemy and multi-word expressions, this paper introduces dynamic word embeddings into the VQA model and, combining Faster R-CNN with an attention mechanism, proposes a joint embedding model (N-KBSN) based on dynamic word embeddings. The experimental results show that dynamic word embeddings achieve better text feature representation, thereby improving accuracy.

2) Construct a knowledge base graph embedding module to extend the generality of knowledge base-based models. The knowledge base graph embedding module constructed in this paper extracts core entities from images and text, maps them to knowledge base entities, extracts the sub-graphs closely related to the core entities, and converts the sub-graphs into low-dimensional vectors to realize sub-graph embedding. To achieve good sub-graph embedding, we first extracted two experimental knowledge bases with rich semantics from DBpedia: DBV and DBA. Based on these two knowledge bases, a series of knowledge base embedding models were evaluated on link prediction. The results show that there is a clear correspondence between the entities of DBV, which allows excellent node embedding, and that the TransE model achieves a good knowledge base embedding. We therefore built the knowledge base graph embedding module on TransE.

3) Merge the knowledge base graph embedding module with the N-KBSN model to construct a VQA model (KBSN) based on knowledge base graph embedding. Experimental results on multiple data sets show that the knowledge base graph embedding module improves the accuracy of VQA. The accuracy improves significantly on complex questions that require common sense or external knowledge.
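
For readers unfamiliar with TransE, the following is a minimal sketch of the general idea behind the link prediction evaluation mentioned in 2): a triple (head, relation, tail) is considered plausible when the head vector plus the relation vector lies close to the tail vector. It is not the thesis implementation. The entity names, the relation name, and the 2-dimensional vectors are hypothetical values chosen by hand for illustration, and the function names are assumptions; they do not come from DBV or DBA.

```python
import math

# Toy 2-dimensional embeddings, hand-picked for illustration only.
# In practice these vectors are learned; the values here are hypothetical.
entity_vecs = {
    "Paris": [0.9, 0.1],
    "France": [1.0, 1.0],
    "Berlin": [0.1, 0.2],
    "Germany": [0.2, 1.1],
}
relation_vecs = {"capitalOf": [0.1, 0.9]}


def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower value means a more plausible triple."""
    h = entity_vecs[head]
    r = relation_vecs[relation]
    t = entity_vecs[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Link prediction: order all other entities by distance as candidate tails."""
    candidates = [e for e in entity_vecs if e != head]
    return sorted(candidates, key=lambda tail: transe_score(head, relation, tail))


def margin_loss(positive, negative, margin=1.0):
    """Margin ranking loss for one (true triple, corrupted triple) pair."""
    return max(0.0, margin + transe_score(*positive) - transe_score(*negative))


if __name__ == "__main__":
    print(rank_tails("Paris", "capitalOf"))
    print(margin_loss(("Paris", "capitalOf", "France"),
                      ("Paris", "capitalOf", "Germany")))
```

During training, the margin loss pushes true triples to score at least `margin` lower than corrupted ones; at evaluation time, the rank of the correct tail among all candidates is what link prediction metrics summarize.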

Keywords/Search Tags:

Visual Question Answering, Joint Embedding Model, Knowledge Base, N-KBSN, KBSN

Related items

1	Research On Natural Language Question Answering Based On Knowledge Bases
2	Research And Implement For Question Answering Based On Deep Learning And Knowledge Graph Embedding
3	Research On Knowledge Base Question Answering Based On Multi-level Entity Labeling And Semantic Enhanced Representation
4	Research On Visual Question Answering Method With Visual Content Understanding And Text Information Analysis
5	Research On Automatic Question Answering System Based On Large Scale Chinese Knowledge Base
6	Question Understanding Based On Graph Matching In Question Answering Over Knowledge Base
7	Research On Knowledge Base Question Answering Method Based On BI-LSTM-CRF Model
8	Design And Implementation Of Visual Question Answering System Based On Knowledge Graph
9	Research And Design Of Constructing Knowledge Base In The General Knowledge Question Answering System
10	Research On Question Answering Based On Open Domain Knowledge Base