
Research On Automatic Question Answering Based On Large-scale Knowledge Graph

Posted on: 2021-05-27, Degree: Master, Type: Thesis
Country: China, Candidate: Y J Wu, Full Text: PDF
GTID: 2428330623481445, Subject: Software engineering
Abstract/Summary:
With the construction of large-scale knowledge bases, knowledge base question answering (KBQA) has become a research hotspot in natural language processing. The task aims to answer users' questions with triples from the knowledge base, so that users can acquire knowledge efficiently and accurately. However, KBQA is challenging because of the scale of the knowledge base and the heterogeneity of question and answer data. Most existing work models the local matching between the question and the elements of a triple in separate stages to select answers. This ignores the correlation between the matching subtasks and is prone to error propagation. In addition, because large-scale unbiased labeled data is lacking, models perform poorly when answering facts that involve infrequent or unseen relations. This thesis therefore uses deep learning methods to adapt the model, and then uses external data and inverse tasks to improve it. The main contributions of this thesis are as follows:

· Subgraph retrieval based on a sequence labeling model. It is impractical to use the entire knowledge base as the candidate answer set, so the subgraphs related to the question must be retrieved to reduce the model's search space. This thesis uses a Bi-LSTM-based sequence labeling model to identify the entity mention in the question, and then designs a heuristic algorithm to match candidate subjects in the knowledge base. The algorithm combines two techniques: extended matching with prior knowledge, and fuzzy matching based on Jaccard distance. It aims to correct errors made at this stage, thereby increasing entity recall and generating a high-quality candidate answer set.

· Candidate reranking model based on multi-task learning. In the candidate answer reranking stage, this thesis improves the existing question-answer matching framework to alleviate the error propagation problem. First, we build a joint matching model with a shared encoding layer based on multi-task learning, so that the subject matching subtask and the relation matching subtask are learned together. Second, we design a symmetric complementary attention module in this model, which aims to capture the association information between the two subtasks and to distinguish the semantic representations of the question in the different subtasks. Experimental results show that the model effectively improves performance on the overall task and on each subtask.

· Model enhancement incorporating external text and inverse tasks. To alleviate the lack of labeled data, this thesis explores two methods of model enhancement. The first adapts the model to integrate external text data, mainly through a text-aware relation encoding module and a subgraph-aware subject encoding module, so that the elements of a triple receive more accurate encoded representations that take contextual information into account. The second uses a knowledge-based question generation model to generate question-answer pairs for data augmentation, and then fine-tunes the pre-trained question answering model on them. Finally, experiments analyze the effectiveness of the two enhancement methods.
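
The fuzzy matching step in the first contribution can be illustrated with a minimal sketch. The abstract only states that Jaccard distance is used. The character bigram representation, the distance threshold, and the function names below (`char_ngrams`, `jaccard_distance`, `match_candidates`) are assumptions made for illustration, not the thesis's actual implementation.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of a lowercased, stripped string.

    Using character bigrams is an assumption; the thesis may compare
    word-level or other token sets instead.
    """
    normalized = text.lower().strip()
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |A & B| / |A | B|."""
    if not a and not b:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


def match_candidates(mention, entity_names, max_distance=0.5, n=2):
    """Return (name, distance) pairs within max_distance, closest first.

    max_distance is a hypothetical threshold chosen for this example.
    """
    mention_grams = char_ngrams(mention, n)
    scored = []
    for name in entity_names:
        distance = jaccard_distance(mention_grams, char_ngrams(name, n))
        if distance <= max_distance:
            scored.append((name, distance))
    # Sort by distance, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    names = ["Barack Obama", "Michelle Obama", "Barack"]
    # A misspelled mention, as a sequence labeler might extract it.
    for name, distance in match_candidates("barack obma", names, max_distance=0.6):
        print(f"{name}: {distance:.3f}")
```

A set-based distance of this kind tolerates small spelling or boundary errors in the detected mention, which is consistent with the stated goal of raising entity recall before reranking.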
Keywords/Search Tags:question answering, knowledge based question answering, deep learning, semantic matching, multi-task learning