Research On Natural Language Question Answering Based On Knowledge Bases

Posted on: 2018-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: C D Zhan
Full Text: PDF
GTID: 2348330512485655
Subject: Information and Communication Engineering
Abstract/Summary:
Knowledge base (KB) based question answering (QA) is the task of answering natural language questions using structured KBs, and it is one of the important research directions in natural language processing. The main methods of KB based QA fall into three categories: information extraction based methods, semantic parsing based methods, and vector space modeling based methods. The key technologies include knowledge extraction and representation, semantic representation of the user's question, and answer generation based on the KB. The performance of KB based QA systems, which depends on factors such as the accuracy of the question's semantic representation and the number of question-answer pairs available as training data, still needs further improvement. In addition, the lack of an open-source, large-scale, open-domain Chinese KB restricts research on Chinese KB based QA. This dissertation focuses on the task of KB based QA and conducts research on several aspects, including semantic representation of the question, preparation of the training dataset, and Chinese KB construction. The main research contents are a word embedding construction method for paraphrase scoring in KB based QA, a KB based QA method that incorporates neural network question generation, and a knowledge fusion method for Chinese KB construction.

Conventional learning algorithms for word embeddings are task-independent and unsupervised, so the resulting embeddings cannot express semantic constraints at the sentence level. This dissertation therefore proposes a method of constructing word embeddings that uses the constraints provided by paraphrases. The method introduces sentence-level semantic constraints into word embedding training and improves sentence-level semantic representation by optimizing the word-level embeddings, without changing the way sentence meaning is composed from word meanings. As a result, the method improves the accuracy of both paraphrase scoring and question answering in the KB based QA system.

The performance of existing KB based QA methods based on vector space modeling depends on the amount of training data, but manually producing question-answer pairs at large scale is very difficult. To address this problem, this dissertation introduces a question generation method based on encoder-decoder neural networks into the construction of the KB based QA system. A question generation model automatically generates questions from the triples in the KB, and these questions are then used to train the KB based QA model. Experimental results show that, compared with template-based question generation, question generation based on neural networks effectively improves the accuracy of the KB based QA system.

Finally, this dissertation introduces a Chinese KB construction method based on knowledge fusion. The method builds an initial KB by extracting information from the infoboxes of Baidu Baike, and then fuses the initial KB with Freebase using entity mapping based on linking-word information and attribute mapping based on the Jaccard coefficient.
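
The attribute mapping step can be illustrated with a minimal sketch. The abstract does not say how the sets compared by the Jaccard coefficient are built, so the sketch assumes that each attribute is represented by the set of (entity, value) pairs it holds over entities that have already been mapped between the two KBs. The names `jaccard`, `map_attributes`, the `threshold` parameter, and the toy attribute names are all hypothetical and are not taken from the dissertation.

```python
from typing import Dict, Optional, Set, Tuple

# An attribute is represented by the (entity id, value) pairs it holds.
# This representation is an assumption made for illustration only.
Pair = Tuple[str, str]


def jaccard(a: Set[Pair], b: Set[Pair]) -> float:
    """Return |a ∩ b| / |a ∪ b|, defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def map_attributes(
    source: Dict[str, Set[Pair]],
    target: Dict[str, Set[Pair]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the dissertation
) -> Dict[str, str]:
    """Map each source attribute to its best-scoring target attribute.

    A source attribute is left unmapped when no target attribute reaches
    the threshold.
    """
    mapping: Dict[str, str] = {}
    for s_name, s_pairs in source.items():
        best_name: Optional[str] = None
        best_score = 0.0
        for t_name, t_pairs in target.items():
            score = jaccard(s_pairs, t_pairs)
            if score > best_score:
                best_name, best_score = t_name, score
        if best_name is not None and best_score >= threshold:
            mapping[s_name] = best_name
    return mapping


if __name__ == "__main__":
    # Toy data: entity ids e1..e3 are assumed to be already aligned.
    initial_kb = {
        "birthplace_zh": {("e1", "Beijing"), ("e2", "Shanghai")},
        "height_zh": {("e1", "180")},
    }
    freebase_like = {
        "place_of_birth": {("e1", "Beijing"), ("e2", "Shanghai"), ("e3", "Paris")},
        "weight": {("e1", "75")},
    }
    print(map_attributes(initial_kb, freebase_like))
```

In this toy run the two birthplace attributes share two of three distinct pairs, so they score 2/3 and are mapped, while the height attribute shares nothing with any target attribute and stays unmapped. In practice the threshold would need tuning, and values would likely need normalization before comparison.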
Keywords/Search Tags: knowledge base based question answering, word embedding, question generation, encoder-decoder, open domain Chinese knowledge base