
Research On Knowledge Base Question Answering Based On Deep Learning

Posted on: 2021-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: C L Zhao
Full Text: PDF
GTID: 2428330629951231
Subject: Information and Communication Engineering
Abstract/Summary:
The purpose of knowledge base question answering (KBQA) is to give an accurate answer based on a knowledge base and a natural language question posed by a user. In recent years, the rapid development of data mining and information extraction technology has promoted the emergence of large-scale, domain-rich knowledge bases, providing a data foundation for KBQA research. Because natural language questions can be expressed in many different ways, how to identify the topic entity in a question and how to accurately match the question against structured triples in the knowledge base are the focus of this study.

This paper divides the KBQA task into two stages: topic entity recognition and candidate triple ranking. In the topic entity recognition stage, the topic entity in the question is identified by entity recognition based on deep learning and transfer learning. In the candidate triple ranking stage, the semantic similarity and the character similarity between the question and the candidate triples are calculated separately and then fused for ranking. The NLPCC-ICCPOL 2016 KBQA task released a large-scale Chinese knowledge base and a related QA dataset; this paper experiments on that dataset and obtains outstanding results. The main work of this paper is as follows:

This paper proposes a Transfer Deep Entity Recognition (TDER) model, which combines transfer learning and deep learning for named entity recognition. By integrating the POS tagging results of external Chinese word segmentation tools into the input of entity recognition training, the problem of the entity recognition dataset being too small is alleviated. A multi-head attention mechanism is added between the Bi-LSTM and the conditional random field (CRF) to capture the semantic relationship between any two characters in the question, and thus the semantic information of the whole sentence, which effectively improves entity recognition: accuracy rises to 91.71%.

In the semantic matching of questions and triples, in order to capture the important information in the question and make full use of the information in the knowledge base, a KB-aware attention mechanism is added to the calculation of the question's semantic vector. The KB-aware attention mechanism consists of two parts: Self-attention and Add-attention. Self-attention transforms the semantic matrix of the question into a feature vector; Add-attention obtains the important attention points of the question through the predicate information connected to the topic entity in the knowledge base. Finally, the semantic vector of the question is obtained by combining the important attention points with the semantic matrix of the question. Experimental results show that the KB-aware attention mechanism increases F1 by 1.97%.

This paper proposes a Double Level Semantic Matching (DLSM) model, which obtains the similarity between the question and a triple from semantic similarity and character similarity separately, solving the problem that a single similarity cannot fully capture the similarity between the question and the triples. In the calculation of semantic similarity, the semantic vector representations of the question and the candidate triple are obtained through the Bi-LSTM and the KB-aware attention mechanism, and the concatenation of the two semantic vectors is fed into a fully connected network to obtain the semantic similarity. In the calculation of character similarity, the character similarity matrix of the question and the candidate triple is first constructed, features are then extracted by a CNN, and finally max pooling and a fully connected network are combined to compute the character similarity. Experiments show that when semantic similarity and character similarity are used for matching alone, F1 is 81.72% and 78.61% respectively, and F1 rises to 82.74% when the two are fused.

The paper has 31 charts, 19 tables and 86 references.
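The attention step described above can be sketched in miniature. The following pure-Python fragment is an illustrative simplification, not the thesis's implementation: it scores each question-character vector against a predicate vector (a stand-in for the KB predicate information used by Add-attention), normalizes the scores with softmax, and returns the weighted sum as the question's semantic vector. The scoring function and vector dimensions are assumptions for the sketch.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def add_attention(question_vectors, predicate_vector):
    """Additive-attention-style sketch: score each question-character
    vector against the predicate vector (dot product here for
    simplicity), softmax the scores into weights, and return the
    weighted sum of the question vectors as the semantic vector."""
    scores = [sum(q * p for q, p in zip(qv, predicate_vector))
              for qv in question_vectors]
    weights = softmax(scores)
    dim = len(question_vectors[0])
    return [sum(w * qv[d] for w, qv in zip(weights, question_vectors))
            for d in range(dim)]
```

A character whose vector aligns with the predicate vector receives a larger weight, so predicate-relevant parts of the question dominate the resulting semantic vector, which matches the stated goal of Add-attention.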
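The character-similarity branch of DLSM starts from a character similarity matrix between the question and a candidate triple. The sketch below constructs a simple binary version of that matrix; the CNN and max-pooling stage of the thesis is approximated here by a per-row maximum, purely to illustrate the data flow. The binary match criterion is an assumption, since the abstract does not specify how matrix entries are computed.

```python
def char_sim_matrix(question, triple_text):
    """Binary character-similarity matrix M:
    M[i][j] = 1.0 if question[i] == triple_text[j], else 0.0.
    (In DLSM this matrix is fed to a CNN; here we only build it.)"""
    return [[1.0 if qc == tc else 0.0 for tc in triple_text]
            for qc in question]

def row_max_pool(matrix):
    """Per-row max pooling, a stand-in for the CNN + max-pooling
    stage: a row maximum of 1.0 means that question character
    appears somewhere in the triple text."""
    return [max(row) for row in matrix]

m = char_sim_matrix("who wrote hamlet", "hamlet author shakespeare")
pooled = row_max_pool(m)
coverage = sum(pooled) / len(pooled)  # fraction of question chars covered
```

For a Chinese dataset such as NLPCC-ICCPOL 2016, the same construction works unchanged, since Python iterates strings character by character.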
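Finally, the two similarities are fused to rank candidate triples. The abstract states only that the scores are fused; the linear combination and the weight `alpha` below are illustrative assumptions, not the thesis's actual fusion formula.

```python
def fuse_scores(semantic_sim, char_sim, alpha=0.5):
    """Linearly fuse semantic and character similarity.
    alpha is an illustrative assumption; the thesis only states
    that the two similarities are fused for ranking."""
    return alpha * semantic_sim + (1.0 - alpha) * char_sim

def rank_candidates(candidates):
    """candidates: list of (triple, semantic_sim, char_sim).
    Returns candidates sorted by fused score, best first."""
    return sorted(candidates,
                  key=lambda c: fuse_scores(c[1], c[2]),
                  reverse=True)

ranked = rank_candidates([
    ("(Hamlet, author, Shakespeare)", 0.92, 0.80),
    ("(Hamlet, genre, tragedy)",      0.55, 0.40),
])
```

In practice `alpha` would be tuned on the development set; the reported jump from 81.72% / 78.61% F1 (single similarity) to 82.74% (fused) suggests the two signals are complementary.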
Keywords/Search Tags: knowledge base question answering, natural language question, named entity recognition, deep learning, Double Level Semantic Matching