Research On Knowledge Base Question Answering Based On Multi-level Entity Labeling And Semantic Enhanced Representation

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Yu
GTID: 2428330605461311
Subject: Computer application technology
Abstract/Summary:
In recent years, with the flourishing of large-scale open-domain knowledge bases (KBs) such as Freebase, DBpedia and Wikidata, knowledge base question answering (KBQA) has become one of the hot spots in deep learning and natural language processing (NLP). A KBQA system meets the user's need for efficient and intelligent search by returning an answer from the KB for a natural language question. It provides a natural and direct method of human-computer interaction, which gives it important application value and research significance. For the KBQA task, we employ deep learning-based methods to map the question and the fact triple into continuous vectors and calculate the similarity between them. The method includes two steps: candidate generation and candidate scoring. The main contributions are as follows.

First, a KBQA system retrieves candidates from the KB based on the topic entity of the question. Topic entity labeling is the first step of the system, so its quality affects all subsequent steps. Previous research usually applied neural networks based on word and character embeddings to extract the topic entity of the question. However, word and character embeddings alone cannot completely represent the semantic information of the question and cannot distinguish ambiguous terms, which degrades the labeling results. In this thesis, we propose a multi-level semantic representation model for topic entity labeling, which learns multi-level semantic information of the question through word, character and context representation embeddings. The context representation, learned through a convolutional neural network (CNN), captures the context information of each word and generates different vector representations for it, yielding a more complete semantic representation of the question and better handling of out-of-vocabulary (OOV) words. The topic entity is then annotated by a bidirectional long short-term memory and conditional random field (BiLSTM-CRF) model. The accuracy of the topic entity labeling model with multi-level semantic representation reaches 91.32% and 96.84% on the Chinese and English datasets respectively.

Second, a KBQA system must understand both the natural language question and the triple information in order to score candidates, which remains challenging. Most methods use neural networks to learn question and predicate representations for candidate scoring, and some learn semantic representations of the subject and the predicate and match each against the question separately. However, these methods consider only the information of a single candidate: they do not take into account the rich additional triple information and structural information contained in a large KB, nor do they treat the triple as a whole. In this thesis, we propose a knowledge-enhanced deep semantic representation model with an attention mechanism to learn the semantic representations of the question and the triple. The model uses knowledge graph embedding to learn the knowledge representation of the subject and integrates the semantic information of the predicate through the attention mechanism to obtain an overall representation of the triple, capturing both its knowledge and its semantic information. The knowledge-enhanced semantic representation model can thus model the semantic information of the triple and the question and handle the semantic gap between them. We also exploit a dynamic negative sampling strategy to assist model training. The proposed model achieves an accuracy of 77.2% on the SimpleQuestions dataset and an average F1 of 81.01% on the NLPCC 2016 Chinese KBQA dataset, results that are competitive with other models.
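
The two-step pipeline described in the abstract (candidate generation from the topic entity, then candidate scoring by vector similarity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy KB, the hand-picked 3-dimensional vectors, and the function names `generate_candidates`, `score_candidates` and `answer` are hypothetical. As a further simplification, each candidate is scored against its predicate vector only, whereas the thesis scores against a learned representation of the whole triple.

```python
import math

# Toy knowledge base of (subject, predicate, object) triples.
# All data here is made up for illustration.
KB = [
    ("barack_obama", "place_of_birth", "honolulu"),
    ("barack_obama", "spouse", "michelle_obama"),
    ("honolulu", "contained_by", "hawaii"),
]

# Hand-picked vectors standing in for learned representations (assumption).
PREDICATE_VECS = {
    "place_of_birth": [0.9, 0.1, 0.0],
    "spouse": [0.0, 0.2, 0.9],
    "contained_by": [0.5, 0.8, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(topic_entity, kb):
    """Step 1: keep only the triples whose subject is the topic entity."""
    return [triple for triple in kb if triple[0] == topic_entity]


def score_candidates(question_vec, candidates):
    """Step 2: rank candidates by similarity to the question vector."""
    scored = [
        (cosine(question_vec, PREDICATE_VECS[pred]), (subj, pred, obj))
        for subj, pred, obj in candidates
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


def answer(question_vec, topic_entity, kb=KB):
    """Return the object of the best-scoring candidate, or None."""
    ranked = score_candidates(question_vec, generate_candidates(topic_entity, kb))
    return ranked[0][1][2] if ranked else None
```

In this sketch, a question vector close to the `place_of_birth` direction, paired with the topic entity `barack_obama`, selects the birthplace triple. In a real system the question vector and the topic entity would come from trained models such as the labeling and representation models summarized above.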
Keywords/Search Tags: Deep learning, Knowledge base question answering, Semantic representation, Knowledge graph embedding