
Research On Knowledge Base Question Answering Method Based On BI-LSTM-CRF Model

Posted on: 2020-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: F R Zhang
Full Text: PDF
GTID: 2428330623966991
Subject: Computer Science and Technology
Abstract/Summary:
A traditional search engine matches combinations of keywords and returns a list of related web pages, so users must filter the results several times to find the answer they need. A knowledge base question answering (KBQA) system combines the strengths of information retrieval (IR) and natural language processing (NLP): it takes a natural language question as input and returns a concise, accurate answer in natural language, which better meets the present-day need to obtain information quickly and accurately. An analysis of existing KBQA systems shows that open-domain KBQA for English supports both single relationship question answering (SRQA) and multi-relational question answering (MRQA). For Chinese, however, current research focuses mostly on SRQA, and MRQA is still at the exploratory stage. This thesis explores Chinese MRQA on the knowledge base (KB) provided by NLPCC-ICCPOL 2016. The KBQA process is divided into three sub-tasks, which are the focus of this thesis: entity recognition, entity relation extraction and answer retrieval. The main research work is as follows.

(1) Entity recognition, linking and disambiguation based on the BI-LSTM-CRF (Bidirectional Long Short-Term Memory with Conditional Random Field) model. To handle the diversity of entities in natural language questions, the BI-LSTM-CRF model was introduced to learn the relationship between entities and their surrounding words, and an entity recognition model was trained to identify the entities in a question. The recognized entities were then linked to terms in the KB by calculating similarity and querying the entity mapping table. Finally, the candidate entities were disambiguated by combining similarity with entity popularity, which eliminated the redundant candidates produced by entity linking.

(2) Entity relation extraction based on part-of-speech (POS) and position features. In MRQA, a natural language question contains more than one relational word. To address this, a relational word extraction algorithm based on POS features was proposed to obtain the candidate relational word sequence from the question. The relational word sequence was then mapped to predicates in the KB through similarity calculation. Finally, based on the entities, the relationships, their positions relative to the interrogative pronouns, and the number of each, questions of four types, namely single entity single relationship (SS), single entity multi-relationship (SM), multi-entity single relationship (MS) and multi-entity multi-relationship (MM), were converted into sets of triples carrying semantic information.

(3) Answer retrieval based on template matching. Five templates were defined to cover the relationships between triples and between the elements within a triple. Each template defines a mapping rule between a triple type and a Cypher query. The triples were converted into a series of query statements by matching them against the templates, and the candidate answer sets were obtained by running these queries against the KB.

Comparative experiments were performed on the data set provided by NLPCC-ICCPOL 2016 and on the multi-relational data set NLPCC_MH. The results show that the proposed method supports both SRQA and MRQA. In addition, it can also answer alternative questions.
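The question typing and template matching steps described in (2) and (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not list the five templates, so only one hypothetical chain template is shown, and the graph schema (an `Entity` label, a `REL` relationship type, and a `name` property on both) and the function names `question_type` and `chain_query` are assumptions made for illustration. The typing rule also uses entity and relation counts only, whereas the thesis additionally considers positions relative to the interrogative pronouns.

```python
from typing import Dict, List, Tuple


def question_type(num_entities: int, num_relations: int) -> str:
    """Return SS, SM, MS or MM from the entity and relation counts.

    Simplification: only the counts are used here.
    """
    if num_entities < 1 or num_relations < 1:
        raise ValueError("need at least one entity and one relation")
    entity_part = "S" if num_entities == 1 else "M"
    relation_part = "S" if num_relations == 1 else "M"
    return entity_part + relation_part


def chain_query(entity: str, predicates: List[str]) -> Tuple[str, Dict[str, str]]:
    """Build a parameterised Cypher query that follows the predicates in order.

    Hypothetical template for SS/SM questions: start at one linked entity and
    hop once per predicate. The schema (Entity, REL, name) is assumed.
    """
    if not predicates:
        raise ValueError("need at least one predicate")
    pattern = "(n0:Entity {name: $entity})"
    params = {"entity": entity}
    for i, predicate in enumerate(predicates):
        # Each hop binds a new node variable n1, n2, ...
        pattern += f"-[:REL {{name: $p{i}}}]->(n{i + 1})"
        params[f"p{i}"] = predicate
    query = f"MATCH {pattern} RETURN n{len(predicates)}.name AS answer"
    return query, params


if __name__ == "__main__":
    print(question_type(1, 2))
    query, params = chain_query("A", ["father", "birthplace"])
    print(query)
    print(params)
```

Passing the entity and predicate names as query parameters rather than interpolating them into the query string keeps the generated Cypher reusable and avoids quoting problems with names that contain special characters.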
Keywords/Search Tags: knowledge base question answering, natural language processing, entity recognition, entity relation extraction