Research On Chinese Knowledge Bases Question Answering Based On Pre-trained Language Model

Posted on: 2022-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: T H Zhang
Full Text: PDF
GTID: 2518306329461084
Subject: Computer software and theory
Abstract/Summary:
Knowledge base question answering (KBQA) is an important research direction in question answering systems, natural language processing, and artificial intelligence. Unlike traditional search engines, question answering systems aim to answer users' natural language questions with accurate answers. A knowledge graph describes knowledge and the semantic relationships between things using a graph structure, and is an important foundation for artificial intelligence applications. The main task of KBQA is to understand the natural language question entered by the user, reason and search over the knowledge graph, and return the answer to the user.

A range of approaches have been proposed in academia for the Chinese KBQA task, but a number of problems and challenges remain. They fall into two groups. First, because constructing a KBQA dataset is costly, Chinese datasets are generally small and of poor quality. Second, Chinese differs greatly from English in semantic representation, so mature English KBQA methods cannot be transferred directly to Chinese, and datasets cannot be shared between the two languages. In addition, most Chinese KBQA research focuses on simple single-hop questions and performs poorly on complex questions. The main reason is that traditional methods decompose a multi-hop question into multiple steps of classification and reasoning; how to avoid this multi-step reasoning is a question worth exploring. In response to these problems, this thesis improves on previous studies. The specific work is as follows:

(1) ECKBQA (ERNIE-based Chinese Knowledge Base Question Answering), a Chinese KBQA method based on joint training and a pre-trained language model, is proposed, and a web application is developed on this basis to serve users in real environments. To address the shortage of annotated data for Chinese KBQA, the method uses the knowledge-enhanced pre-trained language model ERNIE (Enhanced Representation through kNowledge IntEgration) to improve the performance of traditional semantic parsing methods on the subtasks of question entity recognition, entity disambiguation, and relation prediction. A joint training mechanism is introduced into the model; it combines relevant information from the entity disambiguation and relation prediction tasks and further improves the efficiency of data utilization. Based on this work, a Chinese KBQA web application is developed and deployed in a real environment.

(2) MHCKBQA (Multi-Hop Chinese Knowledge Base Question Answering), a multi-hop Chinese KBQA method based on knowledge representation and a pre-trained language model, is proposed. The method answers complex multi-hop questions with an answer scoring mechanism that combines knowledge graph embedding scoring and link scoring. Traditional semantic parsing methods treat entities and relations in the knowledge graph as symbols. When a complex question requires several inference steps to reach an answer, errors accumulate across the steps and reduce the accuracy of the question answering method. Deep learning can represent entities and relations as distributed vectors in a low-dimensional space, so that answer prediction becomes a problem of computing semantic similarity between vectors, which mitigates error accumulation during multi-hop inference. Based on this idea, this thesis combines the knowledge graph representation learning method ComplEx with the pre-trained language model ERNIE to obtain MHCKBQA. A link scoring mechanism is also introduced to make the method more robust.

Comparison experiments were conducted on Chinese KBQA datasets. ECKBQA achieved an average F1 of 86.41 on the NLPCC-ICCPOL 2016 KBQA dataset, and MHCKBQA achieved an average F1 of 68.72 on the NLPCC-MH dataset. Both are the best results among the compared methods, validating the effectiveness and innovation of the methods proposed in this thesis.
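
As an illustration of the embedding-scoring idea described above, the following is a minimal sketch of the standard ComplEx scoring function, Re(sum_k h_k * r_k * conj(t_k)), used to rank candidate answer entities. It is not the thesis's implementation: the function names, the toy vectors, and the entity labels are all hypothetical, and the abstract does not state how MHCKBQA derives the query vector from the question, so the `relation` argument here is only a stand-in for whatever representation the method produces.

```python
from typing import Dict, List, Sequence

Vector = Sequence[complex]


def complex_score(head: Vector, relation: Vector, tail: Vector) -> float:
    """ComplEx triple score: Re(sum_k head_k * relation_k * conj(tail_k))."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must have the same dimension")
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real


def rank_answers(head: Vector, relation: Vector,
                 candidates: Dict[str, Vector]) -> List[str]:
    """Return candidate entity names sorted from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda name: complex_score(head, relation, candidates[name]),
        reverse=True,
    )


# Toy 2-dimensional complex embeddings (hypothetical values, not learned).
topic_entity = [1 + 1j, 0.5 - 0.5j]
query_vector = [1j, 1 + 0j]  # stand-in for the question/relation representation
candidate_entities = {
    "entity_a": [-1 + 1j, 0.5 - 0.5j],
    "entity_b": [1 - 1j, -0.5 + 0.5j],
}

if __name__ == "__main__":
    for name in rank_answers(topic_entity, query_vector, candidate_entities):
        score = complex_score(topic_entity, query_vector, candidate_entities[name])
        print(name, score)
```

Because every candidate is scored in a single pass against the same query vector, no intermediate hop has to be classified explicitly, which is the property the abstract credits with reducing error accumulation.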
Keywords/Search Tags:Knowledge graph, question answering system, pre-trained language model, knowledge graph embedding, link prediction