Research On Automatic Question Answering System Based On Large Scale Chinese Knowledge Base

Posted on: 2022-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: C L Xi
Full Text: PDF
GTID: 2518306326453024
Subject: Computer technology
Abstract/Summary:
Knowledge base question answering (KBQA) integrates core natural language processing technologies and aims to answer users' natural language questions using the triples in a knowledge base. It has gradually become a new trend in human-computer interaction. For large-scale knowledge bases, automatic question answering provides an efficient and accurate way to obtain information. In recent years, several large-scale knowledge bases have emerged in China and abroad, such as Freebase and DBpedia, as well as the open-domain Chinese knowledge base released by NLPCC in 2016 for its KBQA evaluation task. These resources provide a large amount of data, which has once again made question answering over large-scale knowledge bases a research hotspot in the NLP community.

KBQA is centered on the triples of the knowledge base. The natural language processing techniques involved include named entity recognition, relation extraction, entity linking, entity disambiguation, and answer retrieval; answers are obtained by analyzing the question and searching the knowledge base. However, because knowledge bases differ in domain and are large in scale, KBQA continues to face challenges. Current work focuses mainly on question parsing and triple matching, which form the basis of answer retrieval. Traditional methods tend to ignore the association between matching subtasks, such as entity extraction and relation extraction in question parsing, so the subtasks are isolated from one another and redundant information propagates between them. In addition, because large-scale data is poorly annotated, low-frequency words and unseen relations also hurt the generalization performance of the model. This paper uses a deep neural network to build a joint model that addresses these problems.

This paper divides the KBQA task into three steps: question parsing, entity linking and disambiguation, and answer retrieval. On this basis, the main work of the paper is as follows:

1. A joint SA model based on the self-attention mechanism is proposed. Question parsing is the first step of KBQA, and this paper treats it as entity and relation extraction. Earlier pipeline methods extract entities first and relations second; they often ignore the connection between entities and relations, so errors or redundant information from entity extraction degrade the results of relation extraction. Joint extraction instead attends to the relationship between the subtasks and, on the basis of multi-task learning, associates entities with relations by sharing the underlying layers, among other methods. The results show that the method is simple and effective: it obtains an F1 score of 80.2 on the NLPCC question answering dataset.

2. An end-to-end Chinese KBQA model, joint KBQA, is implemented. Entity linking, negative sampling, and other techniques are used to process the entity and relation data, producing an entity mapping dictionary and a negative-sample training set for relations, on which the joint model is trained. In entity recognition, the main task is to identify entity boundaries; relation extraction is treated as a binary classification task. The experimental results show that the model is effective, with an average F1 score of 83.7, which is higher than the average F1 of the question answering systems officially reported by NLPCC.

3. Joint entity and relation extraction is applied to the large-scale Chinese dataset NLPCC-KBQA for the first time. The average F1 of the QA system is 83.7, which is higher than that of other published QA systems. In addition, whereas most models are built on widely used English datasets, this paper provides the design ideas and implementation method for a Chinese question answering model, which has practical significance.
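The three steps described in the abstract (question parsing, entity linking, answer retrieval) and the negative-sample construction for relation classification can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the toy triples, the substring-based linker, and every function name below are assumptions made for illustration, and the neural joint model itself is omitted.

```python
import random
from collections import defaultdict

# Toy (subject, predicate, object) triples; a stand-in for a real knowledge base.
TRIPLES = [
    ("Li Bai", "dynasty", "Tang"),
    ("Li Bai", "occupation", "poet"),
    ("Du Fu", "dynasty", "Tang"),
    ("Du Fu", "occupation", "poet"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subj, pred, obj in triples:
        index[subj][pred].append(obj)
    return index


def build_mention_dict(triples):
    """Hypothetical entity mapping dictionary: lowercased mention -> entities."""
    mentions = defaultdict(set)
    for subj, _, _ in triples:
        mentions[subj.lower()].add(subj)
    return mentions


def link_entities(mentions, question):
    """Naive linker (assumption): an entity is a candidate if its mention
    appears verbatim in the question."""
    q = question.lower()
    return sorted(e for m, ents in mentions.items() if m in q for e in ents)


def negative_samples(index, entity, gold_relation, k, rng):
    """Binary-classification pairs: the gold relation gets label 1, and up to
    k other relations of the same entity get label 0."""
    others = sorted(p for p in index[entity] if p != gold_relation)
    negatives = rng.sample(others, min(k, len(others)))
    return [(gold_relation, 1)] + [(p, 0) for p in negatives]


def answer(index, entity, relation):
    """Answer retrieval: look up the objects of (entity, relation)."""
    return index.get(entity, {}).get(relation, [])


index = build_index(TRIPLES)
mentions = build_mention_dict(TRIPLES)
candidates = link_entities(mentions, "Which dynasty did Li Bai live in?")
print(candidates, answer(index, candidates[0], "dynasty"))
```

In a full system the relation passed to `answer` would come from the trained classifier scoring each candidate relation of the linked entity; here it is supplied by hand to keep the sketch self-contained.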
Keywords/Search Tags: joint extraction, knowledge base, question answering, entity recognition, attention mechanism