
Combining Knowledge Graph And Multiple Encoders For Neural Chinese Question Generation

Posted on: 2020-06-22    Degree: Master    Type: Thesis
Country: China    Candidate: M Q Chen    Full Text: PDF
GTID: 2428330623461015    Subject: Computer software and theory
Abstract/Summary:
In recent years, with the widespread application of artificial intelligence in various fields, how to apply artificial intelligence to education has become a focal point. Intelligent Inquiry is one embodiment of artificial intelligence in the field of education and an important part of building an intelligent teaching environment. It is an interdisciplinary research direction involving natural language processing, education, computer science, cognitive science, and other fields, and its main purpose is to automatically generate natural language questions from text; for this reason it is also called question generation. Traditional question generation methods are rule-based or template-based, relying on hand-designed transformation and generation rules for the text. However, rules and templates are easily over-designed, which leads to poor question quality, so such methods are difficult to apply to real needs. Recently, deep learning has been widely used in natural language processing; in particular, it has made great progress in machine translation, intelligent question answering, and other tasks, showing its potential application value. Researchers have therefore begun to explore question generation methods based on neural networks. At present, the proposed neural methods mainly target English question generation, and there are relatively few studies on Chinese question generation. This thesis therefore addresses the shortcomings of existing neural question generation methods and studies neural Chinese question generation. We obtain the following research results:

1. Combining a knowledge graph for neural question generation. Existing neural question generation methods mainly generate questions directly from the source text. Their shortcomings are: when the source text carries a large amount of information, the training difficulty of the model increases, resulting in poor question quality; and because domain knowledge is lacking, the generated questions are of lower quality than those posed by humans. To address these problems, taking the WebQA dataset as an example, this thesis proposes a method that simplifies the source text, retrieves prior knowledge related to the answer from a knowledge graph, and realizes Chinese question generation with a neural network model. In the WebQA dataset, each sample consists of a source text, a question, and an answer; the question is generated from the source text, and the answer is a word or phrase in the source text. First, referring to the relevant literature, we develop rules for simplifying the source text and apply them to reduce the training difficulty of the neural model. Second, we use the answer as a keyword to retrieve prior knowledge from the knowledge graph; this prior knowledge compensates for the answer-related context that is filtered out when the source text is simplified, and it also gives the model the possibility of generating more complex and deeper questions. Next, we extend the WebQA dataset with the simplified source text and the knowledge graph triples, constructing a WebQA dataset combined with the knowledge graph; this dataset helps alleviate the serious shortage of Chinese question generation datasets. Experimental results show that the proposed method effectively reduces the training difficulty of the model and improves the quality of the generated questions.
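To illustrate the data construction step, the following is a minimal sketch of how a WebQA sample might be extended with knowledge graph triples retrieved by using the answer as a keyword. The function names, the toy knowledge graph, the triple format, and the window-based simplification rule are hypothetical placeholders for illustration only, not the thesis's actual rules or implementation.

```python
# Hypothetical sketch: extend a WebQA-style sample with knowledge-graph
# triples keyed on the answer. All names and data are illustrative only.
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Toy in-memory "knowledge graph": entity -> list of triples.
TOY_KG: Dict[str, List[Triple]] = {
    "北京": [("北京", "首都", "中国"), ("北京", "位于", "华北平原")],
}

def simplify_source(text: str, answer: str, window: int = 30) -> str:
    """Rough stand-in for rule-based simplification: keep only a
    character window around the answer occurrence."""
    pos = text.find(answer)
    if pos < 0:
        return text
    start = max(0, pos - window)
    return text[start:pos + len(answer) + window]

def retrieve_prior_knowledge(answer: str) -> List[Triple]:
    """Use the answer as a keyword to fetch related triples."""
    return TOY_KG.get(answer, [])

def extend_sample(sample: Dict[str, str]) -> Dict[str, object]:
    """Build the extended training sample: simplified text + triples."""
    answer = sample["answer"]
    return {
        "question": sample["question"],
        "answer": answer,
        "simplified_text": simplify_source(sample["source_text"], answer),
        "kg_triples": retrieve_prior_knowledge(answer),
    }

if __name__ == "__main__":
    sample = {
        "source_text": "中国的首都是北京，北京位于华北平原。",
        "question": "中国的首都是哪里？",
        "answer": "北京",
    }
    print(extend_sample(sample))
```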
2. A neural question generation method based on multiple encoders. Existing neural question generation methods cannot ask a question according to a specified keyword or phrase (that is, with the specified keyword or phrase as the answer to the question), and the accuracy of the interrogative word in the generated question is not high. To address these problems, this thesis proposes a neural question generation model based on multiple encoders. The model uses multiple encoders to encode several different types of text and encodes the answer (the specified keyword or phrase) with a separate encoder, with the goal of accurately representing each type of data. When the decoder generates a question, attention is distributed over the text in each encoder, for two purposes: first, to increase attention on the answer and thereby encourage the decoder to ask a question about the answer; second, to use the attention mechanism to effectively fuse the different types of text. We evaluate this model on the WebQA dataset combined with the knowledge graph and on the structured dataset KBQG. Experimental results show that on both datasets the multi-encoder model can ask questions based on the specified keyword or phrase, and that it improves the accuracy of the interrogative words in the generated questions. Finally, this thesis further studies the influence of word-embedding-based text representation and BERT-based text representation on the questions generated by the multi-encoder model. Experimental results show that BERT-based text representation further improves the quality of the questions generated by the proposed model.
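To make the multi-encoder idea concrete, the following is a minimal PyTorch sketch with separate GRU encoders for the simplified text, the linearized knowledge graph triples, and the answer, where the decoder attends over each encoder's outputs and fuses the resulting context vectors by concatenation before predicting the next token. All module names, dimensions, and the dot-product attention are illustrative assumptions, not the thesis's exact architecture; a BERT-based variant would replace the embedding-plus-GRU encoders with pretrained contextual encoders.

```python
# Hypothetical multi-encoder sketch (PyTorch). Each input type gets its own
# encoder; the decoder attends over every encoder's outputs and fuses the
# resulting context vectors by concatenation before predicting the next token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int, hid_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        outputs, _ = self.gru(self.embed(tokens))
        return outputs                               # (batch, seq_len, hid_dim)

class MultiEncoderDecoder(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 128,
                 hid_dim: int = 256, num_encoders: int = 3):
        super().__init__()
        self.encoders = nn.ModuleList(
            Encoder(vocab_size, emb_dim, hid_dim) for _ in range(num_encoders))
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Decoder input = previous token embedding + one context per encoder.
        self.gru_cell = nn.GRUCell(emb_dim + num_encoders * hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def attend(self, query, enc_outputs):
        # Dot-product attention of the decoder state over one encoder's outputs.
        scores = torch.bmm(enc_outputs, query.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)           # (batch, seq_len)
        return torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)

    def forward(self, inputs, prev_token, state):
        # inputs: list of token tensors, one per encoder (text, triples, answer).
        enc_outputs = [enc(x) for enc, x in zip(self.encoders, inputs)]
        contexts = [self.attend(state, enc) for enc in enc_outputs]
        rnn_in = torch.cat([self.embed(prev_token)] + contexts, dim=1)
        state = self.gru_cell(rnn_in, state)         # one decoding step
        return self.out(state), state                # logits, new state

if __name__ == "__main__":
    model = MultiEncoderDecoder(vocab_size=1000)
    text = torch.randint(0, 1000, (2, 20))     # simplified source text
    triples = torch.randint(0, 1000, (2, 12))  # linearized KG triples
    answer = torch.randint(0, 1000, (2, 3))    # the specified answer phrase
    state = torch.zeros(2, 256)
    prev_token = torch.zeros(2, dtype=torch.long)
    logits, state = model([text, triples, answer], prev_token, state)
    print(logits.shape)                         # torch.Size([2, 1000])
```

Encoding the answer with its own encoder keeps it from being diluted by the longer text inputs, and the per-encoder attention lets the decoder weight the answer, the source text, and the prior knowledge separately at each step.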
Keywords/Search Tags:Chinese question generation, deep learning, knowledge graph, multiple encoders, BERT