
Design and Implementation of a Question Generation Algorithm Based on Multi-Model Fusion

Posted on: 2020-01-03    Degree: Master    Type: Thesis
Country: China    Candidate: W H Wang    Full Text: PDF
GTID: 2428330599451437    Subject: Computer technology
Abstract/Summary:
A question answering (QA) system takes a natural-language question as input and infers its answer from massive structured data or unstructured text. At present, most QA systems require annotated question-answer pairs as training data, but precisely annotated data sets are expensive to build and limited in both size and domain coverage. This paper therefore explores question generation (QG): starting from a text paragraph that contains the answer fragment, and treating the knowledge points or facts in its sentences as answers, it generates informative questions from multiple angles in the reverse direction. The main work of this paper is as follows:

(1) On the basis of a survey of the state of QG research at home and abroad, a multi-model fusion QG algorithm is implemented. Given a text paragraph as input, two QG models and two QG optimization models are executed in parallel to obtain their sets of generated questions, which are fed into the multi-QG-model fusion module for scoring; the top-10 questions are returned as output.

(2) This paper proposes and implements a QG model based on question pattern prediction. Large-scale question-answer pairs are automatically crawled from community question answering websites and, after preprocessing, used as training data. QG is implemented in four steps: question pattern mining, question pattern prediction, question subject word selection, and question ranking.

(3) On top of the sequence-to-sequence generation framework, a lexically constrained decoding algorithm, LCD-GBS, based on grid beam search is implemented; combined with a self-trained list of semantically related words, it realizes end-to-end question generation (a simplified decoding sketch follows this abstract).

(4) This paper explores the correlation between QA and QG, two major NLP tasks, and studies their relationship with two approaches that treat them as joint learning tasks, aiming to improve both at the same time. The first approach regards them as dual tasks: it proposes and implements an algorithm framework that trains the QA and QG models simultaneously and explicitly uses their probabilistic correlation to guide the training process (a sketch of such a regularizer follows this abstract). The QA model is implemented with an RNN, and the QG model with the LCD-GBS sequence-to-sequence framework. The second approach regards them as collaborative rather than adversarial tasks. Unlike a standard GAN, in the Generative Collaborative Network (GCN) there is not always a competitive relationship between the QA model (discriminative model) and the QG model (generative model). Experiments show that the GCN improves both the QA and QG tasks, and that "collaboration" outperforms "competition" in QA accuracy.

(5) This paper implements the multi-QG-model fusion module: four evaluation criteria (grammar rules, subject rules, diversity, and relevance) are combined in a linear weighting model to score the generated candidate questions (a scoring sketch follows this abstract).

Finally, 30,000 randomly selected Quora question-answer pairs are used as the test set. Compared with the basic Seq2Seq model, the multi-model fusion QG algorithm improves the BLEU score by 26.3%, the correlation score by 47.8%, the type-correctness score by 52.0%, the correct-question score by 28.6%, the fluency score by 56.5%, the question-clarity score by 18.5%, and the question-diversity score by 70.0%.
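The following is a minimal, illustrative sketch of lexically constrained decoding with a grid beam search, in the spirit of the LCD-GBS algorithm described in item (3). The grid groups hypotheses by how many lexical constraints they have already satisfied; the `toy_logprobs` scorer and the single-token constraints are hypothetical stand-ins for the thesis's trained sequence-to-sequence model and its self-trained semantically related word list.

```python
# Sketch of grid-beam-search lexically constrained decoding (not the thesis code).
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    tokens: list       # tokens generated so far
    score: float       # cumulative log-probability
    met: frozenset     # indices of lexical constraints already satisfied


def grid_beam_search(next_token_logprobs, constraints,
                     beam_size=4, max_len=12, eos="</s>"):
    """Search over a grid whose cells group hypotheses by #constraints met."""
    C = len(constraints)
    grid = {c: [] for c in range(C + 1)}
    grid[0] = [Hypothesis(tokens=[], score=0.0, met=frozenset())]
    finished = []

    for _ in range(max_len):
        new_grid = {c: [] for c in range(C + 1)}
        for beam in grid.values():
            for hyp in beam:
                logps = next_token_logprobs(hyp.tokens)
                # Candidate extensions: the best open-vocabulary tokens plus
                # every still-unmet constraint token (forced expansion).
                best_open = sorted(logps, key=logps.get, reverse=True)[:beam_size]
                unmet = [constraints[i] for i in range(C) if i not in hyp.met]
                for tok in set(best_open) | set(unmet):
                    lp = logps.get(tok, math.log(1e-9))
                    met = hyp.met | {i for i, w in enumerate(constraints) if w == tok}
                    new_hyp = Hypothesis(hyp.tokens + [tok], hyp.score + lp,
                                         frozenset(met))
                    if tok == eos:
                        if len(met) == C:   # finish only once all constraints appear
                            finished.append(new_hyp)
                    else:
                        new_grid[len(met)].append(new_hyp)
        # Prune every grid cell to the beam width.
        grid = {c: sorted(b, key=lambda h: h.score, reverse=True)[:beam_size]
                for c, b in new_grid.items()}

    pool = finished or [h for b in grid.values() for h in b if len(h.met) == C]
    return max(pool, key=lambda h: h.score, default=None)


def toy_logprobs(prefix):
    """Hypothetical uniform toy distribution over a tiny vocabulary."""
    vocab = ["what", "is", "the", "capital", "of", "france", "?", "</s>"]
    return {w: math.log(1.0 / len(vocab)) for w in vocab}


best = grid_beam_search(toy_logprobs, constraints=["capital", "france"])
if best is not None:
    print(" ".join(best.tokens))
```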
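Item (4) states that the dual-task framework explicitly uses the probabilistic correlation between QA and QG to guide training. One common way to express this correlation, assumed here for illustration rather than taken from the thesis, is that the joint probability of a question q and an answer a can be factorized two ways, P(q)P(a|q) = P(a)P(q|a), and the squared gap between the two factorizations in log space can be added as a penalty to both training losses:

```python
def duality_regularizer(log_p_q, log_p_a_given_q, log_p_a, log_p_q_given_a):
    """Squared disagreement between the two factorizations of log P(q, a).

    log_p_q, log_p_a    : marginal log-probabilities (e.g. from language models)
    log_p_a_given_q     : log-probability from the QA model
    log_p_q_given_a     : log-probability from the QG model
    In a dual-learning setup this penalty would be added, with a small weight,
    to both the QA and QG training losses.
    """
    gap = (log_p_q + log_p_a_given_q) - (log_p_a + log_p_q_given_a)
    return gap ** 2
```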
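Item (5) scores candidate questions with a linear weighting of four criteria and keeps the Top-10. A minimal sketch, assuming each criterion has already been normalized to [0, 1] and using equal, purely illustrative weights:

```python
def fuse_and_rank(candidates, weights=(0.25, 0.25, 0.25, 0.25), top_k=10):
    """Linear-weighted fusion of per-criterion scores; returns the Top-k questions.

    `candidates` is a list of dicts such as
    {"question": "...", "grammar": 0.9, "subject": 0.7,
     "diversity": 0.5, "relevance": 0.8},
    where the field names and equal weights are assumptions for illustration.
    """
    w_grammar, w_subject, w_diversity, w_relevance = weights

    def score(c):
        return (w_grammar * c["grammar"] + w_subject * c["subject"]
                + w_diversity * c["diversity"] + w_relevance * c["relevance"])

    return sorted(candidates, key=score, reverse=True)[:top_k]
```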
Keywords/Search Tags: Question Generation, Question Answering, LCD-GBS, Dual Learning, Generative Collaborative Network