Research On Question Generation Based On End-to-end Model

Posted on:2022-11-16

Degree:Master

Type:Thesis

Country:China

Candidate:S Y Bai

GTID:2518306758991579

Subject:Computer system architecture

Abstract/Summary:

The Internet allows people to obtain and share information anytime and anywhere. As it has grown, the volume of information has reached a scale that people find difficult to filter, so extracting useful information from large amounts of data is of significant value. As a challenging task in natural language processing, question generation aims to generate appropriate questions from a given text and target answers. The task can automatically expand question answering datasets and therefore plays an important role in fields such as reading comprehension. It can also provide cold-start topics for dialogue systems. Improving the performance of question generation is therefore an important research goal.

Traditional question generation methods are mainly rule-based. Although they have achieved good results, formulating the rules requires considerable manual effort, and the rules apply only to datasets in specific formats. With the development of artificial intelligence, deep learning, which builds artificial neural networks to simulate human ways of thinking, has achieved breakthroughs in many fields. In question generation, deep learning methods mainly use end-to-end models and improve significantly on rule-based methods. Question generation based on end-to-end models has therefore become the main research direction.

This thesis mainly studies how to model the complex semantics of long input text, and focuses on fact-based question generation. This task transforms the input text into a knowledge graph, takes the target answer as a query node, extracts the fact path related to the answer, and explicitly models the answer-related semantic information to generate more relevant questions. However, as part of a knowledge graph, a path contains not only sequence information but also the local structure of triples, and existing methods have difficulty modeling this complex information completely. Moreover, the nodes on a path differ from the tokens in a text sequence: they are of two types, entity and relation. How to model the two types separately also remains to be solved.

Therefore, this thesis proposes a path encoding method based on semantic feature extraction and a hierarchical structure. The global sequence structure of the path and the distinct structure of its individual triples are extracted as the global and local features of the path, respectively, and the two features are then integrated into the path representation through the hierarchical structure. To distinguish the feature differences caused by node types during local feature extraction, a novel bidirectional convolutional neural network structure is proposed; a self-attention mechanism is introduced for global feature extraction. The input text is also encoded to obtain its context representation. Finally, the path representation and the input text representation are used to initialize the decoder, which generates the questions.

To evaluate the proposed method, comparative experiments against multiple end-to-end methods are conducted on the SQuAD dataset, and multiple ablation experiments are designed to verify each proposed module. The results show that the proposed method outperforms the other methods and that performance drops to varying degrees for each ablated model. This shows that the method can effectively model the semantic information in the path to assist question generation.

Building on the above method, and considering that the extracted features may contribute differently to the path representation under different input conditions, a gated feature fusion mechanism is designed to replace the concatenation (splicing) operation; it obtains a weighted feature representation by dynamically computing the proportion of each feature. To better demonstrate that the proposed method can model the answer-related semantic representation of the path and make full use of it, this thesis further proposes a multi-task learning framework, which uses the path features to predict the question word of the question to be generated and uses the predicted question word in place of the start tag as the initial input of the decoder. In this way, the answer-related semantic information guides the direction of the model's questioning. A hyperparameter is used to combine the question word prediction task and the question generation task during training. The experiments further demonstrate the effectiveness and feasibility of this extended method, and the question word prediction task also achieves good results.
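
The gated feature fusion and the joint training objective described above can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's implementation: the abstract does not give the exact formulas, so the element-wise sigmoid gate, the additive form of the joint loss, and all names (`gated_fusion`, `joint_loss`, `gate_weights`, `gate_bias`, `lam`) are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, global_feat, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (not taken from the thesis):
        gate  = sigmoid(W [local; global] + b)
        fused = gate * local + (1 - gate) * global

    gate_weights has one row per feature dimension and 2 * d columns,
    applied to the concatenation of the two feature vectors.
    """
    concat = local_feat + global_feat  # list concatenation: [local; global]
    fused = []
    for i, (loc, glob) in enumerate(zip(local_feat, global_feat)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        gate = sigmoid(z)
        # The gate decides, per dimension, how much of each feature to keep.
        fused.append(gate * loc + (1.0 - gate) * glob)
    return fused


def joint_loss(generation_loss, question_word_loss, lam):
    """Combine the two task losses; lam is the balancing hyperparameter.

    The weighted-sum form is an assumption; the abstract only states that
    a hyperparameter joins the two tasks during training.
    """
    return generation_loss + lam * question_word_loss


if __name__ == "__main__":
    local = [1.0, 2.0]
    global_ = [3.0, 4.0]
    weights = [[0.0] * 4, [0.0] * 4]  # zero weights give a gate of 0.5
    bias = [0.0, 0.0]
    print(gated_fusion(local, global_, weights, bias))
    print(joint_loss(2.0, 1.0, 0.5))
```

Unlike concatenation, the gate lets the proportion of local and global features vary with the input: with zero weights and bias the gate is 0.5 and the fused vector is the plain average, while a strongly positive gate pre-activation makes the result approach the local feature alone.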

Keywords/Search Tags:

Question generation, Bidirectional long short-term memory, Convolutional neural network, Self-attention mechanism, Multi-task learning

Related items

1	Text Sentiment Classification Based On Attention Mechanism
2	Research On Network Intrusion Detection Method Based On Bi-LSTM
3	Text Classification Research Based On Deep Neural Network And Attention Mechanism
4	Research On Chinese Text Detection In Natural Scene Based On Deep Learning
5	Research On The Stance Detection In Social Network Text Based On Deep Learning
6	Research On Relation Classification Via Bidirectional Long Short-Term Memory Networks With Attention Mechanism
7	Research On Short Text Emotional Tendency Analysis Based On Deep Learning
8	Research On Affective Visual Question Answering
9	The Cross-site Script Detection Based On Deep Learning
10	Level And Aspect-term Sentiment Analysis Based On Deep Learning