
Research On Multi-topic Text Generation With Global Historical Information

Posted on: 2022-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Yu
Full Text: PDF
GTID: 2518306350951839
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the explosion in data volume, people's demand for intelligent data processing has kept growing, and the original algorithms and technical frameworks are far from meeting it. A new data-processing paradigm was urgently needed, and artificial intelligence technology based on neural networks emerged in this context; today it affects all aspects of human life in various ways. As one of the most challenging research subjects in artificial intelligence, text generation occupies an important position in natural language processing. Although text generation has a long research history, most existing work addresses non-open-ended tasks with strong alignment properties, such as machine translation and short-text rewriting; there is far less research on open-ended text generation, and even less on open-ended generation in Chinese. Topic-based text generation is one such open-ended task: given a set of topics, the goal is to generate text that is fluent, readable, and comprehensively expresses the semantics contained in the topics. With the rise of deep learning, deep-learning-based methods have provided many new ideas for text generation.

As an emerging research subject, topic-based text generation still faces many problems. Because the task was proposed only recently, the public corpus that exists is of poor quality. Most existing text generation relies on the encoder-decoder framework of recurrent neural networks, which compresses all previous information into a fixed vector at each step, so its capacity to encode historical information is limited. Because the task is open-ended, the dataset exhibits a one-to-many phenomenon. Current models still express topic semantics inaccurately and incompletely, and log-likelihood-based generation suffers from exposure bias caused by the inconsistency between training and testing. In response to these issues, this thesis conducts an in-depth exploration and achieves the following innovative results:

(1) Designed and implemented a topic-based text generation method that introduces global historical information, comprising a topic attention module, which computes the attention weights of the current decoder hidden state over the topics to encode topic information, and a historical memory module, which explicitly records previously generated words as vectors and applies a new attention mechanism over them to obtain a global historical vector that guides generation (see the first sketch below).

(2) On the basis of the historical memory module, we propose a corrective attention model that introduces a richness measure to quantify how fully the generated text expresses the topic semantics. Semantic richness is defined via cosine similarity and used as a correction coefficient on the attention weights: topics whose semantics have so far been expressed less receive larger attention weights, and topics already expressed more receive smaller ones, avoiding both repeated and incomplete expression of the topics (see the second sketch below).
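To make the two modules in (1) concrete, here is a minimal PyTorch-style sketch; the class names, dimensions, and dot-product scoring are illustrative assumptions, not the thesis's exact implementation.

```python
# Hypothetical sketch of the topic attention and historical memory modules.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicAttention(nn.Module):
    """Attends from the decoder hidden state over the topic embeddings."""
    def __init__(self, hidden_dim, topic_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, topic_dim, bias=False)

    def forward(self, dec_hidden, topic_emb):
        # dec_hidden: (batch, hidden_dim); topic_emb: (batch, n_topics, topic_dim)
        scores = torch.bmm(topic_emb, self.proj(dec_hidden).unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)                   # (batch, n_topics)
        context = torch.bmm(weights.unsqueeze(1), topic_emb).squeeze(1)
        return context, weights

class HistoryMemory(nn.Module):
    """Explicitly stores embeddings of previously generated words and
    summarizes them into a global history vector via attention."""
    def __init__(self, hidden_dim, emb_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, emb_dim, bias=False)
        self.slots = []                                      # one (batch, emb_dim) entry per step

    def write(self, word_emb):
        self.slots.append(word_emb)

    def read(self, dec_hidden):
        query = self.proj(dec_hidden)                        # (batch, emb_dim)
        if not self.slots:
            return torch.zeros_like(query)
        memory = torch.stack(self.slots, dim=1)              # (batch, t, emb_dim)
        scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)
```

The point of the explicit memory is that, unlike a recurrent state that compresses all history into one fixed vector, every generated word keeps its own slot and remains individually addressable by attention.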
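The richness correction of (2) can then be sketched as a re-weighting step on the topic-attention weights. The particular mapping from cosine similarity to a correction coefficient below is an assumption for illustration; the abstract only specifies that under-expressed topics are boosted and over-expressed topics are damped.

```python
# Hypothetical richness-corrected attention re-weighting.
import torch
import torch.nn.functional as F

def richness_corrected_weights(weights, history_vec, topic_emb, eps=1e-8):
    # weights:     (batch, n_topics)      topic-attention weights
    # history_vec: (batch, dim)           global history vector
    # topic_emb:   (batch, n_topics, dim) topic embeddings (dim must match)
    richness = F.cosine_similarity(
        history_vec.unsqueeze(1), topic_emb, dim=2)      # (batch, n_topics), in [-1, 1]
    # Map richness into a correction coefficient in [0, 1]: a topic that is
    # already well expressed (richness near 1) gets its weight shrunk, while
    # an under-expressed topic (richness near -1) keeps its weight.
    correction = 1.0 - 0.5 * (richness + 1.0)
    corrected = weights * correction + eps
    return corrected / corrected.sum(dim=1, keepdim=True)
```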
(3) On the basis of the global historical information module, this thesis incorporates a training framework based on reinforcement learning and adversarial neural networks. From the reinforcement-learning perspective, sequential text generation is formalized as a sequential decision problem, and the previous log-likelihood objective is replaced by the expectation of the discriminator's penalty signal. The penalty signal of a multi-label discriminator guides the generator so that its output better matches the semantics of the given topics, and the evolution of the generator in turn promotes the evolution of the discriminator (a rough sketch of one training step follows below).

For the above ideas, we carry out experiments to verify them and compare them fairly with similar methods. The results show that the text generated by our model is more fluent and better matches the semantics of the given topics.
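To make contribution (3) concrete, here is a rough sketch of one adversarial training step using a REINFORCE-style policy gradient. The `generator.sample` and `discriminator(...)` interfaces are hypothetical, and the reward shaping (mean multi-label score with a mean baseline) is our assumption, not the thesis's exact formulation.

```python
# Hypothetical adversarial training step: the generator is treated as a
# policy and updated with the discriminator's signal as reward/penalty.
import torch

def adversarial_step(generator, discriminator, topics, real_text, gen_opt, disc_opt):
    # --- Generator update: policy gradient with discriminator reward ---
    tokens, log_probs = generator.sample(topics)          # sequence + per-token log-probs
    with torch.no_grad():
        # Per-topic match probabilities from the multi-label discriminator;
        # their mean serves as the scalar reward (the negated penalty signal).
        reward = discriminator(tokens, topics).mean(dim=1)        # (batch,)
    baseline = reward.mean()                              # simple variance-reduction baseline
    gen_loss = -((reward - baseline).unsqueeze(1) * log_probs).sum(dim=1).mean()
    gen_opt.zero_grad(); gen_loss.backward(); gen_opt.step()

    # --- Discriminator update: score real text high, generated text low ---
    real_score = discriminator(real_text, topics)
    fake_score = discriminator(tokens.detach(), topics)
    disc_loss = -(torch.log(real_score + 1e-8).mean()
                  + torch.log(1.0 - fake_score + 1e-8).mean())
    disc_opt.zero_grad(); disc_loss.backward(); disc_opt.step()
    return gen_loss.item(), disc_loss.item()
```

Alternating these two updates is what drives the co-evolution described above: a stronger discriminator yields a more informative penalty signal, which in turn pushes the generator closer to the semantics of the given topics.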
Keywords/Search Tags:text generation, attention mechanism, historical memory module, reinforcement learning, adversarial neural network