
Multi-Topic Text Generation Based On External Knowledge

Posted on: 2022-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Wang
Full Text: PDF
GTID: 2518306350451584
Subject: Computer Science and Technology

Abstract/Summary:
With the development of science and technology and the improvement of computing power in recent years, deep neural networks have been widely used in natural language processing. Automatic text generation is an important and challenging research direction in this field. This thesis aims to generate readable, topic-relevant text from a given set of topic words. The task poses three main difficulties.

First, the source information is insufficient. Topic-to-essay generation belongs to text-to-text generation, but unlike text summarization, paraphrasing, or machine translation, where the input text provides enough semantic information to produce the desired target text, it must generate paragraph-level text from only a few topic words. Whether judged by novelty or by topic completeness and topic relevance, this extreme lack of source information can lead to low-quality generated essays.

Second, the topic completeness and topic relevance of the generated text are insufficient. The former requires that the generated essay cover the semantics of all input topic words; the latter requires that every generated sentence stay closely tied to one or more of the topics.

Third, there is the long-term dependency problem. Most traditional text generation models are based on RNNs. Long-term dependency means that the current system state may be affected by system states from long ago, a problem that RNNs cannot solve.

To address insufficient source information, this thesis introduces an external knowledge base that supplies knowledge related to the topic words, enriching the source information and mitigating the poor text quality it would otherwise cause.

To address topic completeness and topic relevance, this thesis proposes a dynamically changing topic word weight vector: once a topic word appears in the output, its corresponding weight is reduced, preventing a single topic word from recurring excessively; combined with the attention mechanism, this guarantees the completeness and relevance of the topic words to a certain extent.

For the long-term dependency problem of RNNs, this thesis abandons the RNN structure and, drawing on the RNN idea of weight sharing and on the Transformer model, proposes the RTN (Recurrent Transformer Network) model. Because the RTN has fewer parameters than the Transformer, it improves training efficiency while solving the RNN's long-term dependency problem.

The model is trained on two public datasets, ESSAY and ZhiHu, and the results of BLEU-2 scoring and manual evaluation confirm that the proposed model generates higher-quality text on the multi-topic text generation task.
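To make the knowledge-expansion step concrete, the following is a minimal Python sketch of retrieving terms related to a topic word from an external knowledge base. ConceptNet's public REST API is used purely as a stand-in, since this abstract does not name the knowledge base the thesis actually uses; the function name expand_topic is likewise illustrative.

    # Sketch only: ConceptNet stands in for the (unnamed) knowledge base.
    import requests

    def expand_topic(word, limit=5):
        # Fetch edges touching the topic word and collect the labels of
        # related concepts as extra source information for generation.
        url = f"http://api.conceptnet.io/c/en/{word}"
        edges = requests.get(url, params={"limit": limit}).json().get("edges", [])
        related = set()
        for edge in edges:
            for node in (edge["start"], edge["end"]):
                label = node.get("label", "")
                if label.lower() != word:
                    related.add(label)
        return sorted(related)

    print(expand_topic("rain"))  # e.g. related weather concepts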
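The dynamic topic word weight vector can be sketched as follows, under assumptions the abstract does not spell out: each topic word starts at weight 1.0, a fixed decay factor (0.5 here) is applied whenever the word appears in the output, and the weights rescale the attention scores over topic words before a softmax. The class name, decay value, and score combination are all illustrative, not the thesis's exact formulation.

    import math

    class TopicWeights:
        def __init__(self, topic_words, decay=0.5):
            self.weights = {w: 1.0 for w in topic_words}  # initial weight per topic
            self.decay = decay

        def update(self, generated_token):
            # Reduce a topic word's weight once it appears in the output,
            # discouraging repeated coverage of a single topic.
            if generated_token in self.weights:
                self.weights[generated_token] *= self.decay

        def attention(self, scores):
            # Rescale raw attention scores by the current weights, then
            # renormalise with a softmax over the topic words.
            weighted = {w: scores[w] * self.weights[w] for w in self.weights}
            z = sum(math.exp(v) for v in weighted.values())
            return {w: math.exp(v) / z for w, v in weighted.items()}

    tw = TopicWeights(["spring", "rain", "harvest"])
    tw.update("rain")  # "rain" was just generated, so its weight decays
    print(tw.attention({"spring": 1.2, "rain": 1.5, "harvest": 0.8}))

After the update, attention mass shifts away from "rain" toward the topics not yet covered, which is how the mechanism promotes topic completeness.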
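The weight-sharing idea behind RTN can be illustrated with a minimal PyTorch sketch: rather than stacking N distinct Transformer layers, a single layer is applied N times, so the parameter count stays close to that of one layer. The hyperparameters below are placeholders, not the thesis settings, and this encoder-only sketch is only an illustration of the sharing principle.

    import torch
    import torch.nn as nn

    class RecurrentTransformerEncoder(nn.Module):
        def __init__(self, d_model=512, nhead=8, num_steps=6):
            super().__init__()
            # One shared layer reused num_steps times (RNN-style weight sharing).
            self.shared_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.num_steps = num_steps

        def forward(self, x):
            for _ in range(self.num_steps):
                x = self.shared_layer(x)  # same weights at every step
            return x

    x = torch.randn(2, 10, 512)                     # (batch, sequence, d_model)
    print(RecurrentTransformerEncoder()(x).shape)   # torch.Size([2, 10, 512])

Relative to a stack of six independent layers, this design stores roughly one-sixth of the layer parameters, which is the source of the training-efficiency gain the abstract claims.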
Keywords/Search Tags: Text generation, Transformer, External knowledge, Attention mechanism, Deep learning