
Research On Abstractive Text Summarization Model Based On Transformer

Posted on: 2021-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Zhou
Full Text: PDF
GTID: 2518306104995529
Subject: Software engineering
Abstract/Summary:
The text summarization task aims to generate a short, coherent natural-language summary of a source document while retaining its key information. Unlike traditional extractive text summarization, abstractive text summarization abstracts the content of the source text and then regenerates a new summary text. Current mainstream methods apply neural network models to abstractive summarization and generally build the summarization model on a sequence-to-sequence (Seq2Seq) framework. Seq2Seq-based summarization models usually introduce optimization components to address problems specific to abstractive summarization, such as the out-of-vocabulary (OOV) problem and the repeated-word problem; examples include the pointer-generator structure, the coverage mechanism, and the copy mechanism. These optimization methods are designed to imitate the process by which humans summarize documents, thereby improving the performance of summarization models.

The abstractive summarization model proposed in this thesis differs from the mainstream Seq2Seq-based models: it directly views text summarization as a language modeling problem. The model concatenates the input and output text into a joint sequence, encodes the joint sequence with a single shared Transformer used as the encoder, and thus forms a language model for the summarization task. With this approach, the parameters of the summarization model can be initialized from a pretrained Transformer-based language model, so the summarization model can exploit the powerful text representations learned from a large-scale corpus. In addition, the model uses a two-stage training procedure consisting of fine-tuning and end-task training. Fine-tuning continues training the pretrained language model on a corpus related to the text summarization task, while end-task training trains the final summarization model so that it can predict on new data and generate a summary from the source text. Combining the two stages makes the pretrained language model better adapted to the summarization task and enables the summarization model to learn textual semantics at a deeper level.

The proposed summarization model is trained and tested on the Large-scale Chinese Short Text Summarization dataset (LCSTS). Experiments and evaluations on this dataset show that the model achieves state-of-the-art ROUGE scores; moreover, the generated summaries retain the subject information of the source documents to a high degree, and the summary sentences are coherent.
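As a rough illustration of the joint-sequence formulation described above, the following minimal PyTorch sketch (not the thesis code) concatenates source and summary into one sequence, encodes it with a single causally masked Transformer, and computes the language-modeling loss only over the summary tokens. The class and function names (`JointSequenceSummarizer`, `summary_lm_loss`), the vocabulary size, and all hyperparameters are illustrative assumptions; the thesis initializes the Transformer from a pretrained language model, whereas this sketch builds a randomly initialized one for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointSequenceSummarizer(nn.Module):
    """Single shared Transformer over the concatenated [source ; SEP ; summary] sequence."""

    def __init__(self, vocab_size=21128, d_model=512, n_heads=8, n_layers=6, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) holding source tokens, a separator, then summary tokens.
        seq_len = token_ids.size(1)
        pos = torch.arange(seq_len, device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(pos)
        # Causal mask: each position attends only to earlier positions, so the
        # shared Transformer behaves as a language model over the joint sequence.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=token_ids.device),
            diagonal=1,
        )
        h = self.transformer(x, mask=causal)
        return self.lm_head(h)  # (batch, seq_len, vocab_size)


def summary_lm_loss(logits, token_ids, summary_start):
    """Next-token prediction loss restricted to the summary segment (positions >= summary_start)."""
    shift_logits = logits[:, :-1, :]
    shift_labels = token_ids[:, 1:].clone()
    shift_labels[:, : summary_start - 1] = -100  # ignore predictions of source/SEP tokens
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )


if __name__ == "__main__":
    model = JointSequenceSummarizer()
    batch = torch.randint(0, 21128, (2, 128))  # toy joint sequences
    loss = summary_lm_loss(model(batch), batch, summary_start=100)
    loss.backward()
```

Under this formulation, inference would feed the source text plus a separator as a prefix and generate summary tokens autoregressively, which corresponds to the end-task stage described above.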
Keywords/Search Tags: Abstractive text summarization, Neural network, Language model, Sequence to sequence