Research On Automatic Text Summarization Generation Technology Based On Deep Learning

Posted on: 2021-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: K J Guo
GTID: 2518306494995989
Subject: Software engineering
Abstract/Summary:
Automatic text summarization is an effective way to quickly obtain information from the massive amount of text on the Internet. This thesis surveys the research background of the task and finds that it has many application scenarios in production and daily life. Current work at home and abroad falls mainly into two categories: extractive summarization and generative (abstractive) summarization. Compared with extractive methods, generative methods produce summaries that are closer to how people read and write, and they have certain advantages in grammar and summary quality. This thesis therefore focuses on generative text summarization models based on the sequence-to-sequence framework. The main research contents are as follows.

First, this thesis investigates the research background of automatic text summarization and its significance for scientific research and daily life. It analyzes in detail the current state of research at home and abroad and identifies the bottlenecks and challenges the task currently faces. It introduces the basic word vector representation methods and language models used in natural language processing, analyzes the basic principles and advantages of recurrent neural networks and convolutional neural networks, and reviews the pre-trained models that have appeared in natural language processing in recent years.

Second, this thesis introduces a text summarization model based on the copy mechanism. A self-attention mechanism and a copy mechanism are added to the classic sequence-to-sequence framework to alleviate the out-of-vocabulary (OOV) word problem that is common in automatic text summarization tasks.

To address the difficulty neural networks have in retaining long text sequences, Chapter 4 introduces a text summarization model based on BERT, which captures rich semantic information from the source text on the encoder side. The word embedding representation of the source text is obtained through transfer learning and fine-tuned for the summarization task. At the same time, a gating unit filters the encoded information so that the generated summaries are more readable and of higher quality.

Finally, the proposed model is evaluated on the large-scale Chinese short text dataset LCSTS and the English dataset CNN/Daily Mail. Its effectiveness is verified in detail, and the summary generation process is analyzed through specific examples. The experimental results show that the proposed model improves on the baseline model, generates high-quality summaries, and mitigates some common problems in automatic text summarization tasks.
Keywords/Search Tags: deep learning, automatic text summarization, natural language processing, sequence to sequence model
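
The abstract does not give the exact formulation of its copy mechanism, so the following is only a minimal sketch of one common variant (a pointer-generator style mixture), showing how copying can handle OOV words. The function name `mix_copy_distribution`, the toy vocabulary, the attention weights, and the value of `p_gen` are all illustrative assumptions and are not taken from the thesis.

```python
def mix_copy_distribution(vocab, vocab_probs, source_tokens, attention, p_gen):
    """Combine a generation distribution with a copy distribution.

    vocab / vocab_probs: fixed vocabulary and the decoder's probabilities over it.
    source_tokens / attention: source words and the attention weight on each.
    p_gen: probability of generating from the vocabulary rather than copying.
    All inputs here are hypothetical toy values.
    """
    # Start from the generation distribution, scaled by p_gen.
    final = {word: p_gen * p for word, p in zip(vocab, vocab_probs)}
    # Add copy probability mass for every source token, including words
    # that are not in the fixed vocabulary (OOV words).
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy decoding step: "LCSTS" is not in the fixed vocabulary.
vocab = ["the", "model", "<unk>"]
vocab_probs = [0.5, 0.3, 0.2]
source_tokens = ["the", "LCSTS", "model"]
attention = [0.1, 0.8, 0.1]

final = mix_copy_distribution(vocab, vocab_probs, source_tokens, attention, p_gen=0.5)
best_word = max(final, key=final.get)
print(best_word)
```

In this toy step the source-only word `LCSTS` receives the largest probability, so the decoder can emit it by copying instead of falling back to `<unk>`.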