Research And Application Of Text Summarization Model Based On Deep Learning

Posted on: 2021-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chang
Full Text: PDF
GTID: 2428330620964023
Subject: Engineering
Abstract/Summary:
With the emergence and growth of new media in the Internet age, the speed at which people can manually process and distill information has fallen far behind the exponential growth of text. As a result, much of the latest information cannot be processed promptly, and people face more information than they can read. Automatic text summarization is an important area of natural language processing. It helps people condense, refine, and summarize the important information in a text and filter out what is useless, which improves reading and information-processing efficiency, reduces labor and material costs, and increases social productivity. Thanks to the rapid development of deep learning, abstractive summarization has made new progress and breakthroughs in recent years. In this thesis, I build a deep-learning-based abstractive summarization model that addresses several problems of earlier models of this kind and improves the quality of the generated summaries. The main contributions of this work are:

(1) A text summarization model based on improved attention is proposed. The encoder-decoder structure retains the ability to generate summaries automatically, while the embedding layer uses a generalized autoregressive pre-trained model to extract semantic features, learn contextual information thoroughly, and mine the intrinsic features of the text.

(2) To address the out-of-vocabulary (OOV) problem, a pointer-generator is used to copy words directly from the source text, which improves the accuracy of the summary.

(3) To address word repetition in generated summaries, two improved attention mechanisms are proposed and used. Historical cache attention reduces the weight of source positions that have already received high attention, so the model shifts toward positions it has attended to less and covers more words. Differential attention makes the model attend more closely to the summary words it has already generated, so it avoids repeating them when generating new words.

(4) To address exposure bias, a new loss function that incorporates reinforcement learning is proposed. It stabilizes training and avoids a large gap between the model's evaluation scores on the training set and on the test set.

(5) A text summarization system is designed and implemented with the improved-attention abstractive summarization model proposed in this thesis at its core. It automatically generates high-quality summaries of user-provided source text and presents them to users in the browser.

Finally, the model was trained and tested on the public CNN/Daily Mail dataset and evaluated with ROUGE metrics. The experimental results show that the model improves the ROUGE-1, ROUGE-2, and ROUGE-L scores, with a ROUGE-1 score at least 0.57 higher than those of other current advanced models, which demonstrates the effectiveness of the improvements proposed in this thesis. The system meets users' needs for summarizing articles and improves reading efficiency.
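
The copy mechanism in contribution (2) can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so the code below assumes the commonly used pointer-generator mixture, in which a gate `p_gen` blends the vocabulary distribution with the attention weights over the source. The function name `final_distribution`, the toy vocabulary, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities for one decoder step.

    p_gen: probability of generating from the fixed vocabulary (0 to 1)
    vocab_dist: dict mapping vocabulary word to generation probability
    attention: list of attention weights over source positions
    source_tokens: list of source words, aligned with `attention`
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        final[word] += p_gen * prob
    # Copy path: every source position contributes its attention weight,
    # so source words outside the vocabulary still get probability mass.
    for word, weight in zip(source_tokens, attention):
        final[word] += (1.0 - p_gen) * weight
    return dict(final)


# Toy step (assumed values): "Zhangjiajie" is OOV but present in the source.
vocab_dist = {"the": 0.5, "city": 0.3, "visited": 0.2}
source_tokens = ["she", "visited", "Zhangjiajie"]
attention = [0.1, 0.2, 0.7]

dist = final_distribution(0.4, vocab_dist, attention, source_tokens)
best = max(dist, key=dist.get)
print(best)
```

In this toy step the out-of-vocabulary word receives probability only through the copy path, which is how a pointer-generator can emit source words that the fixed vocabulary cannot produce.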
Keywords/Search Tags: text summarization, natural language processing, deep learning, attention, reinforcement learning