
Research On Automatic Abstractive Summarization Technology Based On Deep Learning

Posted on: 2022-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Liang
GTID: 2518306335484624
Subject: Computer software and theory
Abstract/Summary:
With the advent of the big data era, the global Internet industry has shown great vitality and resilience. As digital infrastructure and the digital economy develop rapidly, the number of Internet users in China, the Internet penetration rate, and average weekly time online have all grown quickly. As a result, users are flooded with massive amounts of online information every day, which causes a serious information overload problem. Solving this problem is urgent, and it requires extracting and filtering information at scale to reduce the burden on readers. The most important part of information extraction is automatic summarization, in which a computer produces a brief summary of the central content of a text, reflecting its main idea.

This thesis surveys the state of research on automatic summarization in China and abroad. Automatic summarization systems have moved from simple statistics-based extractive methods to machine learning methods based on feature engineering, and in recent years to sequence-to-sequence methods based on deep learning. Deep learning methods in particular, with their strong semantic representation and convenient end-to-end summary generation, have opened new opportunities for text summarization. However, existing work still has shortcomings that need to be resolved. This thesis therefore explores how to model summarization better with deep learning in order to improve the performance of automatic summarization.

To address the slow training of sequential neural networks, and drawing on lightweight network blocks to improve the model, this thesis proposes an automatic summarization method based on a fully convolutional neural network. The model input is combined with a position vector so that the convolutional network can access the position information of the text. A Compressing block and a Fusion block make the convolutional network more lightweight and strengthen its ability to mine features. Residual blocks are combined with a multi-step attention mechanism that models keywords and key sentences separately, in order to understand and integrate the semantic content of the text. GLU (Gated Linear Units) provide the nonlinearity, and a copy mechanism handles rare words. The model is evaluated on the CNN/Daily Mail dataset. The results show that it outperforms most current text summarization models based on recurrent neural networks, and that the improvement is consistent and statistically significant.

In addition, this thesis studies automatic summarization based on sequential neural networks in depth. For feature extraction, it first uses BERT-WWM (BERT Whole Word Masking) and its masking method to build contextual text features. Second, it incorporates statistical and rule-based text features from Chinese linguistics, such as part-of-speech features, position features, and topic features. Finally, it introduces a Coverage mechanism so that the summary covers the text content comprehensively. Taking into account the form of the human-written summaries in the dataset and the inherent characteristics of the text itself, a bidirectional LSTM (Long Short-Term Memory) is used to extract text information automatically. Compared with traditional methods, deep learning methods with rich feature sets improve the performance of automatic summarization.
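
The abstract names GLU and the Coverage mechanism without giving formulas. As a minimal sketch, assuming the commonly used formulations (a GLU that splits the channel vector into halves A and B and returns A * sigmoid(B), and a coverage loss that sums min(attention, coverage) over decoder steps), the two ideas might look as follows. The function names `glu` and `coverage_loss` are hypothetical and chosen for illustration; the thesis may define either component differently.

```python
from __future__ import annotations

import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def glu(features: list[float]) -> list[float]:
    """Gated linear unit over one position's channel vector.

    Assumed form: split the vector into halves A and B and return
    A * sigmoid(B), so B acts as a learned gate on A.
    """
    if len(features) % 2 != 0:
        raise ValueError("GLU input needs an even number of channels")
    half = len(features) // 2
    values, gates = features[:half], features[half:]
    return [a * sigmoid(b) for a, b in zip(values, gates)]


def coverage_loss(attention_steps: list[list[float]]) -> float:
    """Penalty for attending repeatedly to the same source positions.

    Assumed form: at each decoder step t, the coverage vector c_t is the
    sum of the attention distributions from earlier steps, and the loss
    adds sum_i min(a_t[i], c_t[i]).
    """
    if not attention_steps:
        return 0.0
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for attention in attention_steps:
        loss += sum(min(a, c) for a, c in zip(attention, coverage))
        coverage = [c + a for c, a in zip(coverage, attention)]
    return loss


if __name__ == "__main__":
    # A gate of 0 passes half of the value through, since sigmoid(0) = 0.5.
    print(glu([1.0, -3.0, 0.0, 0.0]))
    # Attending to the same source token twice is penalized.
    print(coverage_loss([[1.0, 0.0], [1.0, 0.0]]))
    # Attending to different tokens is not.
    print(coverage_loss([[1.0, 0.0], [0.0, 1.0]]))
```

In a convolutional encoder, the convolution would typically produce twice the desired number of channels so that the gate halves them again. The coverage term would be added to the generation loss with a weighting factor, which is a design choice not stated in the abstract.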
Keywords/Search Tags: Automatic summarization, CNN, Copy mechanism, BERT-WWM, Coverage mechanism