
Research On Automatic News Summarization Technology Based On Deep Learning

Posted on: 2022-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: W J Xu
GTID: 2518306746983109
Subject: Master of Engineering
Abstract/Summary:
With the continued growth of Internet technology, the information age has arrived. Text information in particular is growing exponentially, and the sheer volume of news text makes reading a real challenge. With clickbait headlines also on the rise, helping readers obtain valuable information from massive amounts of news is becoming more and more important. Automatic text summarization generates concise, important information from collections of news text, and it has become a research hot spot in China and abroad. Depending on how the sentences of the summary are composed, automatic summarization can be divided into extractive and generative (abstractive) summarization. The extractive method scores the importance of sentences in the original text and extracts the highest-scoring sentences to form a summary, while the generative method uses a series of natural language processing techniques to generate new sentences that form a more concise, condensed summary. Compared with extractive summaries, generative summaries are closer to the way humans write summaries, and they are characterized by simplicity, flexibility and diversity. In recent years, deep learning has driven the rapid development of generative automatic summarization. Current mainstream generative summarization mainly uses the seq2seq deep learning framework: the document is encoded as a vector by the seq2seq framework, and the document vector is then decoded to generate the summary.

This thesis proposes a sequence-to-sequence deep learning framework based on a pretrained model. First, the encoder encodes the input sequence and transforms it into a feature-bearing vector representation. The decoder then forms the summary through two stages of decoding. On the encoder side, ALBERT is used to convert the input sequence into a contextual representation. On the decoder side, the first stage generates a draft sequence using a multilayer stacked decoder with a multi-head attention mechanism. The second stage masks each word in the draft sequence and feeds the masked sequence to ALBERT to transform it into a contextual representation. A multilayer stacked decoder with multi-head attention then combines this representation with the contextual representation produced by the encoder to generate the final summary. The model is evaluated with ROUGE metrics and with human evaluation. Experimental results show that the model achieves good results on the LCSTS and Gigaword datasets.

In addition, this thesis designs and implements a news text summarization system. The thesis describes the overall architecture of the system, which includes a presentation layer and a data layer, and shows how the text preprocessing module and the algorithm module are integrated and how their inputs and outputs pass between the layers. The result is a front-end interface through which documents can be submitted and summarized. An example test shows that the system can generate high-quality automatic text summaries.
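The second decoding stage masks each word of the draft before passing it to ALBERT. The snippet below is only a minimal sketch of that input-preparation step, under assumptions the abstract does not confirm: the draft is already tokenized into a list of strings, the words are masked one position at a time, and the mask placeholder is `[MASK]`. The function name `masked_variants` is hypothetical, and the ALBERT encoding and the refining decoder are not shown.

```python
from typing import List

# Assumed placeholder; the abstract does not name the mask token.
MASK_TOKEN = "[MASK]"


def masked_variants(draft: List[str], mask_token: str = MASK_TOKEN) -> List[List[str]]:
    """Return one copy of the draft per position, with that position masked.

    Hypothetical helper: it only illustrates "mask each word in the draft";
    it is not the thesis's implementation.
    """
    variants = []
    for i in range(len(draft)):
        masked = list(draft)      # copy so the original draft is untouched
        masked[i] = mask_token    # hide the word at position i
        variants.append(masked)
    return variants


if __name__ == "__main__":
    draft = ["stocks", "rise", "sharply"]
    for variant in masked_variants(draft):
        print(" ".join(variant))
```

In a full model, each masked sequence would be encoded into a contextual representation and the refining decoder would predict the word at the masked position, attending to both that representation and the encoder output.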
Keywords/Search Tags: Text summarization, seq2seq, Transformer, deep learning, ALBERT