
Seq2seq Attention: Super-Long Chinese Text Summarization Model

Posted on: 2021-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Yao
Full Text: PDF
GTID: 2518306131992869
Subject: Statistics

Abstract/Summary:
Text summarization is a hot frontier problem in natural language processing, and existing methods suffer from low accuracy when summarizing long texts. Seq2seq (sequence-to-sequence) deep learning based on long short-term memory (LSTM) networks has achieved remarkable results in natural language processing. In this thesis, we build an improved Seq2seq network model for automatic summarization of long Chinese texts. The main work of this thesis consists of the following parts:

Firstly, an attention-based Seq2seq network model for super-long Chinese text summarization is presented and implemented.

Secondly, because research on long text summarization in China is relatively scarce and no suitable Chinese long-text summarization data sets are currently available, this thesis selects the news data sets of the Sogou Laboratory, preprocesses the data to obtain super-long text data sets suitable for network training, and constructs a vocabulary.

Thirdly, a comparison experiment on model index values: we randomly selected 50 samples as the test set, used ROUGE-1, ROUGE-2, and ROUGE-L as evaluation indices, used the improved Seq2seq model, the TextRank model, and the TF-IDF model to predict the test samples respectively, calculated the corresponding ROUGE values, and compared the ROUGE values across models. The experimental results showed that compared with the TextRank model, the ROUGE-1 value of the model in this thesis increased by about 0.64, the ROUGE-2 value by about 0.59, and the ROUGE-L value by about 0.66. Compared with the TF-IDF model, the ROUGE-1 value increased by about 0.67, the ROUGE-2 value by about 0.61, and the ROUGE-L value by about 0.69.

Fourthly, a comparison experiment on summary generation: we used the improved Seq2seq model, the TextRank model, and the TF-IDF model to generate summaries for the same super-long text and compared the outputs. Both the improved Seq2seq model and the TextRank model perform well, both can extract the semantic features indicating the central theme, and both outperform the TF-IDF model. In addition, the improved Seq2seq model generates a better summary, and its predicted summary is more concise.
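To make the architecture of the first part concrete, the following is a minimal sketch, assuming a PyTorch implementation, of an LSTM encoder-decoder with additive attention of the kind the thesis describes. The vocabulary size, embedding dimension, and hidden dimension are illustrative placeholders, not the configuration actually used in the thesis.

```python
# Minimal sketch (assumed PyTorch implementation) of an attention
# Seq2seq model: LSTM encoder, LSTM decoder, additive attention.
# All sizes are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionSeq2Seq(nn.Module):
    def __init__(self, vocab_size=50000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Decoder input = previous target embedding + attention context.
        self.decoder = nn.LSTM(emb_dim + hid_dim, hid_dim, batch_first=True)
        # Additive (Bahdanau-style) attention parameters.
        self.attn_W = nn.Linear(hid_dim * 2, hid_dim)
        self.attn_v = nn.Linear(hid_dim, 1, bias=False)
        self.out = nn.Linear(hid_dim, vocab_size)

    def attention(self, dec_h, enc_outs):
        # dec_h: (batch, hid); enc_outs: (batch, src_len, hid)
        dec_h = dec_h.unsqueeze(1).expand(-1, enc_outs.size(1), -1)
        scores = self.attn_v(torch.tanh(
            self.attn_W(torch.cat([dec_h, enc_outs], dim=-1))))
        weights = F.softmax(scores, dim=1)       # (batch, src_len, 1)
        return (weights * enc_outs).sum(dim=1)   # context: (batch, hid)

    def forward(self, src, tgt):
        enc_outs, (h, c) = self.encoder(self.embed(src))
        logits = []
        for t in range(tgt.size(1)):             # teacher forcing
            context = self.attention(h[-1], enc_outs)
            step_in = torch.cat([self.embed(tgt[:, t]), context],
                                dim=-1).unsqueeze(1)
            dec_out, (h, c) = self.decoder(step_in, (h, c))
            logits.append(self.out(dec_out.squeeze(1)))
        return torch.stack(logits, dim=1)        # (batch, tgt_len, vocab)

# Example: 2 source texts of 400 tokens, target summaries of 30 tokens.
model = AttentionSeq2Seq()
src = torch.randint(0, 50000, (2, 400))
tgt = torch.randint(0, 50000, (2, 30))
print(model(src, tgt).shape)                     # torch.Size([2, 30, 50000])
```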
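The ROUGE indices used in the third part can likewise be sketched from scratch, as below. This sketch assumes pre-segmented, whitespace-separated tokens and reports plain recall against the reference; a real Chinese evaluation would first segment the text with a word-segmentation tool and typically reports F1 scores.

```python
# Minimal from-scratch sketch of ROUGE-N and ROUGE-L recall on
# pre-segmented tokens; the example strings below are illustrative.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n):
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())         # clipped n-gram matches
    return overlap / max(sum(ref.values()), 1)   # recall against reference

def rouge_l(candidate, reference):
    # Longest-common-subsequence recall, the basis of ROUGE-L.
    m, n = len(candidate), len(reference)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if candidate[i] == reference[j] \
                else max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n] / max(n, 1)

ref = "模型 生成 的 摘要 更加 简洁".split()
cand = "生成 摘要 更加 简洁".split()
print(rouge_n(cand, ref, 1), rouge_n(cand, ref, 2), rouge_l(cand, ref))
```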
Keywords/Search Tags: Text summarization, Super-long text, Seq2seq, LSTM, Attention mechanism