Research On Abstract Text Summarization Based On Sequence-to-sequence

Posted on:2022-09-28

Degree:Master

Type:Thesis

Country:China

Candidate:C Tang

Full Text:PDF

GTID:2518306530980179

Subject:Electronics and Communications Engineering

Abstract/Summary:

The spread of the Internet has made information easier to circulate, but it has also left readers unable to cope with the sheer volume of it. Relieving information overload and speeding up the extraction of key information from text have become urgent needs in the era of big data. A summary condenses the essence of a text and improves the efficiency of reading and comprehension, which is why automatic text summarization emerged. With the rise of deep neural networks, their strong learning and representation capabilities have driven rapid progress in abstractive text summarization, especially in sequence-to-sequence frameworks based on deep learning. This thesis starts from the sequence-to-sequence framework, focuses on the key problems of abstractive text summarization, and explores how to generate more concise and fluent summaries.

To address the out-of-vocabulary (unregistered) words and the repetition that traditional sequence-to-sequence models produce when generating summaries, the thesis proposes an enhanced hybrid coding model. It improves the encoder of the traditional sequence-to-sequence model by combining a bidirectional LSTM with a convolutional gated unit, so that the encoder captures the global information of the source text better and the generated summary contains less repetition. In addition, the hybrid coding model is combined with a pointer network, which can copy words directly from the input source text and thereby handles out-of-vocabulary words.

To address the shortcomings of the feature extractors used in traditional sequence-to-sequence frameworks, the thesis proposes a new sequence-to-sequence text summarization method, FNP-TRF, based on the Transformer language model. The feature extractors of its encoder and decoder use full self-attention, which does not depend on the distance between positions in the input and output sequences and relies entirely on their global dependencies. In addition, relative position encoding and an n-stream self-attention mechanism are added to the Transformer, and the masking method in the decoder is modified. These changes address the Transformer's lack of directionality and allow the model to be trained efficiently. At each time step the model predicts the next N characters simultaneously, which strengthens its learning and representation capabilities. The proposed methods are evaluated on the Chinese dataset LCSTS and the English summarization dataset CNN/Daily Mail, and the experimental results verify the effectiveness of the models.
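The way a pointer network lets a summarizer quote out-of-vocabulary words from the source can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so this assumes the common pointer-generator style mixture: a gate `p_gen` weights the decoder's vocabulary distribution, and the remaining mass is spread over the source tokens according to the attention weights. The function name `copy_distribution` and all numbers are hypothetical and chosen only for illustration.

```python
def copy_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix generation and copy probabilities into one word distribution.

    Assumed formulation (not taken from the thesis):
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on positions holding w
    """
    # Generation part: scale the decoder's vocabulary distribution by p_gen.
    final = {word: p_gen * prob for word, prob in vocab_probs.items()}
    # Copy part: add attention mass to each source token, even if it is
    # missing from the fixed vocabulary.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy decoding step: "LCSTS" is not in the vocabulary but appears in the source.
vocab_probs = {"the": 0.5, "model": 0.3, "<unk>": 0.2}
source_tokens = ["the", "LCSTS", "model"]
attention = [0.1, 0.8, 0.1]

final = copy_distribution(0.4, vocab_probs, attention, source_tokens)
best_word = max(final, key=final.get)
print(best_word, round(final[best_word], 2))
```

In this toy step the source-only word receives the largest probability because most of the attention falls on it, which is the behaviour that lets the model emit a word it could not generate from its vocabulary alone.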

Keywords/Search Tags:

Seq2Seq, Abstract, Hybrid coding, Transformer

Related items

1	Research And Application Of Abstract Method Of Chinese Web Text Based On Seq2Seq Framework
2	Research On Text Summary Generation Technology Based On Deep Learning
3	Research And Application Of Source Code Summarization Based On Seq2seq
4	Research And Implementation Of Automatic Abstract Generation System Based On Deep Learning
5	Research Of Image Abstract Generation Based On Deep Learning
6	A Lightweight Multilingual Translation Model For Asian Languages
7	The Research On Chinese Automatic Abstract Generation Technology Based On Deep Learning
8	Research On Product Marketing Copywriting Generation Based On Transformer Improvement Model
9	Research On Deep Post-processing With Reference In Hybrid Video Coding Framewor
10	Research On Chinese Single Document Automatic Summarization Based On Deep Learnin