
Abstractive Text Summarization Based On Transformer Model

Posted on: 2021-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: J H Luo
GTID: 2428330611965656
Subject: Software engineering
Abstract/Summary:
With the development of Internet technology, text information has grown exponentially, and people must spend a great deal of time and energy processing and reading it. How to quickly capture the required information from massive amounts of text, and then apply that information effectively, is a problem that urgently needs to be solved. Automatic text summarization uses a computer to generate summaries from original documents and is an important technology in the field of natural language processing. In recent years, the sequence-to-sequence (seq2seq) model has been widely applied to text summarization, providing a feasible approach to abstractive summarization. A high-quality summarization system usually relies on a powerful encoder that extracts the important information from long input text, so that the decoder can generate the key summary content from the context produced by the encoder.

In this thesis, starting from the standard Transformer model, a quasi-recurrent neural network and a gating mechanism are introduced to improve the feature-extraction part of the model, and a pointer-generator network is fused in to improve the quality of the generated summaries. We propose an aggregation mechanism based on the improved Transformer model to address the challenges of text representation. The main contributions are as follows:

1) An improved Transformer model. The standard Transformer abandons the traditional recurrent neural network (RNN) and convolutional neural network (CNN) and relies solely on the attention mechanism for feature extraction; although positional encoding is added, the resulting position information is still not rich enough. By combining a quasi-recurrent neural network (QRNN), the model's ability to capture sequence order and local information is improved (see the first sketch below).

2) An improved multi-head attention combined with a gating mechanism. The Transformer model is formed by stacking multiple layers of modules; gated multi-head attention uses a trainable gate so that the model can select task-related words or features (see the second sketch below).

3) A copy-generation summarization model built on the improved Transformer. To address the problem of out-of-vocabulary (OOV) words, a pointer-generator network is introduced to build a hybrid model: according to the distribution of attention weights, a generation probability decides whether to select a word from the fixed vocabulary or to copy a word from the source text as the model's output, which alleviates the OOV problem (see the third sketch below).

Finally, a series of experiments verifies that the improved model achieves better results on the English text summarization dataset Gigaword and the Chinese text summarization dataset LCSTS.
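For contribution 1), the abstract does not give the layer equations, so the following is a minimal PyTorch sketch of a quasi-recurrent layer with standard fo-pooling (as in Bradbury et al.'s QRNN), illustrating how causal convolutions plus a lightweight recurrence can supply the sequence-order and local n-gram information that pure self-attention with positional encoding may under-represent. The class name, kernel size, and wiring into the Transformer are illustrative assumptions, not the thesis's exact design.

```python
import torch
import torch.nn as nn

class QRNNLayer(nn.Module):
    """Sketch of a quasi-recurrent layer: causal 1-D convolutions produce
    candidate, forget and output gates; fo-pooling mixes them along time."""
    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # padding = k - 1, then trim from the right to keep the conv causal
        self.conv = nn.Conv1d(d_model, 3 * d_model, kernel_size,
                              padding=kernel_size - 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, T, d)
        zfo = self.conv(x.transpose(1, 2))[:, :, :x.size(1)]
        z, f, o = zfo.transpose(1, 2).chunk(3, dim=-1)
        z, f, o = torch.tanh(z), torch.sigmoid(f), torch.sigmoid(o)
        c = torch.zeros_like(z[:, 0])
        outputs = []
        for t in range(z.size(1)):              # fo-pooling recurrence
            c = f[:, t] * c + (1 - f[:, t]) * z[:, t]
            outputs.append(o[:, t] * c)
        return torch.stack(outputs, dim=1)      # (B, T, d)
```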
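For contribution 2), one plausible reading of "gated multi-head attention" is a trainable sigmoid gate, computed from the layer input and the attention output, that re-weights the attended features so task-relevant words are kept and the rest are damped. The sketch below follows that reading; the gating formula is an assumption rather than the thesis's exact definition.

```python
import torch
import torch.nn as nn

class GatedMultiHeadAttention(nn.Module):
    """Multi-head self-attention followed by an element-wise sigmoid gate
    (hypothetical formulation of the thesis's gated attention sub-layer)."""
    def __init__(self, d_model: int, num_heads: int, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads,
                                          dropout=dropout, batch_first=True)
        # gate computed from the sub-layer input and the attention output
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x, key_padding_mask=None):           # x: (B, T, d)
        attn_out, _ = self.attn(x, x, x, key_padding_mask=key_padding_mask)
        g = torch.sigmoid(self.gate(torch.cat([x, attn_out], dim=-1)))
        return g * attn_out                                 # gated features
```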
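For contribution 3), the copy mechanism appears to follow the standard pointer-generator formulation (See et al.): a scalar p_gen mixes the vocabulary distribution with the attention distribution scattered over the source token ids, so OOV source words remain reachable. The output layer below is a sketch under that assumption; the variable names and the way the extended vocabulary is built are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyGenerator(nn.Module):
    """Pointer-generator output layer (sketch): p_gen decides between
    generating from the fixed vocabulary and copying a source token."""
    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.vocab_proj = nn.Linear(d_model, vocab_size)
        self.p_gen_proj = nn.Linear(2 * d_model, 1)

    def forward(self, dec_state, context, attn_weights, src_ids, ext_vocab_size):
        # dec_state, context: (B, T, d); attn_weights: (B, T, S); src_ids: (B, S)
        p_vocab = F.softmax(self.vocab_proj(dec_state), dim=-1)      # (B, T, V)
        p_gen = torch.sigmoid(
            self.p_gen_proj(torch.cat([dec_state, context], dim=-1)))  # (B, T, 1)
        B, T, V = p_vocab.shape
        # extend the vocabulary distribution with zeros for source-only words
        extra = torch.zeros(B, T, ext_vocab_size - V, device=p_vocab.device)
        p_ext = torch.cat([p_gen * p_vocab, extra], dim=-1)
        # add the copy distribution at the source token positions
        copy = (1.0 - p_gen) * attn_weights
        index = src_ids.unsqueeze(1).expand(-1, T, -1)               # (B, T, S)
        return p_ext.scatter_add(-1, index, copy)                    # (B, T, V_ext)
```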
Keywords/Search Tags:text summarization, deep learning, Transformer, attention mechanism