
Research On Automatic Text Summarization Based On Self-Attention Mechanism

Posted on: 2020-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Zheng
Full Text: PDF
GTID: 2428330620458480
Subject: Computer technology
Abstract/Summary:
The purpose of automatic text summarization is to extract a concise, readable summary from a source text by means of an algorithmic model, so that people can obtain the information they need from massive amounts of text more quickly. Automatic text summarization is a difficult and challenging task, and there is currently no widely accepted model for summarizing long texts. Because the self-attention mechanism models long-range dependencies in a sequence well and can be computed in parallel, this thesis studies how to build a neural network from self-attention and then applies it to automatic text summarization experiments. The main research contents of this thesis are as follows:

(1) This thesis proposes LSAN (Lightweight Self-Attention Network), a network based on the self-attention mechanism in which the encoder and the decoder each use only one multi-head self-attention module. The main advantage of this network is that the self-attention module effectively models long-range dependencies in the sequence and the network can be parallelized. In addition, compared with other self-attention networks, the self-attention module shares its weights between the encoder and the decoder, so the network has fewer parameters and requires less computing resources. Experiments show that, compared with an LSTM (Long Short-Term Memory) network, LSAN improves the ROUGE-L score by 2.79 points, an increase of 10.6%, while still supporting parallel computation.

(2) Because LSAN does not adequately capture the order of the elements in the input sequence, this thesis proposes LSAN-RPR (Lightweight Self-Attention Network with Relative Position Representation), which embeds relative position representations of the sequence elements into the self-attention computation to strengthen the model's extraction of sequential features. Experimental results show that LSAN-RPR improves the ROUGE-L score by 1.2 points over LSAN, an increase of 4.1%.

(3) Because LSAN-RPR cannot generate out-of-vocabulary words when used for automatic text summarization, this thesis proposes LPSAN (Lightweight Pointer Self-Attention Network), which adds an attention layer to LSAN-RPR. When the decoder needs to produce an out-of-vocabulary word, the network can copy it from the source text according to the output of this attention layer. Experimental results show that LPSAN improves over other recent networks, and, more importantly, it can be parallelized, which makes model training more efficient.
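The abstract describes LSAN's encoder and decoder as sharing a single multi-head self-attention module. The following is a minimal illustrative sketch of that idea in PyTorch; the layer sizes, the residual/normalization wiring, and the exact way the decoder reuses the module are assumptions for illustration, not the thesis's actual configuration.

    import torch
    import torch.nn as nn

    class SharedSelfAttentionBlock(nn.Module):
        # One multi-head attention module whose weights are reused by both
        # the encoder and the decoder (illustrative sizes, not thesis settings).
        def __init__(self, d_model=256, n_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, x, memory=None, attn_mask=None):
            # Self-attention when memory is None (encoder side); attention over
            # the encoder output on the decoder side, using the same weights.
            kv = x if memory is None else memory
            out, _ = self.attn(x, kv, kv, attn_mask=attn_mask)
            return self.norm(x + out)

    shared = SharedSelfAttentionBlock()
    src = torch.randn(2, 20, 256)      # batch of source token embeddings
    enc = shared(src)                  # encoder pass
    tgt = torch.randn(2, 15, 256)      # shifted summary embeddings
    dec = shared(tgt, memory=enc)      # decoder reuses the same module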
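For LSAN-RPR, the relative position representation embedded into self-attention is, in the general literature (in the spirit of Shaw et al.'s relative position representations), a learned embedding per clipped relative offset added into the attention logits. The single-head sketch below illustrates that mechanism under those assumptions; the thesis's multi-head variant and hyperparameters may differ.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RelativePositionSelfAttention(nn.Module):
        # Single-head self-attention with learned relative-position embeddings
        # added to the attention logits (simplified sketch).
        def __init__(self, d_model=256, max_rel_dist=16):
            super().__init__()
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
            self.v = nn.Linear(d_model, d_model)
            self.max_rel_dist = max_rel_dist
            # One embedding per clipped relative offset in [-max_rel_dist, max_rel_dist]
            self.rel_k = nn.Embedding(2 * max_rel_dist + 1, d_model)

        def forward(self, x):
            B, T, D = x.shape
            q, k, v = self.q(x), self.k(x), self.v(x)
            # Pairwise relative offsets, clipped to the supported range
            pos = torch.arange(T)
            rel = (pos[None, :] - pos[:, None]).clamp(-self.max_rel_dist, self.max_rel_dist)
            rel_emb = self.rel_k(rel + self.max_rel_dist)            # (T, T, D)
            content = q @ k.transpose(-2, -1)                        # (B, T, T)
            position = torch.einsum('btd,tsd->bts', q, rel_emb)      # (B, T, T)
            attn = F.softmax((content + position) / D ** 0.5, dim=-1)
            return attn @ v

    layer = RelativePositionSelfAttention()
    out = layer(torch.randn(2, 20, 256))   # -> (2, 20, 256)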
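LPSAN's copy behaviour, as described, resembles the standard pointer/copy recipe: the decoder's distribution over the fixed vocabulary is mixed with the attention distribution over source positions, so that out-of-vocabulary tokens can still be produced by copying them from the input. The helper below is a hypothetical sketch of that mixing step; the names, the generation gate p_gen, and the exact gating used in LPSAN are assumptions.

    import torch
    import torch.nn.functional as F

    def copy_augmented_distribution(vocab_logits, attn_weights, src_ids, p_gen, extended_vocab_size):
        # vocab_logits: (B, V)  decoder output over the fixed vocabulary
        # attn_weights: (B, S)  attention over source positions (sums to 1)
        # src_ids:      (B, S)  source token ids in the extended vocabulary
        # p_gen:        (B, 1)  probability of generating from the fixed vocabulary
        B, V = vocab_logits.shape
        gen_dist = torch.zeros(B, extended_vocab_size)
        gen_dist[:, :V] = F.softmax(vocab_logits, dim=-1)
        copy_dist = torch.zeros(B, extended_vocab_size)
        copy_dist.scatter_add_(1, src_ids, attn_weights)   # accumulate copy probability per source token
        return p_gen * gen_dist + (1 - p_gen) * copy_dist

    B, V, S, ext_V = 2, 1000, 12, 1010
    dist = copy_augmented_distribution(
        torch.randn(B, V), F.softmax(torch.randn(B, S), dim=-1),
        torch.randint(0, ext_V, (B, S)), torch.rand(B, 1), ext_V)
    print(dist.sum(dim=-1))   # each row sums to 1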
Keywords/Search Tags:Deep Learning, Natural Language Process, Self-Attention Mechanism, Automatic Text Summarization