| Automatic text summarization plays an important role in helping people retrieve useful information quickly in an era of rapid Internet development and information overload. Existing text summarization techniques can generally be divided into extractive and abstractive summarization. This paper focuses on abstractive summarization, which is more difficult but yields higher-quality summaries. Most existing abstractive methods are based on the seq2seq architecture with an attention mechanism. They can generally produce short summaries of acceptable quality, but they still have shortcomings: when generating slightly longer summaries they tend to repeat words and phrases, and the resulting summaries are sometimes poor in grammar and semantics. To address these problems, we propose a decoder-pointer network structure built on the basic seq2seq model with attention. That is, a pointer network is trained in addition, whose role is to copy an appropriate word from its position in the source text to the output at the current time step. This copy mechanism is learned automatically through parameter optimization during training. In addition, we propose a coverage detection mechanism to handle duplicate words and phrases in the generated summary. Its main idea is to use the sum of the attention distributions over previous time steps as a coverage vector, to penalize cases where a single dimension of the attention distribution becomes particularly large, and to add a coverage loss term to the objective function. This addresses the problem of the attention mechanism focusing on only a few words and greatly reduces the occurrence of duplicate words or phrases in the generated summaries. Finally, the structure of a generative adversarial network (GAN) lets the generator and discriminator networks evolve together through adversarial training, so that the generator's output eventually comes very close to real samples. Based on this, we use the improved seq2seq network as the generator and introduce a summary text classifier built from word2vec and a CNN as the discriminator; the two are trained against each other iteratively to produce high-quality final summaries. To verify the validity and superiority of the proposed model, which fuses seq2seq with a GAN, we conduct summary generation experiments on three open standard datasets, Gigaword, DUC2003, and CNN/Daily Mail, and use ROUGE-1, ROUGE-2, and ROUGE-L as evaluation metrics. The experimental results show that our model achieves improvements of varying degree on all three datasets under these three metrics. |
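
The copy and coverage mechanisms described in the abstract can be illustrated with a minimal sketch. The abstract does not give exact formulas, so the soft gate `p_gen`, the min-overlap form of the coverage loss, and the function names `mix_with_copy` and `coverage_step` are all assumptions, borrowed from the commonly used pointer-generator formulation rather than taken from this paper.

```python
from typing import Dict, List, Tuple


def mix_with_copy(p_gen: float,
                  vocab_dist: Dict[str, float],
                  attention: List[float],
                  source_tokens: List[str]) -> Dict[str, float]:
    """Blend the decoder's vocabulary distribution with a copy distribution.

    Assumption: a scalar gate p_gen in [0, 1] weights generation against
    copying, and the attention weights act as the copy distribution over
    source positions.
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    # Probability mass assigned by generating from the fixed vocabulary.
    final = {word: p_gen * p for word, p in vocab_dist.items()}
    # Probability mass assigned by copying each source position.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


def coverage_step(coverage: List[float],
                  attention: List[float]) -> Tuple[float, List[float]]:
    """Return (coverage loss for this step, updated coverage vector).

    The coverage vector is the running sum of attention distributions from
    earlier decoder steps. Assumption: the loss is the overlap
    sum_i min(attention_i, coverage_i), which grows when the decoder
    attends again to positions it has already covered.
    """
    if len(coverage) != len(attention):
        raise ValueError("coverage and attention must have equal length")
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    updated = [c + a for c, a in zip(coverage, attention)]
    return loss, updated


if __name__ == "__main__":
    source = ["the", "cat", "sat"]
    attn = [0.7, 0.2, 0.1]

    # Copying lets out-of-vocabulary source words ("cat", "sat") be emitted.
    print(mix_with_copy(0.5, {"the": 0.5, "a": 0.5}, attn, source))

    # Attending to the same positions twice is penalized on the second step.
    cov = [0.0] * len(source)
    for step in range(2):
        step_loss, cov = coverage_step(cov, attn)
        print(step, step_loss, cov)
```

In a trained model, the per-step coverage loss would be scaled by a weighting hyperparameter and added to the usual negative log-likelihood objective; the weighting is likewise an assumption not specified in the abstract.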