Research And Design Of News Text Summarization Based On Generative Adversarial Net

Posted on:2022-01-09

Degree:Master

Type:Thesis

Country:China

Candidate:R Hua

Full Text:PDF

GTID:2518306341982299

Subject:Cyberspace security

Abstract/Summary:

As the mobile Internet has entered people's lives, a large amount of news text is generated. Extracting the important content from this mass of news text accurately and efficiently has become an urgent need in industry. Automatic text summarization can preserve the integrity of information while reducing the volume of text, thus improving the efficiency of accessing information. In the field of summary generation, the most widely used model is the sequence-to-sequence (Seq2Seq) model. Due to the flexibility of text summarization and the limitations of the network structure, the generated summaries are prone to word repetition, short sentences, grammatical errors and inconsistency with the meaning of the original text. To address these problems, this thesis proposes a news text summarization method based on generative adversarial networks that extracts summary information from Chinese and English news texts and improves the quality of the generated summaries. The main work is as follows.

1) An overall architecture for the text summarization algorithm based on generative adversarial networks is proposed: a Seq2Seq generator model and a two-input, two-loss-function discriminator model based on a pre-trained language model.

2) The input and output of the generator are improved. On the input side, word boundary information is added to the word-level input and prior knowledge is introduced. On the output side, a dynamic weighted softmax output is proposed to reduce the accumulation of model errors and the likelihood of multiple consecutive repeated meaningless words.

3) A two-input, two-loss-function discriminator model based on a pre-trained language model is proposed. It supervises the Seq2Seq generator model so that the generated summaries are highly readable and have the same meaning as the original text.

4) The models are evaluated on the Chinese data set LCSTS and the English data set Gigaword. The evaluation metrics are the ROUGE scores, statistics of the generated summary length, and manual ratings of the generated summary quality.

The experimental results show that each of the proposed methods improves the ROUGE scores, and that the ROUGE-1 and ROUGE-2 scores are higher than those of existing models. On the LCSTS data set, ROUGE-1 is 39.22 and ROUGE-2 is 26.10; on the Gigaword data set, ROUGE-1 is 37.10 and ROUGE-2 is 18.13. Compared with the Transformer-based Seq2Seq model, the length statistics of the generated summaries are closer to those of the reference summaries. The manual ratings also improve significantly: the number of summaries scoring 3 to 5 increases by 27.67% on the LCSTS data set and by 11.95% on the Gigaword data set. These results demonstrate that the proposed methods improve the ROUGE scores, alleviate to a certain extent the problems of word repetition, short sentences, grammatical errors and inconsistency with the meaning of the original text, and improve the quality of the generated summaries.
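As a rough illustration of what the ROUGE-1 and ROUGE-2 figures above measure, the following is a minimal sketch of n-gram overlap scoring between a generated summary and a reference summary. The function names `ngrams` and `rouge_n`, the whitespace tokenization and the example sentences are assumptions made here for illustration; the abstract does not say which ROUGE implementation or preprocessing the thesis used, and Chinese text such as LCSTS would need a different tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall and F1 for two whitespace-tokenized strings.

    Illustrative only: standard ROUGE toolkits apply extra preprocessing
    (for example stemming) that is omitted here.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences, not taken from LCSTS or Gigaword.
reference = "the cat sat on the mat"
candidate = "the cat sat on a mat"
print(rouge_n(candidate, reference, n=1))
print(rouge_n(candidate, reference, n=2))
```

Scores computed this way lie between 0 and 1; published results such as those above are conventionally reported multiplied by 100.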

Keywords/Search Tags:

text summarization, generative adversarial net, a dual input dual loss function discriminator, dynamic weighted softmax output, word boundary information input

Related items

1	Research On Adversarial Attack On Steganography Based On Dual-discriminator Generative Adversarial Network
2	Generate Images Based On Text Generated By The Adversarial Network
3	Research On The Input And Output System Of Dual-band Gyro-TWT
4	Research On Image Deblurring Method Based On Dual Discriminator Conditional Generative Adversarial Networks
5	Boundary Control Design Of A Flexible Rotatable Manipulator With Input-Output Constraints
6	Control Design Of 2DOF Servo Cloud-platform System Based On Dual-input Dual-output System Identification
7	The Research Of Digitally Assisted Dual-Input Doherty Power Amplifier
8	Control Design For Flexible Robotic Manipulator With Input And Output Constraints
9	Research On Text Generation Method Based On Improved Generative Adversarial Network
10	Research On Quasi-Optical Electron Cyclotron Maser With Dual Confocal Waveguide