With the development of technologies such as big data, artificial intelligence, and cloud computing, massive amounts of text data have accumulated on the Internet, and text summarization technology has emerged to extract key information from them. Generally speaking, a generated summary should be sufficiently informative, reflect the main content of the original text, and contain little redundancy. However, current abstractive text summarization still suffers from low generation quality and disfluent sentences. To address these shortcomings, an improved Pointer-Generator Network (PGN) model is proposed to raise the quality of abstractive summaries. The research content and innovative work are as follows:

(1) To address PGN's tendency to generate wrong words, missing words, and repeated words, a pre-extractor module based on the BERT model is proposed and introduced into PGN. By adding the BERT module to strengthen the dependency relationships among textual information, the improved model effectively alleviates the defect of partial-information preference.

(2) To address the gradient vanishing and gradient explosion that autoregressive models are prone to in generation tasks, a dual-channel pre-extraction encoding mode is designed. An improved BERT-based self-encoding method is combined with dilated convolution to form a dual-channel pre-extractor that processes the input data in parallel, reducing the cumulative effect of errors while enhancing the ability to extract textual information, thereby optimizing the performance of the PGN model. Experimental results show that the improved PGN-based model outperforms the original PGN on the Rouge-1, Rouge-2, and Rouge-L metrics, and that, compared with directly adding pre-trained BERT to the PGN, training time is shortened by about 24%.

(3) To remedy the insufficient decoding ability of the original PGN, the LSTM in the decoding stage of the pointer-generator network is replaced with an ON-LSTM to extract the hierarchical information within sentences, and a multi-head attention mechanism is integrated into the decoding stage to further strengthen the ability to decode textual information. Finally, the Flooding technique is applied to the loss function so that the model can escape local optima and find better model parameters. Experimental results show that the summaries generated by the improved PGN contain significantly fewer wrong and missing words, and the Rouge-1 score reaches 34.6.

In summary, this thesis conducts a series of explorations of abstractive text summarization technology and provides an effective technical approach for pointer-generator networks to address the issue of errors and omissions. It has potential value for practical applications of abstractive text summarization.
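To illustrate the dilated-convolution channel of the dual-channel pre-extractor in contribution (2), the sketch below implements a minimal causal 1-D dilated convolution over a sequence of token feature vectors in plain Python. With dilation d and kernel size k, position t aggregates positions t, t-d, ..., t-(k-1)d, so stacking layers with growing dilation enlarges the receptive field without adding parameters. All names and the toy inputs are illustrative assumptions, not the thesis's actual implementation.

```python
def dilated_conv1d(x, kernel, dilation=1):
    """Causal 1-D dilated convolution.

    x      : list of token feature vectors, shape (seq_len, dim)
    kernel : list of k weight vectors, shape (k, dim)
    Returns one scalar feature per position; position t sees
    positions t, t - dilation, t - 2*dilation, ... (padding with zeros).
    """
    out = []
    for t in range(len(x)):
        s = 0.0
        for j in range(len(kernel)):
            idx = t - j * dilation
            if idx >= 0:  # implicit left zero-padding
                s += sum(w * v for w, v in zip(kernel[j], x[idx]))
        out.append(s)
    return out

# Toy example: dim-1 "embeddings" and an all-ones kernel make the
# receptive field visible: out[4] = x[4] + x[2] + x[0] = 5 + 3 + 1 = 9.
features = dilated_conv1d([[1.0], [2.0], [3.0], [4.0], [5.0]],
                          [[1.0], [1.0], [1.0]], dilation=2)
```

In the dual-channel design described above, such a convolutional channel would run in parallel with the BERT-based channel, and their outputs would be combined before being fed to the PGN encoder.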
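The Flooding technique applied to the loss function in contribution (3) has a simple form: with a flood level b, the training loss L is replaced by |L - b| + b, so gradient descent proceeds normally while L > b but the gradient flips sign when L < b, keeping the loss hovering near b and helping the model escape sharp local minima. A minimal sketch (the flood level 0.1 is an illustrative choice, not necessarily the value used in the thesis):

```python
def flooding(loss, b=0.1):
    """Flooded loss |loss - b| + b (Ishida et al., 2020).

    Above the flood level b the value and gradient are unchanged;
    below it, the gradient's sign flips, pushing the loss back up to b.
    """
    return abs(loss - b) + b
```

In an autograd framework such as PyTorch, the same expression is applied to the scalar loss tensor before calling backward(), and the absolute value flips the gradient automatically whenever the raw loss dips below b.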