Research On Chinese Text Automatic Summarization Method Based On Contrastive Learning

Posted on: 2024-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Zhou
Full Text: PDF
GTID: 2568307157983519
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
In today’s society, people face a flood of information, and information overload has become the norm. Readers rarely have time to read everything, so condensing text data has become necessary. As an important technology in natural language processing, automatic text summarization analyzes large amounts of text and extracts the important information to generate accurate, concise summaries, which improves reading efficiency and eases information overload. With the continuous development of deep learning, automatic text summarization has made great progress, but current summarization models still face problems such as exposure bias, insufficient text feature extraction, and long-distance dependencies. To address these issues, this thesis studies and improves existing text summarization techniques. The main work is as follows:

Firstly, we propose a summary generation method that combines the Transformer with a temporal convolutional network (TCN). To address the shortcomings of traditional neural network models in extracting text features, we add temporal convolutional networks to the encoder of the Transformer model to capture more local semantic relationships in the input sequence and strengthen the capture of positional information, so that the model can generate more comprehensive summaries. Experimental results on the LCSTS dataset show that the proposed model improves the quality of the generated summaries.

Secondly, we propose a summary generation method that combines extractive and abstractive approaches. To improve the model’s ability to process long texts, we extract key sentences before generating the summary. We first use BERT to generate vector representations of sentences, then use an improved TextRank algorithm to extract key sentences, and finally feed those sentences into the UniLM model for summary generation. We conducted comparative experiments on the NLPCC2017 dataset, and the results show that our model improves on other models in terms of ROUGE metrics, verifying the effectiveness of the proposed method.

Finally, we adopt contrastive learning to improve the performance of existing summary generation models. Existing methods usually train sequence-to-sequence models with maximum likelihood estimation, and these methods often suffer from exposure bias. To alleviate exposure bias, we use beam search to generate multiple candidate summaries at the decoding stage of an existing summarization model, and then train a summary evaluation model with contrastive learning to score the candidates and select the final summary. Experimental results show that this method can improve the performance of existing models.
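
The candidate re-ranking step described in the last contribution can be illustrated with a small sketch. The abstract does not state the exact contrastive objective, so the pairwise margin ranking loss below is an assumption, chosen because it is a common way to train an evaluation model over ranked candidates. The function names `pairwise_ranking_loss` and `select_summary`, the default margin of 0.01, and the idea of ordering candidates by ROUGE against the reference are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def pairwise_ranking_loss(scores: Sequence[float], margin: float = 0.01) -> float:
    """Margin ranking loss over candidates that are ordered best-first.

    Assumption: scores[i] is the evaluation model's score for the i-th
    candidate, and the candidates were sorted beforehand by a
    reference-based quality measure (for example ROUGE), best first.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            # The better candidate (i) should outscore the worse one (j)
            # by a margin that grows with the gap between their ranks.
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


def select_summary(candidates: List[str], scores: Sequence[float]) -> str:
    """At inference time, return the candidate with the highest score."""
    best = max(range(len(candidates)), key=lambda k: scores[k])
    return candidates[best]


if __name__ == "__main__":
    # Scores that agree with the quality ordering incur no loss.
    print(pairwise_ranking_loss([0.9, 0.5, 0.1]))
    # Scores that invert the ordering are penalized.
    print(pairwise_ranking_loss([0.1, 0.5, 0.9]))
    print(select_summary(["cand A", "cand B", "cand C"], [0.2, 0.7, 0.1]))
```

In this sketch the scores would come from a trained evaluation model, and the candidates from beam search over an existing summarizer. Because the evaluator is trained on the model's own outputs rather than only on reference summaries, it sees the kind of text produced at inference time, which is the motivation the abstract gives for alleviating exposure bias.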
Keywords/Search Tags: Text summarization, Transformer, Temporal convolutional network, UniLM, Contrastive learning