
Research On Text Summarization Based On Deep Learning

Posted on: 2024-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Pang
Full Text: PDF
GTID: 2568307085964879
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
With the advent of the information age, people can access many types of information anytime and anywhere, including news, social media, and political and economic developments. This proliferation of information makes it difficult to quickly and accurately find what matters, so an effective technology is needed to help people summarize and understand information rapidly. Text summarization is such a technology: it extracts the key information from a long text and condenses it into a concise, clear summary, helping readers quickly grasp an article's main points. With the development of deep learning and natural language processing, text summarization has made significant progress in recent years. This thesis introduces techniques such as prompt learning and keyword extraction to help summarization models better understand long texts and generate more accurate summaries, and then adds attention and coverage mechanisms to improve the accuracy and readability of the generated summaries. The research in this thesis is as follows.

First, to address the difficulty of fine-tuning existing pre-trained models and the out-of-vocabulary problem in abstractive summarization, this thesis proposes a summarization method based on keyword extraction and prompt learning. Prompt learning is introduced by adding prompts to the input, transforming the downstream task into a text generation task: the input text is rewritten with hand-crafted templates, keywords are extracted from the rewritten text, and the extracted keywords are concatenated with it to construct a new input. The TF-IDF algorithm is then used to strengthen the model's focus on the extracted keywords. Experiments on the CNN/DM dataset show that the model improves the quality of generated summaries, with gains in ROUGE-1, ROUGE-2, and ROUGE-L.

Second, to address long-distance dependencies and repeated generation in summaries produced by existing models, attention and coverage mechanisms are added on top of the model above. The attention mechanism dynamically focuses on different positions of the input sequence during encoding, allowing long texts to be processed more effectively, while the coverage mechanism tracks previously generated vocabulary and uses its frequency of occurrence to inform the choice of the next word. Combining the two mechanisms helps the model generate more accurate and fluent summaries, avoiding repetition and the omission of key information. The model achieves good results on the public CNN/DM and XSum datasets, demonstrating that it effectively improves the quality of generated summaries.
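The input-construction step described above (template the source text, then prepend extracted keywords) can be sketched as follows. The template wording, the `[SEP]` separator, and the function name are illustrative assumptions; the thesis's exact templates are not given in the abstract.

```python
def build_model_input(document: str, keywords: list[str]) -> str:
    """Build a prompt-learning input: keywords + separator + templated article.

    Hypothetical sketch: a hand-crafted template recasts summarization as a
    text-generation task, and extracted keywords are concatenated in front.
    """
    # Artificial template transforming the downstream task into generation.
    templated = f"Summarize the following article. Article: {document} Summary:"
    # Concatenate the extracted keywords with the transformed input text.
    return " ".join(keywords) + " [SEP] " + templated


example = build_model_input(
    "The central bank raised interest rates by 25 basis points on Tuesday.",
    ["central bank", "interest rates"],
)
```

In a BART-style setup this string would be tokenized and fed to the encoder in place of the raw article.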
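The TF-IDF weighting used to pick out salient keywords can be illustrated with a minimal stdlib-only version; real systems would typically use a library implementation with smoothing and tokenization, so treat this as a sketch of the scoring idea, not the thesis's code.

```python
import math
from collections import Counter


def tfidf_keywords(docs: list[list[str]], doc_index: int, top_k: int = 3) -> list[str]:
    """Rank the tokens of docs[doc_index] by TF-IDF against the corpus."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    tf = Counter(docs[doc_index])
    total = len(docs[doc_index])
    # TF-IDF: frequent in this document, rare across the corpus.
    scores = {tok: (cnt / total) * math.log(n_docs / df[tok]) for tok, cnt in tf.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [tok for tok, _ in ranked[:top_k]]
```

A token like "news" that occurs in every document gets an IDF of zero and is never selected, which is exactly the behavior that steers the model's focus toward document-specific keywords.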
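The coverage mechanism's core bookkeeping, in the common formulation where the coverage vector is the sum of past attention distributions and a loss term penalizes re-attending to already-covered source positions, can be sketched as below. This follows the standard coverage-loss formulation; the thesis's exact variant may differ.

```python
def coverage_step(attn_history: list[list[float]], attn_t: list[float]) -> tuple[list[float], float]:
    """One decoding step of a coverage mechanism (illustrative sketch).

    attn_history: attention distributions from all previous decoder steps.
    attn_t: attention distribution at the current step.
    Returns the coverage vector and the coverage loss for this step.
    """
    n = len(attn_t)
    # Coverage vector: how much attention each source position has received so far.
    coverage = [sum(step[i] for step in attn_history) for i in range(n)]
    # Coverage loss: overlap between current attention and past coverage;
    # it is large when the decoder re-attends to already-covered positions,
    # discouraging repeated generation.
    loss = sum(min(a, c) for a, c in zip(attn_t, coverage))
    return coverage, loss
```

Attending again to the same position as before incurs a high loss, while shifting attention to fresh positions keeps it low, which is how the mechanism suppresses repetition.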
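The ROUGE-1 score reported in the experiments measures unigram overlap between a generated summary and a reference; a minimal F1 version (clipped counts, whitespace-tokenized input assumed) looks like this. ROUGE-2 and ROUGE-L replace unigrams with bigrams and longest common subsequences, respectively.

```python
from collections import Counter


def rouge_1_f(candidate: list[str], reference: list[str]) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    # Multiset intersection clips each token's count to its minimum in either side.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Published results normally come from a reference ROUGE implementation with stemming and stopword options; this sketch only shows what the metric counts.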
Keywords/Search Tags:Text summary, Prompt learning, Pre-trained model, BART, Keyword extraction