
Research On Abstractive Summarization Method Based On Deep Learning

Posted on: 2022-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Wang
Full Text: PDF
GTID: 2518306788956879
Subject: Automation Technology

Abstract/Summary:
The goal of natural language processing research is to enable machines to understand natural language. This "understanding" underpins many applications, such as question answering, reading comprehension, and text summarization. Text summarization produces a condensed summary based on an "understanding" of the source text; when readers face large volumes of text, it reduces the amount of reading and improves reading efficiency, so research on text summarization is of great significance. Summaries can be obtained by extraction or abstraction. Extractive summarization selects important content from the text and splices it into a summary, but if the extracted text features are not comprehensive, key information is lost. Abstractive summarization generates a summary from the context, which makes the summary richer, but irrelevant information in the source text may cause incorrect content to appear. This thesis mainly studies the abstractive model, which is closer to manual summarization. Reinforcement learning is used to incorporate the extractive method into the abstractive model, retaining the advantage that abstraction can consider the full text while exploiting extraction's ability to filter out interfering information. However, reinforcement learning models often use only Rouge as the reward signal, which measures only lexical matching and pays no attention to content. The research in this thesis covers the following two aspects:

(1) To address the loss of important information in the extraction stage, this thesis proposes splicing the feature representation of the source text with the feature representation produced by a Bi-LSTM, so that no information is discarded by selection: all of it enters the next layer of the model, preserving the important content of the text to the greatest extent. Adding a self-attention mechanism after feature splicing captures the semantic dependencies between parts of the text and retains more of the important features. Experimental results show that with feature splicing and the added self-attention mechanism, the generated summaries are more diverse and closer to the reference summaries.

(2) For the reinforcement learning model, using Rouge alone as the reward considers only lexical matching between the generated and reference summaries, not content similarity. This thesis adapts the evaluation metric BERTScore as part of the reward, combining Rouge with BERTScore so that both lexical matching and content similarity are taken into account. The results of the experiments and comparative studies show that summaries produced by this improved scheme not only achieve higher lexical matching but are also closer in content to the reference summaries.

Finally, building on this research, the thesis puts the theoretical methods into practice and develops a text summarization system, whose summarization module condenses the important content of a whole article.
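The splicing-plus-attention idea in point (1) can be sketched as follows. This is a minimal NumPy illustration, not the thesis's actual model: the dimensions, the random stand-in for Bi-LSTM hidden states, and the single-head attention are all assumptions made for the example.

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention over a
    sequence of feature vectors x with shape (seq_len, dim)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ x                               # attended features

seq_len, emb_dim, hidden = 5, 8, 6
rng = np.random.default_rng(0)

# Source-text embeddings and (stand-in) Bi-LSTM hidden states
# (forward and backward directions concatenated).
source_feats = rng.standard_normal((seq_len, emb_dim))
bilstm_feats = rng.standard_normal((seq_len, 2 * hidden))

# Splice the two representations so no information is filtered out...
spliced = np.concatenate([source_feats, bilstm_feats], axis=-1)

# ...then let self-attention weigh the semantic dependencies.
attended = self_attention(spliced)
```

Every position keeps both its raw source features and its contextual Bi-LSTM features; attention then reweights, rather than discards, information.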
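The combined reward in point (2) can be sketched as a weighted mix of a lexical score and a content-similarity score. This is an illustrative sketch only: the ROUGE-1 F1 approximation, the `alpha` weighting, and the dummy `content_sim` value (which would come from BERTScore in practice) are all assumptions, not the thesis's exact formulation.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap ROUGE-1 F1 between two token lists."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def combined_reward(candidate, reference, content_sim, alpha=0.5):
    """Mix lexical matching (ROUGE-1) with a content-similarity
    score such as BERTScore F1; alpha balances the two terms."""
    return alpha * rouge1_f1(candidate, reference) + (1 - alpha) * content_sim

cand = "the cat sat on the mat".split()
ref = "a cat sat on a mat".split()
# content_sim is a dummy value standing in for a BERTScore F1.
reward = combined_reward(cand, ref, content_sim=0.9)
```

A reward of this shape lets the policy gradient credit both word overlap and semantic closeness instead of lexical matching alone.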
Keywords/Search Tags:text summarization, extraction, abstraction, text feature, deep learning