
Research And Application Of Text Summarization Algorithm Based On Multi-task Attention Mechanism

Posted on: 2022-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: X M Qiu
GTID: 2518306725492974
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the development and popularization of Internet technology, text information such as news, legal documents, scientific papers, and policy documents has grown explosively. Enterprises, institutions, and individuals spend a great deal of time and energy extracting the key information from this text, so efficient and accurate text summarization is an important way to address the problem.

Text summarization techniques can be divided into extractive and abstractive summarization. Extractive summarization selects sentences directly from the original text and ranks them by importance to form the final summary. Abstractive summarization builds on a semantic understanding of the original text: it compresses and abstracts the original information and generates a summary that can use new vocabulary and varied phrasing. Each approach has advantages and limitations. Extractive summarization can pick out important sentences, but because the summary length is limited, the extracted sentences cannot fully cover the original content. Abstractive summarization is closer to how humans write summaries, can generate words that do not appear in the original text, and is more flexible, but its output is prone to factual errors and poor readability.

To address these problems, this thesis improves existing extractive and abstractive summarization algorithms using multi-task learning and attention mechanisms. It also designs and implements a prototype policy interpretation system, based on multi-task learning and extractive text summarization, for extracting the key points of policy documents. The main work and contributions of this thesis are as follows:

(1) Existing extractive summarization algorithms often select content that is insufficiently important and cover the original text incompletely. Working at sentence granularity and jointly considering word part of speech, sentence syntax, and text topic, this thesis proposes a multi-task learning based extractive summarization algorithm, Ext Summa MTL.

(2) Existing abstractive (generative) summarization models lack relevant background knowledge, and the summaries they generate can deviate from the meaning of the original text. To address this, this thesis proposes an abstractive summarization algorithm based on a multi-task attention mechanism, Abs Summa MTL. The algorithm uses multiple data sets to enrich background knowledge. It also proposes a cross-task attention mechanism that, at each step of semantic encoding, computes a weighted sum over the auxiliary tasks according to their importance to the summary, which improves generation on the main summarization task. In addition, the Pointer mechanism and the Coverage mechanism are used to handle the Out-of-Vocabulary problem and the repeated-generation problem, respectively.

(3) To address the lack of Chinese extractive summarization data sets, this thesis constructs a policy-oriented Chinese extractive summarization data set, adding to the types and number of data sets available for Chinese extractive summarization algorithms.

(4) Using this policy data set, this thesis designs and implements a Chinese extractive summarization algorithm and system for policy interpretation. The system extracts important viewpoints and policy requirements from long policy texts, which helps make published policies easier to understand and disseminate.
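The abstract names the Pointer and Coverage mechanisms without giving formulas. As a minimal sketch, assuming the widely used pointer-generator formulation (which may differ in detail from Abs Summa MTL), one decoder step can be written as follows. The function names, the toy vocabulary, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def final_distribution(
    p_gen: float,
    vocab_dist: Dict[str, float],
    attention: List[float],
    source_tokens: List[str],
) -> Dict[str, float]:
    """Mix the generator's vocabulary distribution with a copy distribution.

    Assumed formulation: P(w) = p_gen * P_vocab(w)
                               + (1 - p_gen) * sum of attention on source positions equal to w.
    Source words missing from the vocabulary still receive probability mass,
    which is how the pointer handles Out-of-Vocabulary words.
    """
    mixed = {word: p_gen * prob for word, prob in vocab_dist.items()}
    for token, weight in zip(source_tokens, attention):
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed


def coverage_step(
    coverage: List[float], attention: List[float]
) -> Tuple[float, List[float]]:
    """Return the coverage loss for this step and the updated coverage vector.

    Coverage is the running sum of past attention. The loss
    sum_i min(attention_i, coverage_i) penalizes attending again to
    positions that were already attended to, discouraging repetition.
    """
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    updated = [c + a for a, c in zip(attention, coverage)]
    return loss, updated


# Toy example (hypothetical values): "subsidy" is not in the vocabulary.
vocab_dist = {"the": 0.5, "policy": 0.3, "<unk>": 0.2}
source_tokens = ["policy", "subsidy", "policy"]
attention = [0.2, 0.7, 0.1]

dist = final_distribution(0.6, vocab_dist, attention, source_tokens)
print(sorted(dist.items()))

coverage = [0.0, 0.0, 0.0]
loss_1, coverage = coverage_step(coverage, attention)  # nothing covered yet
loss_2, coverage = coverage_step(coverage, attention)  # same attention repeated
print(loss_1, loss_2)
```

In this toy run the out-of-vocabulary word "subsidy" receives probability only through the copy term, and repeating the same attention pattern on the second step produces a larger coverage loss than the first step.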
Keywords/Search Tags: Text summary, multi-task learning, attention mechanism, policy interpretation