
Research On Abstractive-Based Automatic Summarization Methods

Posted on: 2020-07-07  Degree: Master  Type: Thesis
Country: China  Candidate: X T Wang  Full Text: PDF
GTID: 2428330590960628  Subject: Computer Science and Technology
Abstract/Summary:
Automatic summarization is the process of distilling the crucial information from a document to produce an abridged version for a particular task and user. Applications such as document abstract generation, headline generation, and answering complex questions have recently been widely studied. Abstractive summarization models can automatically generate condensed, concise summaries that retain the salient information of a source text. In practice, however, generating a plausible, high-quality abstractive summary is challenging, because the computer lacks the language capability and prior human knowledge needed to understand the entire document and then produce a condensed summary that highlights its main points.

Recently, sequence-to-sequence models have achieved great improvements in dialogue systems and neural machine translation, which offers a viable approach to abstractive summarization. These models nevertheless face significant challenges. First, sequence-to-sequence abstractive models behave unpredictably when generating summaries and can fail to capture the key points of the original document. Second, from our observation, human-written summaries are closely related to document category: different categories call for attention to different kinds of key information. Sequence-to-sequence summarization methods tend to neglect category information, which leads to an inadequate understanding of the document. Last but not least, while abstractive models try to retain the key information during decoding, the readability of the summary is also crucial, and generated summaries often suffer from repeated text, grammatical mistakes, and poor fluency. These issues need to be solved, because readers acquire information easily only when sentences are well written.

This thesis investigates new methods for improving the understanding of the key points and information of documents, generating informative summaries, and improving the readability of generated summaries. We propose two abstractive summarization methods. First, we propose a Generative Adversarial Network for Abstractive Text Summarization with Multi-task Constraint (GAN-ATSMT). Concretely, we design a novel generative model and discriminative models, and we train the summarization model together with two subtasks, category classification and syntax annotation, under a multi-task learning framework, which improves the model's understanding of category information and the readability of its summaries. Second, we propose fusing an external language model into abstractive summarization. The proposed model learns the knowledge and language information contained in the external language model, which helps the internal neural language model generate high-quality summaries. Because the external language model is good at language generation and can be relied on for fluency, the internal language model can focus its capacity on fusing the different parts of semantic information in the source sentences. With this approach, we improve the quality of generated summaries both semantically and syntactically by a large margin.
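
The abstract does not state how the external language model is combined with the summarizer, so the following is only an illustration of the general idea, not the thesis's model. It uses a simpler, well-known substitute called shallow fusion, in which the external language model's log-probabilities are added, with a weight, to the summarizer's log-probabilities at each decoding step. The function names, the toy vocabulary, the scores, and the `lm_weight` value are all assumptions made for this sketch.

```python
import math


def log_softmax(scores):
    """Convert raw scores into log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def fuse_step(summarizer_scores, lm_scores, lm_weight=0.3):
    """Shallow fusion for one decoding step (illustrative assumption).

    Adds the weighted external-LM log-probabilities to the summarizer's
    log-probabilities. lm_weight is a hypothetical tuning parameter.
    """
    summ_logp = log_softmax(summarizer_scores)
    lm_logp = log_softmax(lm_scores)
    return [s + lm_weight * l for s, l in zip(summ_logp, lm_logp)]


def pick_token(vocab, summarizer_scores, lm_scores, lm_weight=0.3):
    """Greedy choice of the next token under the fused score."""
    fused = fuse_step(summarizer_scores, lm_scores, lm_weight)
    best = max(range(len(vocab)), key=fused.__getitem__)
    return vocab[best]


if __name__ == "__main__":
    # Toy example: the summarizer slightly prefers a disfluent token,
    # while the external language model strongly prefers the fluent one.
    vocab = ["fell", "felled", "the"]
    summarizer_scores = [2.0, 2.1, 0.5]
    lm_scores = [3.0, 0.0, 1.0]

    print(pick_token(vocab, summarizer_scores, lm_scores, lm_weight=0.0))  # felled
    print(pick_token(vocab, summarizer_scores, lm_scores, lm_weight=0.5))  # fell
```

With the weight set to zero the summarizer decides alone; with a positive weight the external model can tip a close decision toward the more fluent token. The thesis describes the internal model as learning from the external one, which suggests a tighter, training-time coupling than this decoding-time sketch.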
Keywords/Search Tags: Abstractive Summarization, Sequence-to-Sequence Learning, Generative Adversarial Neural Network, Multi-task Learning, Model Fusion