Automatic summarization aims to automatically generate summaries that convey the core information of a source document. In the information age, where massive amounts of data are produced, it not only saves readers a great deal of time but can also be widely applied in fields such as news aggregation, company reports, and scientific papers. Automatic summarization falls into two categories: extractive summarization and abstractive summarization. Compared with extractive summarization, abstractive summarization comprehends the semantics of the source document before summarizing it, which is closer to the way humans summarize; the generated summaries are also more readable and coherent and suit a wider range of application scenarios. Neural abstractive summarization models have made notable progress in summary quality. However, recent work shows that the summaries generated by these models often contain factual errors, that is, facts stated in the summary are inconsistent with the source document. To alleviate this problem, this dissertation studies the task from three aspects:

(1) Employing Internal and External Knowledge for Factuality-oriented Abstractive Summarization. Whereas previous studies consider only internal factual errors or only external factual errors, this dissertation treats the two as a whole and alleviates both kinds of factual errors simultaneously by introducing internal and external knowledge. First, external knowledge is introduced through ERNIE combined with a knowledge graph; second, semantic role information extracted from the source document serves as internal knowledge; finally, the two are fused through an interactive attention module so that internal and external factual errors are alleviated at the same time. Experimental results on the CNN/DM and XSUM datasets show that the proposed method outperforms several state-of-the-art baselines.
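To make the fusion step concrete, the following is a minimal PyTorch sketch of an interactive attention module of this kind. It assumes the external knowledge is an ERNIE-style, knowledge-graph-enriched token encoding of the source and the internal knowledge is a set of pooled semantic-role span vectors; all class names, tensor shapes, and the gating scheme are illustrative assumptions rather than the dissertation's exact architecture.

import torch
import torch.nn as nn

class InteractiveAttentionFusion(nn.Module):
    """Hypothetical fusion of external (ERNIE/KG) and internal (SRL) knowledge."""
    def __init__(self, hidden_size: int = 768, num_heads: int = 8):
        super().__init__()
        # cross-attention in both directions: tokens attend to roles, roles attend to tokens
        self.ext_to_int = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.int_to_ext = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, external_states, internal_states):
        # external_states: (batch, src_len, hidden)  e.g. ERNIE encoder outputs
        # internal_states: (batch, n_roles, hidden)  e.g. pooled semantic-role span vectors
        ext_attended, _ = self.ext_to_int(external_states, internal_states, internal_states)
        int_attended, _ = self.int_to_ext(internal_states, external_states, external_states)
        # broadcast the role-level view back onto the token sequence and gate the two sources
        pooled_int = int_attended.mean(dim=1, keepdim=True).expand_as(ext_attended)
        gate = torch.sigmoid(self.gate(torch.cat([ext_attended, pooled_int], dim=-1)))
        return gate * ext_attended + (1 - gate) * external_states

In such a design the fused representation would serve as the decoder's memory, so signals from both knowledge sources can influence every generated token.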
(2) Adversarial Fine-grained Fact Graph for Factuality-oriented Abstractive Summarization. Given that existing research lacks an analysis of the linguistic structure behind factual errors, this dissertation starts from fine-grained categories of factual errors, analyzes the characteristics of each category, and adopts targeted strategies. First, it analyzes the fine-grained classification of factual errors and focuses on semantic frame errors and co-reference errors, which account for the largest share. Then, it builds a semantic frame fact graph and a co-reference fact graph to represent the factual information in the source document or summary. Finally, it uses a generative adversarial network to integrate this factual information implicitly into the abstractive summarization model. Experimental results on the CNN/DM and XSUM datasets show that the proposed method outperforms several state-of-the-art baselines.

(3) Factual Instance Discrimination for Factuality-oriented Abstractive Summarization. Since previous factuality-oriented abstractive summarization models only integrate factual information and ignore the causes of factual errors, this dissertation proposes a multi-task abstractive summarization model that judges whether the internal facts in a summary are correct while generating it. First, this dissertation constructs a factual summarization dataset that treats dependency relations as factual instances, using a series of data augmentation strategies. Then, through multi-task learning, the model not only generates the summary but also determines the correctness of each factual instance, as sketched at the end of this abstract. Experimental results on the CNN/DM and XSUM datasets show that the proposed method outperforms several state-of-the-art baselines.

Through the above three methods, this dissertation alleviates the factual consistency problems in abstractive summarization and improves performance on the task to a certain extent.
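As a concrete illustration of the multi-task objective in (3), the following PyTorch sketch combines the standard generation cross-entropy with a binary discrimination loss over factual instances. It assumes each instance is a (head, dependent) token pair from a dependency relation, labelled correct or corrupted by the data augmentation procedure; the class name, tensor shapes, and loss weighting are assumptions made for illustration, not the dissertation's exact implementation.

import torch
import torch.nn as nn

class FactualInstanceMultiTaskLoss(nn.Module):
    """Hypothetical joint objective: summary generation + factual instance discrimination."""
    def __init__(self, hidden_size: int = 768, pad_token_id: int = 0, alpha: float = 0.5):
        super().__init__()
        self.generation_loss = nn.CrossEntropyLoss(ignore_index=pad_token_id)
        # classifies a (head, dependent) state pair as factually correct or corrupted
        self.instance_classifier = nn.Linear(2 * hidden_size, 2)
        self.instance_loss = nn.CrossEntropyLoss()
        self.alpha = alpha  # weight of the auxiliary discrimination task

    def forward(self, lm_logits, target_ids, decoder_states, head_idx, dep_idx, instance_labels):
        # lm_logits: (batch, tgt_len, vocab); target_ids: (batch, tgt_len)
        gen = self.generation_loss(lm_logits.reshape(-1, lm_logits.size(-1)), target_ids.reshape(-1))
        # gather decoder states of the head and dependent tokens of each factual instance
        batch_idx = torch.arange(decoder_states.size(0)).unsqueeze(-1)
        pairs = torch.cat([decoder_states[batch_idx, head_idx], decoder_states[batch_idx, dep_idx]], dim=-1)
        disc = self.instance_loss(self.instance_classifier(pairs).reshape(-1, 2), instance_labels.reshape(-1))
        return gen + self.alpha * disc

The auxiliary term encourages the decoder states to encode whether each dependency fact is supported by the source, while leaving the summarizer's inference procedure unchanged.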