
Research And Application Of Factual Correctness Technology For Automatic Text Summarization Based On Deep Learning

Posted on: 2022-11-28    Degree: Master    Type: Thesis
Country: China    Candidate: H T Zhou    Full Text: PDF
GTID: 2518306764976969    Subject: Automation Technology
Abstract/Summary:
Summarization models can help employees summarize dialogue records in an enterprise's customer-complaint management scenario. Even when complaints are numerous, employees can use the generated summaries to quickly route problems to the relevant departments and obtain solutions, so that customers receive timely feedback and customer satisfaction improves. However, model-generated summaries often state facts that are inconsistent with the original text, which makes them unreliable. This thesis studies the factual consistency of dialogue summaries and how to correct factual errors during summary generation, and applies the research results in an enterprise-management software product to realize their practical value. The main contributions are as follows:

(1) Existing studies of factual consistency in model-generated summaries focus mainly on English datasets; research on Chinese datasets, especially dialogue summarization datasets, is very sparse. We first train several baseline models on a Chinese dialogue summarization dataset to obtain the summaries generated by these models. We then analyze and classify the factual errors in these summaries and compile statistics on the different error types. Finally, we construct negative samples from the dataset according to these error types.

(2) Existing summarization models have only a limited ability to detect the facts stated in the original text. We therefore propose an encoder-side contrastive loss function. Using this loss together with the negative samples, the model learns to distinguish correct facts by comparing positive and negative samples, and thus becomes aware of the facts described in the original text. After the encoder produces this fact-aware representation of the dialogue, this thesis further proposes a decoder-side contrastive loss function and a self-supervised task to reduce the probability of generating factually incorrect summaries. With the decoder-side contrastive loss, the model maximizes the probability of generating the reference summary while minimizing the probability of generating the negative samples. The self-supervised task requires the model to identify which speaker each token extracted from the input dialogue comes from, allowing the model to track speaker information. Comparison and ablation experiments show that the contrastive loss functions combined with the self-supervised task effectively reduce factual errors in model-generated summaries, demonstrating the effectiveness of the proposed method (a sketch of these objectives is given below).

(3) To address the difficulty of applying text summarization techniques in specialized professional fields, this thesis uses an unsupervised model to build a dialogue summarization dataset for business-management scenarios and then applies transfer learning to this dataset. The experimental results verify the validity of the transferred model. To realize the practical value of this work, we developed a dialogue summarization service with the fact-aware transferred summarization model at its core, used to summarize customer complaints. Finally, the dialogue summarization service is integrated into a business-management software product to complete the application demonstration.
Keywords/Search Tags: Deep Learning, Text Summarization, Factual Consistency, Contrastive Learning, Model Transfer