
The Study Of Dialogue Summarization Based On Knowledge Enhancement

Posted on: 2024-06-15    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L L Zhao    Full Text: PDF
GTID: 1528306944956559    Subject: Information and Communication Engineering
Abstract/Summary:
As one of the most important tasks in natural language processing, text summarization aims to compress a long and complex text into a concise representation. In the era of artificial intelligence, this technology is not only an effective means for people to quickly obtain core information from big data on the Internet, but also an important foundation for many other intelligent systems. In recent years, dialogue has become a main form of everyday communication, and summarizing dialogue content, i.e., dialogue summarization, has attracted growing attention from researchers. Most existing research on text summarization focuses on monologue documents, which typically present their core ideas from a third-person perspective and express the information flow clearly through paragraphs or chapters. Unlike such structured documents, dialogues are often informal, lengthy, and repetitive, exhibiting phenomena such as false starts, reverse guidance, reconfirmations, hesitations, and speaker interruptions; important information is scattered throughout the dialogue, making it difficult for current summarization models to attend to multiple informative utterances. Moreover, informal language, abbreviations, and emoticons pose further challenges for dialogue summarization.

This thesis focuses on abstractive dialogue summarization and draws on topic knowledge, structure knowledge, fact knowledge, dialogue state knowledge, and domain knowledge to study key issues in chit-chat, meeting, and task-oriented dialogue scenarios, achieving results of theoretical significance and application value. The research content of this thesis includes the following.

Topic Knowledge Enhanced Dialogue Summarization Generation: This thesis proposes a topic-word guided attention model over dialogue graphs and designs a topic-word enhanced graph-to-sequence network that exploits the topic knowledge in the dialogue. In this model, a pointer-generator network not only generates new tokens from the vocabulary but also copies existing tokens from the original dialogue, which alleviates the out-of-vocabulary (OOV) problem to some extent. In this way, discussions on different topics can be more easily organized into corresponding information flows. Experimental results show that this method improves the model's focus on topic knowledge as well as the fluency and informativeness of the generated summaries.
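To make the copy mechanism mentioned above concrete, the following is a minimal PyTorch-style sketch of how a pointer-generator decoder mixes a vocabulary distribution with an attention-based copy distribution over dialogue tokens. The function name, tensor shapes, and the omission of extended-vocabulary handling are illustrative assumptions, not the thesis's exact topic-word enhanced graph-to-sequence architecture.

    import torch
    import torch.nn.functional as F

    def pointer_generator_step(vocab_logits, copy_attention, source_token_ids, p_gen):
        """Mix the generation and copy distributions for one decoding step.

        vocab_logits:     (batch, vocab_size) scores over the fixed vocabulary
        copy_attention:   (batch, src_len) attention weights over dialogue tokens
        source_token_ids: (batch, src_len) vocabulary ids of the dialogue tokens
        p_gen:            (batch, 1) probability of generating rather than copying
        """
        vocab_dist = p_gen * F.softmax(vocab_logits, dim=-1)   # generate from the vocabulary
        copy_dist = (1.0 - p_gen) * copy_attention              # copy from the dialogue
        # Add each source token's copy probability onto its vocabulary id, so
        # tokens that occur in the dialogue gain extra probability mass.
        # (Handling truly out-of-vocabulary words via an extended vocabulary
        # is omitted here for brevity.)
        return vocab_dist.scatter_add(-1, source_token_ids, copy_dist)

Because the copy distribution is tied to the source dialogue, rare tokens such as names and numbers that occur in the dialogue can be reproduced verbatim in the summary, which is the sense in which copying mitigates the OOV problem; full OOV support additionally requires the extended-vocabulary bookkeeping omitted above.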
Structure Knowledge Enhanced Dialogue Representation: This thesis proposes a multi-interactive guided dual-copy knowledge network. Exploiting speaker-aware structure knowledge simulates human communication processes and captures cross-utterance dependency relations, while constructing a fact graph by mining the fact-aware structure of the dialogue helps encode existing fact descriptions into the summarization system. Based on these two types of structure knowledge, the model can move beyond a strictly sequential reading of the dialogue and generate accurate event descriptions. Experimental results demonstrate that this method can effectively model and represent dialogue content.

Fact Knowledge Enhanced Factual Consistency Research: This thesis proposes a semantic-slot guided adversarial sequence-to-sequence network. The method designs a slot-level attention mechanism that copies the corresponding slot values from a heterogeneous semantic slot graph and proposes a slot-driven beam search algorithm that prioritizes the generation of salient elements in a controlled manner. The experimental results verify that the model can effectively integrate fragmented event information and generate more faithful summaries in a controllable manner.

Dialogue State Knowledge Enhanced Task-oriented Dialogue Summarization Dataset Construction: This thesis constructs a task-oriented dialogue summarization dataset, TODSum, with the corresponding dialogue state knowledge, and establishes a comprehensive benchmark. Large-scale automatic annotation assisted by small-scale manual annotation is used to iteratively improve the quality of the dataset. In addition, a state-aware factual consistency metric is proposed. Detailed qualitative analysis and experiments demonstrate that TODSum is a high-quality dataset and that dialogue state information is effective for task-oriented dialogue summarization.

Domain Knowledge Enhanced Zero-shot Dialogue Summarization Exploration: This thesis proposes a domain-oriented prefix-tuning method and an adversarial disentangled prompt learning method, taking the lead in exploring fine-tuning methods for domain adaptation in dialogue summarization. Treating domain knowledge as prompts allows it to interact with large-scale pre-trained language models and promotes the disentanglement of knowledge from different domains. In addition, two practical and comprehensive benchmarks are established on the TODSum and QMSum datasets. The experimental results show that these methods have both good domain generalization ability and high robustness.
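The domain-adaptation work above builds on prefix-tuning, in which a small set of trainable prefix vectors is attached to a frozen pre-trained model and only those vectors are updated for each domain. The sketch below illustrates this general idea in PyTorch: a module that produces per-layer key/value prefixes which could be prepended to the attention of a frozen summarization backbone. The class name, shapes, prefix length, and the reparameterization MLP are assumptions for illustration; the thesis's domain-oriented and adversarially disentangled variants add further structure that is not reproduced here.

    import torch
    import torch.nn as nn

    class DomainPrefix(nn.Module):
        """Trainable key/value prefixes for one target domain (illustrative sketch)."""

        def __init__(self, num_layers, num_heads, head_dim, prefix_len=20, hidden=512):
            super().__init__()
            self.prefix_len = prefix_len
            self.num_layers = num_layers
            self.num_heads = num_heads
            self.head_dim = head_dim
            # Learnable prefix embeddings, reparameterized through a small MLP
            # (a common trick for stabilizing prefix-tuning optimization).
            self.prefix_tokens = nn.Parameter(torch.randn(prefix_len, hidden))
            self.project = nn.Sequential(
                nn.Linear(hidden, hidden),
                nn.Tanh(),
                nn.Linear(hidden, num_layers * 2 * num_heads * head_dim),
            )

        def forward(self, batch_size):
            # (prefix_len, num_layers * 2 * num_heads * head_dim)
            prefix = self.project(self.prefix_tokens)
            prefix = prefix.view(self.prefix_len, self.num_layers, 2,
                                 self.num_heads, self.head_dim)
            # Rearrange to one (key, value) pair per layer, each shaped
            # (batch, num_heads, prefix_len, head_dim).
            prefix = prefix.permute(1, 2, 3, 0, 4).unsqueeze(2)
            prefix = prefix.expand(-1, -1, batch_size, -1, -1, -1)
            return [(layer[0], layer[1]) for layer in prefix]

In use, only these prefix parameters would be optimized for each target domain while the backbone summarizer stays frozen, which is what makes prefix-style methods attractive for the zero-shot and low-resource domain adaptation studied in the thesis.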
Keywords/Search Tags:natural language processing, dialogue summarization, topic knowledge, structure knowledge, fact knowledge, dialogue state knowledge, domain knowledge