With the rise of Artificial Intelligence Generated Content, language models such as ChatGPT have pioneered a new era of human-computer interaction, significantly improving the efficiency with which users acquire information. In real-life human-computer dialogue, however, people need not only precise and comprehensive information retrieval but also smoother, more natural emotional interaction. Emotion-supported dialogue systems have therefore become a key research topic in natural language processing. Such a system generally comprises three main processes: emotional dialogue semantic understanding, dialogue emotional state management, and emotional response generation. Among them, multi-perspective, deep-level semantic understanding of emotional dialogue is the first step toward a more intelligent and humanized emotional dialogue system, and it serves as the basis and prerequisite for the other two processes. However, complex topic interaction, variable emotional dynamics, and blurred boundaries of emotional cause spans pose great challenges, and opportunities, for the semantic understanding of emotional dialogue. Moreover, previous work has largely ignored the semantic structure of natural language itself, which can aid the understanding of semantic content. To this end, this dissertation explores structure-enhanced semantic understanding of emotional dialogue from the perspectives of discourse structure, syntactic structure, and multi-level structure, respectively, as follows:

(1) Discourse structure-enhanced dialogue topic segmentation. Previous work has mostly considered linear semantic interaction between adjacent utterances, ignoring the influence of non-linear discourse structure on semantic interaction in dialogue. This dissertation therefore proposes a discourse structure-enhanced method for composing the dialogue sequence into a graph. Meanwhile, to address the lack of pronounced similarities and differences between utterance representations produced by pre-trained model encoders, this dissertation further models the resulting graph with graph contrastive learning to accurately identify the boundary utterances of topic segments. Experiments on two public datasets show that discourse structure helps the model understand interleaved, complex dialogue topology, and that discourse structure-enhanced graph contrastive learning improves the adaptability and relevance of utterance representations to the topic segmentation task.

(2) Syntactic structure-enhanced conversational emotion recognition. To tackle the insufficient extraction of an utterance's deep implicit topic features in previous work, this dissertation constructs a syntactic structure-enhanced variational graph auto-encoder layer based on the dependency syntax tree of each utterance. Furthermore, to fully model the dynamic interaction and joint evolution of speaker, topic, and emotion, this dissertation proposes a speaker- and topic-aware dynamic interaction layer based on graph attention networks. Experiments on two public datasets show that syntactic structure helps the model extract deep implicit topic features, and that modeling each individual speaker's emotion variability jointly with topic dynamics effectively improves the prediction of conversational emotion categories.

(3) Multi-level structure-enhanced conversational emotion cause span extraction. To address utterance localization errors and boundary recognition errors, this dissertation uses graph attention networks to construct and model dialogues at the token level and the utterance level, drawing on discourse-level coreference structure, sentence-level dependency syntax, and dialogue content representations. Furthermore, to obtain the final structure-enhanced semantic representation for causal span extraction, the two levels are interacted and fused through a biaffine mechanism. Experiments on two public datasets show that discourse-level coreference structure improves the model's localization of the utterance containing the causal span, while sentence-level dependency syntax improves the model's identification of causal span boundaries. Meanwhile, the model is also effectively compatible with utterance-level emotion cause entailment.
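The graph contrastive objective behind contribution (1) can be illustrated with a minimal sketch. This assumes an InfoNCE-style loss in which utterances linked in the discourse graph are treated as positives; the function names, toy embeddings, and hyperparameters are illustrative, not the dissertation's actual implementation:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two utterance embedding vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def graph_contrastive_loss(H, pos_pairs, temperature=0.1):
    """InfoNCE-style loss over utterance embeddings H (n x d).
    pos_pairs: (i, j) index pairs connected in the discourse graph."""
    n = H.shape[0]
    loss = 0.0
    for i, j in pos_pairs:
        pos = cosine_sim(H[i], H[j])
        sims = np.array([cosine_sim(H[i], H[k]) for k in range(n) if k != i])
        loss += -np.log(np.exp(pos / temperature)
                        / np.exp(sims / temperature).sum())
    return loss / len(pos_pairs)

# Toy check: two topic clusters of three utterances each.
rng = np.random.default_rng(0)
H = np.vstack([rng.normal(0, 0.1, (3, 8)) + 1.0,   # topic A
               rng.normal(0, 0.1, (3, 8)) - 1.0])  # topic B
loss_good = graph_contrastive_loss(H, [(0, 1), (3, 4)])  # within-topic positives
loss_bad = graph_contrastive_loss(H, [(0, 3), (1, 4)])   # cross-topic "positives"
```

Pulling discourse-linked utterances together while pushing others apart is what sharpens the similarity contrast at topic boundaries; in the toy check, within-topic positives yield a lower loss than cross-topic ones.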
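The variational graph auto-encoder layer in contribution (2) follows the general VGAE recipe of Kipf and Welling: encode node (token) features over the dependency tree into a latent distribution, sample with the reparameterization trick, and decode edge probabilities by inner product. The single-step GCN propagation, weight shapes, and toy tree below are assumptions for the sketch:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def vgae_forward(A, X, W_mu, W_logvar, rng):
    """Encode token features X over dependency adjacency A; decode edges."""
    A_norm = normalize_adj(A)
    mu = A_norm @ X @ W_mu            # one GCN propagation step for the mean
    logvar = A_norm @ X @ W_logvar    # and one for the log-variance
    eps = rng.standard_normal(mu.shape)
    Z = mu + np.exp(0.5 * logvar) * eps           # reparameterization trick
    A_rec = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))      # inner-product decoder
    return Z, A_rec

# Toy dependency tree over 4 tokens: root 0, edges (0,1), (0,2), (2,3).
A = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5))                   # 4 tokens, 5 features each
W_mu, W_logvar = rng.standard_normal((5, 3)), rng.standard_normal((5, 3))
Z, A_rec = vgae_forward(A, X, W_mu, W_logvar, rng)
```

Training such a layer to reconstruct the dependency edges forces the latent token codes to carry the syntax-conditioned regularities from which implicit topic features are then drawn.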
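The biaffine fusion in contribution (3) can be sketched as a standard biaffine scorer in the style of Dozat and Manning, combining a token-level vector with an utterance-level vector through a bilinear term plus a linear term; the dimensions and parameter names are assumptions for illustration:

```python
import numpy as np

def biaffine(h_tok, h_utt, U, W, b):
    """score = h_tok^T U h_utt + W [h_tok; h_utt] + b.
    Fuses a token-level and an utterance-level representation into one score."""
    bilinear = h_tok @ U @ h_utt
    linear = W @ np.concatenate([h_tok, h_utt])
    return bilinear + linear + b

rng = np.random.default_rng(1)
d_tok, d_utt = 4, 6                       # illustrative hidden sizes
U = rng.standard_normal((d_tok, d_utt))   # bilinear interaction matrix
W = rng.standard_normal(d_tok + d_utt)    # linear weights over the concatenation
b = 0.5
score = biaffine(rng.standard_normal(d_tok), rng.standard_normal(d_utt), U, W, b)
```

Scoring each token against the utterance-level representation in this way is one plausible route from the two graph levels to per-token boundary decisions for the causal span.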