The dialogue system is a research hotspot in the field of human-computer interaction, with high practical value and broad application prospects. Dialogue that carries emotion can greatly improve the naturalness and fluency of human-computer interaction. Emotional dialogue generation is a very challenging task within dialogue systems, because plain text alone can hardly represent accurate emotional states and complex contexts. Multi-party dialogue scenarios, in particular, are difficult to represent and model with the traditional Seq2Seq structure. Current research on emotional dialogue generation mainly focuses on responding with a designated emotion, which is inconsistent with real scenarios, and makes little use of contextual and situational information. Most existing models cannot fully perceive personality and contextual background, which leads to problems such as inconsistent content, rigid emotional changes, lack of speaker personality, and low quality of the generated responses. In response to these problems, this thesis carries out the following research work:

1. Multi-source information based on heterogeneous graphs is introduced and used to generate multi-party emotional dialogue without a designated response emotion. A graph structure more naturally represents the information flow of multi-party dialogue and helps the model understand contextual and situational information. At present, research on and applications of heterogeneous graphs in emotional dialogue generation are scarce. This thesis introduces multi-source information, including facial expressions and voice, into a heterogeneous graph and models multi-party dialogue scenes, so that the model can understand contextual and situational information and handle multi-party dialogue scenarios without specifying the response emotion type. The goal is to accurately predict the emotions of different speakers and to generate high-quality emotional responses.

2. A new Graph2Sequence model is designed and proposed. Multi-source information such as dialogue history, voice, facial expression, and personality is represented as nodes and encoded with a heterogeneous graph neural network. HGT is introduced and relative relationship encoding is proposed to help the model dynamically identify different types of nodes and edges while reducing the number of model parameters, so that the model can fully understand the context and the speaker's personality and predict the speaker's emotions accurately. In addition, an attention mechanism is introduced to extract contextual information from the heterogeneous graph, and a personality-affected emotional expression modeling method is proposed. Combined with an improved dynamic GRU, the model generates emotional dialogue that is both more contextually appropriate and more in line with the speaker's personality. MELD and DailyDialog are used as datasets, and sufficient comparative experiments are carried out with multiple evaluation metrics. The experimental results demonstrate that the model predicts the speaker's emotions accurately, and that the generated responses are superior to traditional models in fluency, relevance, diversity, and emotional correlation and accuracy.

3. Furthermore, a multi-modal emotional dialogue system is designed and implemented. Most current dialogue systems only support text-based dialogue and handle neither multi-party dialogue scenarios nor emotional dialogue. In response to these problems, the system combines voiceprint recognition, facial feature extraction, speech recognition, and other technologies; it accepts dialogue video as input, automatically analyzes and recognizes user identity, and extracts and encodes facial expression and voice features. Combined with the model proposed in this thesis, the system predicts appropriate emotions and generates suitably emotional responses. Experiments and tests show that the system achieves the expected results and has good scalability.
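To make the described Graph2Sequence pipeline more concrete, the sketch below outlines one possible skeleton in PyTorch. It is an illustrative assumption rather than the thesis's implementation: the node types, feature dimensions, class and parameter names, and the use of plain multi-head self-attention in place of HGT message passing and relative relationship encoding are simplifications chosen only to show how heterogeneous node features, emotion prediction, and GRU-based response decoding could fit together.

```python
# Minimal illustrative sketch (hypothetical, not the thesis's released code).
# Heterogeneous node features (dialogue history, voice, expression, personality)
# are projected into a shared space, mixed with self-attention as a stand-in for
# HGT message passing, and a GRU decoder generates the response conditioned on
# the responding speaker's aggregated representation.
import torch
import torch.nn as nn

class Graph2SeqSketch(nn.Module):
    def __init__(self, feat_dims, hidden=256, vocab=30000, n_emotions=7):
        super().__init__()
        # One projection per node type, e.g. {"utterance": 768, "voice": 128,
        # "expression": 64, "personality": 16} -- dimensions are assumptions.
        self.proj = nn.ModuleDict({t: nn.Linear(d, hidden) for t, d in feat_dims.items()})
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.emotion_head = nn.Linear(hidden, n_emotions)        # speaker emotion prediction
        self.embed = nn.Embedding(vocab, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)  # response generation
        self.out = nn.Linear(hidden, vocab)

    def forward(self, node_feats, speaker_idx, response_in):
        # node_feats: {node_type: [batch, n_nodes_of_type, feat_dim]}
        h = torch.cat([self.proj[t](x) for t, x in node_feats.items()], dim=1)
        # Self-attention over all nodes approximates graph context extraction.
        ctx, _ = self.attn(h, h, h)
        # Representation of the node belonging to the responding speaker.
        speaker = ctx[torch.arange(ctx.size(0)), speaker_idx]
        emotion_logits = self.emotion_head(speaker)
        # Decode the response with the speaker context as the initial GRU state.
        dec_out, _ = self.decoder(self.embed(response_in),
                                  speaker.unsqueeze(0).contiguous())
        return emotion_logits, self.out(dec_out)
```

In an actual implementation, the self-attention layer would be replaced by heterogeneous graph convolutions (e.g. HGT) with edge-type-aware parameters, and the decoder state would additionally be modulated by the predicted emotion and the personality vector, as described above.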