
Research On Humor Recognition Based On Multimodal Fusion

Posted on: 2022-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: J M Wu
Full Text: PDF
GTID: 2518306509984629
Subject: Computer Science and Technology
Abstract/Summary:
Humor is a special form of expression in human communication. It can create a relaxed atmosphere and promote communication between people. Humor is full of wisdom and creativity; studying the mechanisms of humor and using computers to model, recognize, and generate it helps computers simulate human cognition, which is very important to the development of artificial intelligence. In recent years there has been much research on text-based humor recognition, but with the development of social media, the object of humor recognition is no longer limited to text: multimodal information such as audio and video also contains plenty of humor. Multimodal humor recognition has therefore gradually become a hot topic in this research area. It requires extracting both the independent information within each single modality and the interactive information between different modalities, which makes it a very challenging task. This thesis focuses on humor recognition based on the fusion of three modalities: text, audio, and video.

First, to address the shortage of public datasets for multimodal humor recognition, we construct a multilingual multimodal humor recognition dataset. This thesis introduces the process of data collection, data processing, and data annotation in detail, and calculates the consistency of the annotation process (inter-annotator agreement). The data analysis presents the main statistics and the distribution of the dataset. This thesis also compares the dataset with existing multimodal humor datasets and discusses its prospective applications. The experiments in the subsequent chapters are carried out on both a public dataset and this self-built dataset.

Second, we propose a multimodal fusion method based on an attention mechanism. In this model, different neural network structures are designed for intra-sentence modeling to obtain monomodal sentence representations for each modality. A hierarchical attention mechanism then encodes the feature sequences of the modalities, learning both the contextual information surrounding each utterance and the interaction information between modalities. This method integrates multimodal information better in paragraphs that contain context and achieves results that exceed the baseline methods. Ablation experiments show that both the fusion of multimodal features and the introduction of context improve humor recognition.

Finally, we apply multi-task learning to multimodal fusion by treating the information of different modalities as different tasks. This method uses modality-independent networks to extract the internal features of each single modality and designs a parameter-sharing module to learn the interactive information between modalities. It obtains good recognition results while keeping the number of model parameters limited. In addition, this thesis carries out joint multi-task learning of humor and emotion on the self-built dataset: the model fuses multimodal information through a parameter-sharing layer, and each task then uses its own attention module to integrate context information for its respective classification, emotion or humor. Experimental results show that multi-task learning improves humor recognition more than directly introducing emotional features does.
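For concreteness, the annotation-consistency check mentioned above can be computed with a standard agreement statistic. The abstract does not name the metric the thesis uses, so the following minimal sketch assumes two annotators with binary humor labels and uses Cohen's kappa; all data in it are hypothetical.

```python
# Illustrative only: the thesis does not specify its agreement metric;
# Cohen's kappa is one common choice for two annotators on binary labels.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items where the annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label marginals.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical humor labels (1 = humorous) from two annotators.
ann1 = [1, 0, 1, 1, 0, 0, 1, 0]
ann2 = [1, 0, 1, 0, 0, 1, 1, 0]
print(f"kappa = {cohens_kappa(ann1, ann2):.3f}")
```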
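The attention-based fusion model described above can be sketched as follows. This is a minimal illustration of the idea, not the thesis' exact architecture: the GRU sentence encoders, the use of nn.MultiheadAttention for both attention levels, and all layer sizes are assumptions made for the sake of a runnable example.

```python
# A minimal sketch of hierarchical attention fusion over three modalities.
# Layer choices and sizes are assumptions, not the thesis' actual design.
import torch
import torch.nn as nn

class HierarchicalFusion(nn.Module):
    def __init__(self, text_dim, audio_dim, video_dim, hidden=128, heads=4):
        super().__init__()
        # Intra-sentence modeling: one encoder per modality maps a
        # variable-length feature sequence to a fixed sentence vector.
        self.text_enc = nn.GRU(text_dim, hidden, batch_first=True)
        self.audio_enc = nn.GRU(audio_dim, hidden, batch_first=True)
        self.video_enc = nn.GRU(video_dim, hidden, batch_first=True)
        # Level 1: self-attention over the context utterances of a modality.
        self.ctx_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Level 2: cross-modal attention fuses the per-modality summaries.
        self.modal_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.classifier = nn.Linear(hidden, 2)  # humorous / not humorous

    def encode(self, enc, seq):
        _, h = enc(seq)          # h: (1, batch, hidden)
        return h.squeeze(0)      # fixed-size sentence representation

    def forward(self, text, audio, video):
        # Each input: (batch, num_utterances, seq_len, feat_dim); we fold
        # utterances into the batch axis for intra-sentence encoding.
        b, u = text.shape[:2]
        reps = []
        for enc, x in ((self.text_enc, text),
                       (self.audio_enc, audio),
                       (self.video_enc, video)):
            flat = x.reshape(b * u, x.shape[2], x.shape[3])
            sent = self.encode(enc, flat).reshape(b, u, -1)
            ctx, _ = self.ctx_attn(sent, sent, sent)  # contextualize utterances
            reps.append(ctx[:, -1])  # summary of the target (last) utterance
        stack = torch.stack(reps, dim=1)      # (batch, 3 modalities, hidden)
        fused, _ = self.modal_attn(stack, stack, stack)
        return self.classifier(fused.mean(dim=1))
```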
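The multi-task view of fusion can likewise be sketched with private per-modality encoders, one parameter-sharing module, and one classification head per modality-task. Again this is a simplified assumed design, and the feature dimensions in the usage comment are hypothetical.

```python
# A minimal sketch of treating each modality as a task: private encoders
# plus a shared layer that learns cross-modal information. Illustrative
# only; not the thesis' exact design.
import torch
import torch.nn as nn

class MultiTaskFusion(nn.Module):
    def __init__(self, dims, hidden=64):
        super().__init__()
        # Private, modality-independent networks extract intra-modal features.
        self.private = nn.ModuleDict(
            {m: nn.Linear(d, hidden) for m, d in dims.items()})
        # Parameter-sharing module: the same weights process every modality,
        # forcing it to learn information that transfers across modalities.
        self.shared = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # One lightweight classification head per modality-task.
        self.heads = nn.ModuleDict({m: nn.Linear(hidden, 2) for m in dims})

    def forward(self, feats):
        # feats maps modality name -> (batch, feat_dim) tensor.
        return {m: self.heads[m](self.shared(torch.relu(self.private[m](x))))
                for m, x in feats.items()}

# Hypothetical feature dimensions; training would sum the per-task losses:
# model = MultiTaskFusion({"text": 300, "audio": 74, "video": 35})
# loss = sum(F.cross_entropy(logits, labels)
#            for logits in model(feats).values())
```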
Keywords/Search Tags: Humor Detection, Multimodal, Attention Mechanism, Multi-task Learning