
Research And Application Of Multimodal Emotion Recognition Method Based On Video Data

Posted on: 2023-01-25    Degree: Master    Type: Thesis
Country: China    Candidate: J Y Zhang    Full Text: PDF
GTID: 2558306914977429    Subject: Computer technology
Abstract/Summary:
Emotion is inherent in human beings; human communication and behavior are driven by emotion. With the development of communication technology and social networks, more and more Internet users post videos on social media, and video data now dominates Internet traffic. In recent years, owing to the growing volume of public video data on the network, emotion recognition in video has received extensive attention from both industry and academia, and recognizing emotion in video has become an emerging hotspot in the sentiment analysis field. This technology can help recognize emotion in real-time conversations on social media. Intelligent systems endowed with emotional intelligence can play a significant role in legal trials, interviews, e-health services, and the operation and regulation of online video platforms, and hold great social and commercial promise.

The research on video-based multimodal emotion recognition in this thesis focuses on the following contributions and their implementation:

1) A video emotion recognition model based on a dual-waveform feature extraction module and an attention mechanism is proposed. The model uses a waveform-attention module to capture emotion features from source and synthesized waveforms, and achieves fine-grained multimodal information fusion through an emotion efficacy coefficient mechanism. A novel dialogue emotion detection module models emotion fluctuations across a conversation. Extensive comparison and ablation experiments demonstrate the effectiveness of this model.

2) An English speech ability assessment model based on deep learning is implemented. Building on the dual-waveform conversational emotion recognition algorithm, it processes multimodal data comprising text, speech, and video with a multi-kernel convolutional module equipped with an attention mechanism, combined with an end-to-end language model, to perform speech anxiety analysis, grammatical error detection, and pronunciation accuracy assessment for English learners. In addition, a novel and challenging dataset for English speech ability assessment is constructed and labeled. Extensive comparison and ablation experiments demonstrate the effectiveness and strong performance of this model.

3) A multimodal analysis system for English speech ability is developed. Built on pre-trained deep learning models, the system integrates video uploading, video key-frame extraction, audio-video separation, video feature extraction, audio feature extraction, anxiety sentiment analysis, English speech ability evaluation, and character-specific emotion recognition, together with detailed visualization and analysis of the algorithm's results.
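The abstract names an "emotion efficacy coefficient mechanism" for fine-grained multimodal fusion but does not specify its form. As a minimal sketch, one plausible reading is attention weights over modality features rescaled by per-modality efficacy coefficients; the function and parameter names below (`fuse_modalities`, `efficacy`) are hypothetical illustrations, not the thesis's actual implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_modalities(features, efficacy):
    """Fuse per-modality feature vectors into one representation.

    features : list of (d,) arrays, one per modality (e.g. text, audio, video)
    efficacy : per-modality coefficients that rescale the attention weights
               (a stand-in for the abstract's 'emotion efficacy coefficient').
    """
    feats = np.stack(features)             # (M, d) matrix of modality features
    scores = feats.mean(axis=1)            # toy per-modality relevance score
    attn = softmax(scores)                 # attention distribution over modalities
    weights = attn * np.asarray(efficacy)  # rescale by efficacy coefficients
    weights = weights / weights.sum()      # renormalize to a convex combination
    return weights @ feats                 # (d,) fused feature vector

text  = np.array([0.2, 0.8, 0.1])
audio = np.array([0.5, 0.1, 0.4])
video = np.array([0.3, 0.3, 0.3])
fused = fuse_modalities([text, audio, video], efficacy=[1.0, 0.6, 0.8])
```

Because the weights form a convex combination, the fused vector stays within the range spanned by the individual modality features; a real system would learn the relevance scores and efficacy coefficients rather than fix them.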
Keywords/Search Tags:Emotion recognition, Dialogue emotion, Feature extraction, Multimodal fusion