
Emotion Understanding And Generation With Multimodal Correlation

Posted on: 2022-03-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Y Shen
Full Text: PDF
GTID: 1488306746457694
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the development of multimedia technology, the modalities of network data have become increasingly complex and expressive. Meanwhile, research in artificial intelligence has made great progress in imitating human physiology, while research at the psychological level is still emerging and leaves considerable room for development. If machines could accurately understand and express the emotions in multimodal data, this would not only carry significant interdisciplinary value between psychology and computer science, but would also greatly improve user understanding and content creation in practical applications. At present, related work suffers from defects such as limited dataset generality, weakly correlated multimodal emotional features, and poor robustness and interpretability of multimodal fusion and generation, and therefore cannot meet the needs of complex real-world environments. This thesis therefore studies emotion understanding and generation with multimodal correlation. We establish multiple standard datasets and benchmarks, extract multimodal emotion features, and construct machine learning models under the guidance of psychology. In this way, we capture and enhance the correlation between modalities, achieving better recognition of user emotions, robust reasoning about content emotions, and stronger interpretability in emotion data generation, and providing a research foundation and future directions for related fields. The main contributions of this thesis are summarized as follows:

· We propose a multimodal method for recognizing user depression on social media. To capture multimodal correlations in social media big data and improve the understanding of user emotions, this thesis constructs a high-quality, large-scale dataset of depressed users on social media and extracts multimodal features highly related to emotion under the guidance of psychological theory. We use a multimodal dictionary learning algorithm to obtain a joint sparse representation for emotion classification, which achieves good performance in depression recognition (see the first sketch after this list). In addition, we uncover several typical online behaviors of depressed users at scale.

· We propose a multimodal human emotion reasoning method based on video data. To capture and enhance the correlation between modalities when modal information is incomplete, and to improve the robustness of content emotion understanding, this thesis constructs a large dataset for multimodal human emotion reasoning in videos, which provides person-level emotion annotations under modality absence. We propose a multimodal emotion reasoning model based on the self-attention mechanism; while performing multimodal fusion, it also exploits reasoning cues such as emotional communication and emotional context, achieving good performance on this dataset (see the second sketch below). This work supports the development of more advanced emotion reasoning algorithms.

· We propose a controllable expression video generation method in which the intensity of each frame is self-inferred. To balance robustness, controllability, and interpretability in emotion generation, this thesis proposes an intensity-based video generation method that synthesizes expression videos from a single neutral face, capturing the multimodal correlation between images and videos. The highlight is that the expression intensity of each frame can be inferred automatically during training, avoiding complicated and inaccurate manual intensity labeling (see the third sketch below). In addition, we provide a unified generation model covering multiple expressions. This method facilitates public content creation and enhances the overall vitality of the Internet.
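The first contribution's joint sparse representation can be illustrated with a minimal, hypothetical sketch: modality features are concatenated so that a single learned dictionary yields one shared sparse code per user, which then feeds an ordinary classifier. The data, dimensions, and scikit-learn setup here are assumptions for illustration, not the thesis's actual pipeline.

```python
# A minimal sketch of a joint sparse representation for multimodal
# depression classification; all features and sizes are hypothetical.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical per-user features from two modalities (e.g., text, image).
n_users = 200
text_feats = rng.normal(size=(n_users, 50))
image_feats = rng.normal(size=(n_users, 30))
labels = rng.integers(0, 2, size=n_users)      # 1 = depressed, 0 = control

# Concatenating modalities and learning one dictionary forces each user to
# be encoded by a single sparse code -- a simple stand-in for a "joint"
# multimodal sparse representation.
X = np.hstack([text_feats, image_feats])
dico = DictionaryLearning(n_components=40, alpha=1.0, max_iter=100,
                          transform_algorithm="lasso_lars", random_state=0)
codes = dico.fit_transform(X)                  # joint sparse codes

# The sparse codes feed a standard classifier for depression recognition.
clf = LogisticRegression(max_iter=1000).fit(codes, labels)
print("train accuracy:", clf.score(codes, labels))
```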
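For the second contribution, a minimal sketch of self-attention fusion under modality absence: each modality is projected into a shared token space, and absent modalities are excluded through an attention mask rather than zero-filled. The module names, dimensions, and seven-class output are assumptions; the thesis's actual model, with emotional communication and context cues, is richer.

```python
# A minimal sketch of masked self-attention fusion over modality tokens.
import torch
import torch.nn as nn

class EmotionReasoner(nn.Module):
    """Fuses modality features with self-attention; absent modalities
    are masked out of attention instead of being zero-filled."""
    def __init__(self, modality_dims, d_model=256, n_classes=7):
        super().__init__()
        self.names = list(modality_dims)
        self.proj = nn.ModuleDict({m: nn.Linear(d, d_model)
                                   for m, d in modality_dims.items()})
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, feats, present):
        # feats: {modality: (B, dim)}; present: (B, M) bool mask.
        tokens = torch.stack([self.proj[m](feats[m]) for m in self.names], 1)
        out = self.encoder(tokens, src_key_padding_mask=~present)
        # Mean-pool only over modalities that are actually present.
        mask = present.unsqueeze(-1).float()
        pooled = (out * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        return self.head(pooled)

# Hypothetical usage: "audio" is missing for the second sample.
model = EmotionReasoner({"face": 512, "audio": 128})
feats = {"face": torch.randn(2, 512), "audio": torch.randn(2, 128)}
present = torch.tensor([[True, True], [True, False]])
logits = model(feats, present)   # (2, 7)
```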
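For the third contribution, a minimal sketch of intensity-conditioned generation: during training a small estimator predicts a per-frame intensity from the real target frame, so no manual intensity labels are needed, while the generator renders a frame from a neutral face, an expression label, and that intensity; sweeping the intensity at inference yields a video. All networks here are hypothetical stand-ins, not the thesis's actual architecture.

```python
# A minimal sketch of intensity-conditioned expression generation.
import torch
import torch.nn as nn

class IntensityEstimator(nn.Module):
    """Predicts a scalar intensity in [0, 1] for a target frame, standing in
    for the self-inferred per-frame intensities described above."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, frame):
        return self.net(frame)                 # (B, 1)

class Generator(nn.Module):
    """Renders one frame from a neutral face, an expression label, and a
    continuous intensity -- the knob that makes generation controllable."""
    def __init__(self, n_expressions=6):
        super().__init__()
        self.embed = nn.Embedding(n_expressions, 16)
        self.net = nn.Sequential(
            nn.Conv2d(3 + 17, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, neutral, expression, intensity):
        B, _, H, W = neutral.shape
        cond = torch.cat([self.embed(expression), intensity], dim=1)  # (B, 17)
        cond = cond[:, :, None, None].expand(B, 17, H, W)
        return self.net(torch.cat([neutral, cond], dim=1))

# Training pairs the generator with intensities inferred from real target
# frames; at inference, sweeping intensity from 0 to 1 produces a video.
gen, est = Generator(), IntensityEstimator()
neutral = torch.randn(1, 3, 64, 64)
target = torch.randn(1, 3, 64, 64)
fake = gen(neutral, torch.tensor([2]), est(target))
video = [gen(neutral, torch.tensor([2]), torch.tensor([[t / 9]]))
         for t in range(10)]                   # 10 frames, rising intensity
```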
Keywords/Search Tags:Affective Computing, Multimodal, Video Processing, Social Media, Machine Learning