
Multimodal Emotion Recognition Based On S-ELM-LUPI Paradigm

Posted on: 2019-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: MICHELE
Full Text: PDF
GTID: 1318330542953259
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, multimodal emotion recognition has emerged as a new research domain in the field of human-computer interaction. The abundance of information and smart computing devices in the present digital era calls for more natural interaction between these devices and their human users. The rapid evolution of technology has led to new devices with social capabilities, including devices that perform emotion recognition in socially oriented roles. Automatic emotion recognition using multiple modalities improves on unimodal recognition. However, combining multiple features makes the learning process more complex and harder to apply to real-time problems: the growing number of features introduces the curse of dimensionality and increases execution time. In addition, socially related learning problems, especially emotion recognition, require many features because of individual human differences, which in turn consumes a large amount of storage memory.

This dissertation focuses on feature extraction and data fusion to improve recognition accuracy and reduce storage memory in audio-visual emotion recognition. It addresses current issues in multimodal information fusion, including: 1) Which cues from a human being tend to convey the most informative features about affective states? 2) What is the best way to combine modalities to obtain the best recognition performance? 3) Which approach best reduces recognition complexity? 4) How can the recognition rate be improved while reducing storage memory consumption?

The main contributions and innovations of this dissertation are as follows:

(1) A method named Sparse Extreme Learning Machine-Learning Using Privileged Information (S-ELM-LUPI) is proposed. It inherits the high speed and generalization ability of the Extreme Learning Machine, combines it with the faster recognition enabled by Learning Using Privileged Information, and retains the memory savings of the Sparse Extreme Learning Machine. It improves on traditional learning methods that use only examples and targets by introducing the role of a teacher who provides additional information to enhance recognition (testing) without complicating the learning process. The proposed method is tested on publicly available datasets and yields promising results.

(2) Unimodal audio and facial emotion recognition using multiple features with a semi-serial fusion method is proposed. The study analyzes the impact of feature combinations on recognition performance. Using the S-ELM-LUPI method, it examines learning with a single modality while employing multiple feature types, exploiting several feature extraction methods to obtain different and complementary features. The results show that Learning Using Privileged Information in the unimodal case is effective when the feature size is considerable. Compared with methods that use a single feature type or simply concatenate features, the new method achieves better recognition accuracy, shorter execution time, and greater stability.

(3) A new fusion method of audio-visual modalities for emotion recognition is proposed, applying the semi-serial fusion principle and exploiting the Learning Using Privileged Information method.
This method treats one modality as the standard information source and the other as the privileged information source. The obtained results show that the proposed method is applicable to multimodal emotion recognition: execution time is reduced to less than a millisecond for hundreds of samples, and the sparsity of the proposed method allows an economical use of storage memory. Compared with other machine learning methods, the proposed method is more accurate and more stable.

Lastly, this work proposes fusion based on an incremental integration of the Learning Using Privileged Information paradigm and the Sparse Extreme Learning Machine. The new semi-serial combination in unimodal and multimodal audio-visual emotion recognition is studied to assess the simultaneous improvement of the recognition rate and reduction of storage memory. Experimental results show that, compared with serial fusion methods, the proposed method reduces storage memory requirements and offers a simpler learning procedure, making it applicable to real-time and real-life problems.
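To make the setting concrete, the following minimal sketch illustrates the general idea of combining an Extreme Learning Machine with privileged information that is available only at training time. It is not the dissertation's S-ELM-LUPI formulation (which additionally enforces sparsity and uses a semi-serial fusion scheme); instead it uses a plain regularized ELM and a distillation-style teacher-student arrangement as a stand-in for LUPI. All function names, feature dimensions, and the toy data (audio as the standard modality, visual as the privileged one) are illustrative assumptions.

```python
import numpy as np

def elm_features(X, W, b):
    """Random hidden-layer projection with a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fit_output_weights(H, T, reg=1e-2):
    """Regularized least-squares solution for the ELM output weights."""
    return np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ T)

def train_elm_lupi(X_std, X_priv, y, n_hidden=200, alpha=0.5, seed=0):
    """Train on standard features; privileged features are used only
    during training, via a teacher whose soft scores guide the student."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    T = np.eye(n_classes)[y]                       # one-hot targets

    # Teacher: sees the privileged modality (training set only).
    Wp = rng.standard_normal((X_priv.shape[1], n_hidden))
    bp = rng.standard_normal(n_hidden)
    Hp = elm_features(X_priv, Wp, bp)
    soft = Hp @ fit_output_weights(Hp, T)          # teacher's soft scores

    # Student: sees only the standard modality and fits a blend of
    # hard labels and the teacher's soft scores.
    Ws = rng.standard_normal((X_std.shape[1], n_hidden))
    bs = rng.standard_normal(n_hidden)
    Hs = elm_features(X_std, Ws, bs)
    beta = fit_output_weights(Hs, alpha * T + (1 - alpha) * soft)
    return Ws, bs, beta

def predict(X_std, model):
    """At test time only the standard modality is required."""
    Ws, bs, beta = model
    return np.argmax(elm_features(X_std, Ws, bs) @ beta, axis=1)

# Toy usage (hypothetical dimensions): audio features as the standard
# source, visual features as the privileged source, six emotion classes.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_audio = rng.standard_normal((300, 40))       # e.g. prosodic/spectral features
    X_visual = rng.standard_normal((300, 60))      # e.g. facial geometry features
    y = rng.integers(0, 6, size=300)
    model = train_elm_lupi(X_audio, X_visual, y)
    print(predict(X_audio[:5], model))
```

The sketch mirrors the arrangement described above only in outline: one modality acts as the standard information source at both training and test time, while the other contributes only during training, and the closed-form ELM output layer keeps training fast.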
Keywords/Search Tags:Multimodal Emotion Recognition, Sparse Extreme Learning Machine, Learning Using Privileged Information, Human Computer Interaction, Neural Networks