
Application Research Of Emotion Recognition Based On Deep Learning

Posted on: 2021-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: W J Li
Full Text: PDF
GTID: 2518306470960929
Subject: Electronics and Communications Engineering
Abstract/Summary:
Emotion recognition is an important research topic in the field of human-computer interaction, with applications in education, medical care, safe driving, game development, and other fields. Facial expressions and speech are the two most important channels of human emotion expression, accounting for roughly 55% and 38% of emotional information respectively. Early emotion recognition mainly extracted hand-designed features and then applied traditional machine learning methods to recognize them. However, as requirements for recognition accuracy and robustness have risen with the development of computer technology, traditional machine learning methods have shown their limitations. In recent years, deep learning has performed excellently in many fields, and most current emotion recognition research is based on it.

Emotion recognition based on deep learning usually uses ordinary convolutional neural networks (CNNs), but an ordinary CNN has too many parameters and does not consider the sparse character of emotional information: different parts of a facial expression contribute different amounts of emotional information, as do different time periods of a speech signal, so a traditional CNN is inefficient. Current research covers both single-modal and multi-modal emotion recognition, but existing multi-modal emotion databases are mostly recorded under ideal laboratory conditions and are not suitable for emotion recognition in real-world scenarios; moreover, multi-modal models are usually very large, which makes recognition too time-consuming for a real-time emotion recognition system or for deployment on low-end computers. This thesis therefore focuses on single-modal emotion recognition, studying facial expression recognition and speech emotion recognition separately. The main contents of the work are as follows:

(1) For facial expression recognition, to address the problems that an ordinary CNN has too many parameters and cannot attend to the different contributions of emotional information from different parts of the face, this thesis proposes the SE-Mini-Xception model: the original Xception network is trimmed in depth and combined with an attention module (the SE block), yielding a lightweight convolutional neural network with an attention mechanism. SE-Mini-Xception was verified on the public in-the-wild facial expression databases FERPlus and RAF-DB, achieving recognition accuracies of 82.43% and 84.35% respectively, only 2% to 3% lower than the original Xception model. The Xception model is 239 MB, while SE-Mini-Xception is only 2.71 MB, a large reduction in parameter count. Experiments show that by using separable convolutions and an attention mechanism, SE-Mini-Xception greatly reduces the model size with little loss of performance and can be applied effectively to facial expression recognition.

(2) For speech emotion recognition, to address the problem that an ordinary CNN cannot handle time-series features effectively, this thesis introduces separable convolution and the long short-term memory network (LSTM) and designs the Sep-CNN-LSTM model for speech emotion recognition. Experiments were conducted on the public speech emotion corpus RAVDESS. The raw speech is first processed with endpoint detection and filter-based denoising to obtain valid speech segments, and features are then extracted for recognition. The 1D Sep-CNN-LSTM model trained on Mel-frequency cepstral coefficient (MFCC) features and the 2D Sep-CNN-LSTM model trained on spectrogram features achieved 90.77% and 82.21% recognition accuracy on the test set respectively. Experiments show that the Sep-CNN-LSTM model can be applied effectively to speech emotion recognition.

(3) Based on the SE-Mini-Xception and 1D Sep-CNN-LSTM models proposed in this thesis, a real-time facial expression recognition system and a speech emotion recognition system were designed and implemented, and deployed on a Jetson Nano. Testing shows that both systems can meet basic emotion recognition tasks.
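The SE block used in SE-Mini-Xception follows the squeeze-excite-scale pattern: globally pool each channel, pass the result through a small bottleneck network, and use the sigmoid output to reweight the channels. The following is a minimal NumPy sketch of that pattern only, not the thesis's actual implementation; the reduction ratio, shapes, and random weights are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feature_map, w1, w2):
    """Squeeze-and-excitation: reweight channels by global context.

    feature_map: (H, W, C) activations from a convolutional layer.
    w1: (C, C // r) squeeze weights; w2: (C // r, C) excite weights.
    (Illustrative sketch; biases omitted.)
    """
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    z = feature_map.mean(axis=(0, 1))
    # Excite: bottleneck MLP, ReLU then sigmoid -> per-channel gates in (0, 1)
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)
    # Scale: multiply each channel of the input by its gate
    return feature_map * s  # broadcasts over (H, W, C)

rng = np.random.default_rng(0)
C, r = 16, 4  # channel count and reduction ratio (assumed values)
x = rng.standard_normal((8, 8, C))
out = se_block(x, rng.standard_normal((C, C // r)), rng.standard_normal((C // r, C)))
print(out.shape)  # (8, 8, 16)
```

Because every gate lies strictly between 0 and 1, the block can only attenuate channels, which is how it focuses the network on the more informative facial regions.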
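The parameter savings from the separable convolutions used in both models come from factoring a standard convolution into a per-channel (depthwise) filter followed by a 1x1 (pointwise) mixing step. A small back-of-the-envelope calculation, with hypothetical channel counts not taken from the thesis, shows the reduction:

```python
def conv_param_counts(c_in, c_out, k):
    """Weight counts (biases ignored) for a k x k convolution layer."""
    standard = k * k * c_in * c_out          # full convolution kernel
    separable = k * k * c_in + c_in * c_out  # depthwise + 1x1 pointwise
    return standard, separable

# Example layer: 128 input channels, 256 output channels, 3x3 kernel
std, sep = conv_param_counts(128, 256, 3)
print(std, sep)            # 294912 33920
print(round(std / sep, 1))  # 8.7x fewer weights
```

The same factoring applies in 1D for the Sep-CNN-LSTM speech model, which is why both networks stay small enough for real-time use on a Jetson Nano.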
Keywords/Search Tags:Facial Expression Recognition, Speech Emotion Recognition, Separable Convolution, Attention Mechanism, LSTM