Research On Speech Emotion Recognition Based On Improved Convolutional Neural Network

Posted on:2021-04-17

Degree:Master

Type:Thesis

Country:China

Candidate:Z G Xia

Full Text:PDF

GTID:2428330614959284

Subject:Industrial engineering

Abstract/Summary:

Speech emotion recognition is an important branch of artificial intelligence and is generally regarded as one of the key ways to realize intelligent human-computer interaction. It has been widely used in intelligent dialogue systems, public opinion monitoring, service robots, and other fields. In recent years, with the rapid development of deep learning, applying deep learning to speech emotion recognition has become an active and effective line of research, and the Convolutional Neural Network (CNN) in particular has quickly become one of the main models studied for speech emotion recognition. However, CNN-based approaches still have problems worth investigating. First, studies that use a CNN as the recognition model usually take the spectrogram as the input feature, but the details of the spectrogram are not distinct, which leads to low recognition accuracy. Second, a CNN loses feature information as its convolutional layers deepen, which is a key factor limiting further improvement of the recognition rate. In response to these problems, this thesis carries out the following research and experiments.

First, to address the indistinct details of the spectrogram, this thesis designs a spectrogram texture feature extraction algorithm based on Log-Gabor and an improved Local Binary Pattern (LBP). The algorithm first uses Log-Gabor filters to amplify the detail information of the spectrogram at five scales and in eight directions, then uses the improved LBP to extract texture features from the spectrogram at each scale and direction, and finally reconstructs the extracted texture features into the final features. The extracted features are also compared with Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Cepstral Coefficients (LPCC). The experimental results show that this method can effectively improve the emotion recognition rate.

Second, to address the loss of features as CNN models deepen their convolutional layers, this thesis designs a multi-level residual Convolutional Neural Network. The network uses a residual structure that can span multiple levels of convolutional layers to compensate for the missing features, improving network performance by restoring the original feature information and thereby improving the recognition rate. The experimental results show that the proposed model achieves a better recognition rate, convergence speed, and classification accuracy on the EmoDB and CASIA datasets than the methods in the references.

Finally, this thesis develops a speech emotion recognition system based on a Jetson Nano host computer and an intelligent service robot, and applies the Log-Gabor and improved LBP spectrogram feature algorithm and the multi-level residual convolutional neural network to this system. The experimental results demonstrate the superiority of the algorithm and the practicality of the speech emotion recognition system.
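To make the texture-feature step more concrete, the following is a minimal sketch of the classic 8-neighbour Local Binary Pattern, not the improved LBP variant designed in the thesis, whose details the abstract does not give. The Log-Gabor filtering stage is omitted. The function names `lbp_codes` and `lbp_histogram` and the toy input patch are assumptions made for illustration only; in practice the input would be a spectrogram (or one Log-Gabor response of it) stored as a 2D array of magnitudes.

```python
def lbp_codes(image):
    """Return the basic 8-neighbour LBP code of every interior pixel.

    `image` is a 2D list of numbers (hypothetical stand-in for a spectrogram).
    Border pixels are skipped because they lack a full 3x3 neighbourhood.
    """
    # Neighbour offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Return the normalised 256-bin histogram of LBP codes (the texture feature)."""
    hist = [0] * 256
    total = 0
    for row in codes:
        for code in row:
            hist[code] += 1
            total += 1
    if total == 0:
        return [0.0] * 256
    return [count / total for count in hist]


# Toy 3x3 patch: only the four neighbours larger than the centre (6, 9, 8, 7) set bits.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(lbp_codes(patch))  # [[120]]
```

In a pipeline like the one the abstract describes, one such histogram would presumably be computed per scale and direction and the results combined into the final feature, but how the thesis reconstructs them is not stated here.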

Keywords/Search Tags:

speech emotion recognition, CNN, spectrogram, residual network, LBP texture feature

Related items

1	Research On Speech Emotion Recognition Technology Based On Deep Learning
2	Research On Emotion Recognition Technology Based On Speech Information
3	The Speech Emotion Recognition Research Based On Speech Spectrogram And Convolutional Neural Network
4	Research On Speech Emotion Recognition And Its Application In The Service Robot
5	Research On Texture Feature Extraction Of Spectrogram Image For Speech Emotion Recognition
6	Research And Implementation Of Speech Emotion Recognition Algorithm Based On Spectrogram
7	Research On Speech Emotion Recognition Based On Convolution Neural Network Feature Optimization
8	Speech Emotion Recognition Based On Spectrogram And Neural Network
9	Research On Key Technologies Of Speech Emotion Recognition
10	Research On Speech Emotion Recognition Based On The Two-layer CNN-LSTM