
Speech Emotion Recognition Modeling Research Based On Deep Learning

Posted on: 2020-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: W He
Full Text: PDF
GTID: 2428330575956408
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of computer technology and the spread of artificial intelligence, speech emotion recognition has received extensive attention from both academia and industry. At present, most emotion recognition tasks rely on manually extracting a wide range of acoustic features, reducing their dimensionality, and constructing feature engineering pipelines to improve recognition results. This paper aims to explore how emotional information is expressed in speech, to understand what varies and what stays invariant in that information, to extract the essential emotional characteristics from speech, and to build the network structure best suited to representing them. Based on these research emphases, the paper comprises the following parts:

1. Study of an emotion recognition network based on traditional speech features. From a large pool of acoustic features, statistical analysis of the available data is used to screen acoustic and statistical descriptors and to build an effective, complete set of emotional features. From a physical standpoint, features that plausibly express emotion are selected and their effectiveness verified. From the standpoint of mathematical statistics, the chi-square test is used to select features, remove redundant information from the feature set, and improve network training efficiency, yielding a complete feature engineering pipeline (a feature-selection sketch follows this list).

2. Study of a deep learning emotion recognition network based on the speech spectrogram. The spectrogram contains almost all of the speech information: its two-dimensional structure reflects excitation-source characteristics such as harmonics and also supports analysis of vocal-tract characteristics such as the cepstrum and formants. Deep neural networks introduce nonlinearity and can learn representations of the input data on their own. A spectrogram-based deep learning emotion recognition network was built: a ResNet with local perception and skip connections was selected and improved at the level of its convolution kernels, and on this basis a ResNet-LSTM network was built to model the temporal sequence of the high-level emotional features learned by the ResNet (see the ResNet-LSTM sketch after this list).

3. Introduction of an attention mechanism to study the fusion of low-level descriptors and high-level semantic information. The set of traditional speech features that represent emotional information is fused with the high-level semantic information that the ResNet-LSTM network learns from the speech signal. The fused features are classified by a DNN, which increases the interpretability of the deep learning model and the scope for human assistance. In addition, an attention mechanism is introduced to explore key-frame information in speech: the learned attention is applied as weight coefficients to the manually extracted low-level descriptor features and used in the feature-fusion experiments (see the attention-fusion sketch after this list).
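The chi-square screening described in part 1 can be illustrated with a short sketch. This is not the thesis's actual pipeline: the feature matrix, feature count, and emotion labels below are synthetic placeholders, and scikit-learn's SelectKBest/chi2 stand in for whatever implementation the author used.

```python
# Hypothetical sketch of chi-square feature selection over utterance-level
# acoustic statistics (e.g. MFCC / pitch / energy functionals); the data
# here is synthetic and only illustrates the selection step.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
X = rng.random((500, 384))          # 500 utterances x 384 statistical features
y = rng.integers(0, 4, size=500)    # 4 emotion classes (e.g. angry/happy/neutral/sad)

# chi2 requires non-negative inputs, so rescale each feature to [0, 1] first
X_scaled = MinMaxScaler().fit_transform(X)

# keep the k features whose chi-square statistic against the emotion label
# is largest, discarding redundant or uninformative descriptors
selector = SelectKBest(score_func=chi2, k=64)
X_selected = selector.fit_transform(X_scaled, y)

print(X_selected.shape)             # (500, 64)
```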
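A minimal sketch of the ResNet-LSTM idea from part 2, assuming a PyTorch implementation: a small ResNet-style CNN encodes the spectrogram, the time axis is kept as a sequence, and an LSTM models the temporal dynamics of the high-level features. The block sizes, spectrogram shape, and number of emotion classes are illustrative assumptions, not the exact network described in the thesis.

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Residual block: two 3x3 convolutions with a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)            # skip connection

class ResNetLSTM(nn.Module):
    def __init__(self, n_classes=4, cnn_channels=32, lstm_hidden=128):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, cnn_channels, 7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(cnn_channels), nn.ReLU(),
            BasicBlock(cnn_channels), BasicBlock(cnn_channels),
        )
        self.lstm = nn.LSTM(cnn_channels, lstm_hidden, batch_first=True)
        self.fc = nn.Linear(lstm_hidden, n_classes)

    def forward(self, spec):                  # spec: (batch, 1, freq, time)
        h = self.stem(spec)                   # (batch, C, freq', time')
        h = h.mean(dim=2)                     # pool the frequency axis -> (batch, C, time')
        h = h.transpose(1, 2)                 # (batch, time', C) as an LSTM sequence
        out, _ = self.lstm(h)
        return self.fc(out[:, -1])            # classify from the last time step

logits = ResNetLSTM()(torch.randn(8, 1, 128, 300))  # 8 spectrograms, 128 bins, 300 frames
print(logits.shape)                                  # torch.Size([8, 4])
```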
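A hedged sketch of the attention-based fusion in part 3: frame-level attention weights are applied to manually extracted low-level descriptors (LLDs), the weighted summary is concatenated with the high-level ResNet-LSTM embedding, and a small DNN classifies the fused vector. All dimensions, the way attention scores are computed, and the module names are assumptions for illustration rather than the thesis's configuration.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, lld_dim=64, deep_dim=128, n_classes=4):
        super().__init__()
        self.attn = nn.Linear(lld_dim, 1)             # one score per frame
        self.classifier = nn.Sequential(              # DNN over the fused vector
            nn.Linear(lld_dim + deep_dim, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, lld_frames, deep_embedding):
        # lld_frames: (batch, time, lld_dim); deep_embedding: (batch, deep_dim)
        scores = self.attn(lld_frames)                   # (batch, time, 1)
        weights = torch.softmax(scores, dim=1)           # emphasise key frames
        lld_summary = (weights * lld_frames).sum(dim=1)  # (batch, lld_dim)
        fused = torch.cat([lld_summary, deep_embedding], dim=1)
        return self.classifier(fused)

logits = AttentionFusion()(torch.randn(8, 300, 64), torch.randn(8, 128))
print(logits.shape)   # torch.Size([8, 4])
```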
Keywords/Search Tags: emotion recognition, emotion feature set, deep learning, attention mechanism