Research On Speech Emotion Recognition Method Based On Hybrid Neural Network

Posted on: 2022-07-11    Degree: Master    Type: Thesis
Country: China    Candidate: Z Y Yu    Full Text: PDF
GTID: 2518306548961159    Subject: Master of Engineering
Abstract/Summary:
With the wide application of human-computer interaction in daily life, research on speech signals has become increasingly important. Speech signals carry not only basic semantic information but also implicit emotional states. Speech emotion recognition, one of the key technologies of speech signal processing, has therefore attracted growing attention from researchers. To improve the accuracy of speech emotion recognition and address the complexity of speech emotion features, this thesis improves on existing deep learning models. The main work and innovations are as follows:

(1) The long short-term memory (LSTM) network tends to lose feature information from earlier parts of the sequence when computing the current cell state. The LSTM is therefore improved: the previous cell state is linked to the forget gate and the input gate as a peephole connection, so that the previous cell state enters the gating computation and the current information state remains complete. The improved cell is further integrated with a self-attention mechanism.

(2) In the multi-head attention mechanism, after the speech features are projected into low-dimensional subspaces, the parameters computed by the independent heads differ from the joint distribution of the full attention mechanism and are difficult to make approximate it, which limits the expressive power of the subsequent model. The multi-head attention mechanism is therefore improved: the low-rank similarity maps of the individual heads are superimposed, connecting the originally independent sub-attention mechanisms, and the result is then normalized to compute the final feature representation.

(3) Since a single deep learning model cannot accurately identify the emotional categories of speech, this thesis proposes a dual-channel network model in which a convolutional neural network extracts spatial features from spectrogram images and a bidirectional LSTM extracts temporal features. To further select high-importance feature vectors and fuse the two streams, the improved multi-head attention mechanism computes attention over the feature-extraction results of the two channels; a fully connected operation follows, and a classification layer produces the output.

To verify the algorithms, the two proposed models are compared and tested on the EMO-DB and IEMOCAP data sets. Traditional neural network models and existing models with strong reported performance are selected as baselines for the comparison experiments, and ablation experiments are used to verify the effectiveness of each innovation. The experimental results show that the proposed speech emotion recognition models achieve higher accuracy than the comparison models on both data sets and that the ablation results likewise exceed the comparison models, verifying the effectiveness of the innovations and demonstrating that the proposed models achieve better results in speech emotion recognition.
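As a concrete illustration of innovation (1), the sketch below implements a peephole-style LSTM cell in PyTorch in which the previous cell state enters the forget-gate and input-gate computations. The class name, layer sizes, and exact wiring are assumptions for illustration, not the thesis code.

    # Minimal sketch: LSTM cell with the previous cell state c_{t-1}
    # wired into the forget and input gates as a peephole, so earlier
    # sequence information participates in the gating decision.
    import torch
    import torch.nn as nn

    class PeepholeLSTMCell(nn.Module):
        def __init__(self, input_size: int, hidden_size: int):
            super().__init__()
            # Gate projections over the concatenation [x_t, h_{t-1}]
            self.W_f = nn.Linear(input_size + hidden_size, hidden_size)
            self.W_i = nn.Linear(input_size + hidden_size, hidden_size)
            self.W_o = nn.Linear(input_size + hidden_size, hidden_size)
            self.W_c = nn.Linear(input_size + hidden_size, hidden_size)
            # Peephole weights: elementwise contribution of c_{t-1} to the gates
            self.p_f = nn.Parameter(torch.zeros(hidden_size))
            self.p_i = nn.Parameter(torch.zeros(hidden_size))

        def forward(self, x, state):
            h_prev, c_prev = state
            z = torch.cat([x, h_prev], dim=-1)
            # Previous cell state enters the forget and input gates (peephole)
            f = torch.sigmoid(self.W_f(z) + self.p_f * c_prev)
            i = torch.sigmoid(self.W_i(z) + self.p_i * c_prev)
            g = torch.tanh(self.W_c(z))
            c = f * c_prev + i * g          # new cell state
            o = torch.sigmoid(self.W_o(z))
            h = o * torch.tanh(c)
            return h, (h, c)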
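Innovation (2) couples the otherwise independent attention heads. One plausible reading, sketched below, mixes the per-head scaled dot-product similarity maps across the head dimension before a shared softmax normalization, in the spirit of talking-heads attention; the mixing layer head_mix and all shapes are illustrative assumptions rather than the thesis formulation.

    # Hedged sketch: the per-head (low-rank) similarity maps are
    # superimposed across heads before normalization, connecting the
    # originally independent sub-attention mechanisms.
    import torch
    import torch.nn as nn

    class CoupledMultiHeadAttention(nn.Module):
        def __init__(self, d_model: int, n_heads: int):
            super().__init__()
            assert d_model % n_heads == 0
            self.h, self.d_k = n_heads, d_model // n_heads
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)
            self.out_proj = nn.Linear(d_model, d_model)
            # Learned mixing across heads: superimposes per-head similarities
            self.head_mix = nn.Linear(n_heads, n_heads, bias=False)

        def forward(self, x):
            B, T, _ = x.shape
            def split(t):  # (B, T, d_model) -> (B, h, T, d_k)
                return t.view(B, T, self.h, self.d_k).transpose(1, 2)
            q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))
            # Per-head low-rank similarities: (B, h, T, T)
            scores = q @ k.transpose(-2, -1) / self.d_k ** 0.5
            # Couple the heads: mix similarity maps along the head axis
            scores = self.head_mix(scores.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
            attn = scores.softmax(dim=-1)   # normalize after superposition
            out = (attn @ v).transpose(1, 2).reshape(B, T, self.h * self.d_k)
            return self.out_proj(out)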
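Innovation (3), the dual-channel model, might be organized as below: a CNN channel over the spectrogram image for spatial features and a BiLSTM channel over the frame sequence for temporal features, fused by attention and classified through a fully connected layer. PyTorch's standard nn.MultiheadAttention stands in for the improved attention of innovation (2) so the sketch is self-contained; all layer sizes and the number of emotion classes (7, as in EMO-DB) are assumptions.

    # Illustrative sketch of the dual-channel architecture: CNN channel
    # for spatial features, BiLSTM channel for temporal features,
    # attention-based fusion, then a fully connected classification layer.
    import torch
    import torch.nn as nn

    class DualChannelSER(nn.Module):
        def __init__(self, n_mels: int = 64, n_classes: int = 7, d_model: int = 128):
            super().__init__()
            # Channel 1: CNN over the (1, n_mels, T) spectrogram "image"
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((1, None)),  # pool out the frequency axis
            )
            self.cnn_fc = nn.Linear(64, d_model)
            # Channel 2: BiLSTM over the (T, n_mels) frame sequence
            self.bilstm = nn.LSTM(n_mels, d_model // 2, batch_first=True,
                                  bidirectional=True)
            # Fusion: attention over the joined token sequence, then classify
            self.fusion = nn.MultiheadAttention(d_model, num_heads=4,
                                                batch_first=True)
            self.classifier = nn.Linear(d_model, n_classes)

        def forward(self, spec):                        # spec: (B, 1, n_mels, T)
            spatial = self.cnn(spec)                    # (B, 64, 1, T')
            spatial = spatial.squeeze(2).transpose(1, 2)        # (B, T', 64)
            spatial = self.cnn_fc(spatial)              # (B, T', d_model)
            frames = spec.squeeze(1).transpose(1, 2)    # (B, T, n_mels)
            temporal, _ = self.bilstm(frames)           # (B, T, d_model)
            tokens = torch.cat([spatial, temporal], dim=1)  # join both channels
            fused, _ = self.fusion(tokens, tokens, tokens)  # attention fusion
            return self.classifier(fused.mean(dim=1))   # pool and classify

    # Usage: logits = DualChannelSER()(torch.randn(2, 1, 64, 200))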
Keywords/Search Tags: CNN, RNN, Speech emotion recognition, Attention mechanism, Deep learning