
Speech Emotion Recognition Research Based On Deep Learning

Posted on: 2021-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Li
GTID: 2518306308472774
Subject: Control Science and Engineering

Abstract/Summary:
With the advancement of science and technology and the deepening of deep learning research, speech emotion recognition has gradually found widespread application in daily life. Most current work treats speech emotion recognition as a simple classification task: a variety of acoustic features are first extracted manually and assembled through feature engineering, and a classification network is then trained with deep learning to recognize emotion categories. This thesis instead sets out to explore the variant and invariant information in emotional speech. Starting from the source-filter model of speech production, it investigates how emotion information is expressed in speech and constructs a speech emotion space; it then builds the network structure best suited to learning emotion information from the structure of that space; finally, attention mechanisms are used to optimize the model so that the key parts of the speech signal are extracted and exploited. The main research contents of this thesis are as follows.

1. Studying speech emotion features based on the source-filter model of speech production. From a large number of acoustic feature parameters, those that reasonably express emotional invariance are screened according to their physical meaning, and their effectiveness is verified. Through comparative analysis, interference from the speaker, the linguistic content and other non-emotional information is removed at the input level as far as possible, while emotion-related information is retained. Two types of speech emotion space are constructed: a global-feature emotion space represented by the spectrogram, and a time-series emotion space composed of the fundamental frequency, MFCCs and their statistical values (a feature-extraction sketch follows the abstract).

2. Studying network structures suited to the speech emotion space. On top of the emotion spaces built from the selected speech parameters, deep networks suited to mining speech emotion information are constructed. From the perspective of network-structure modelling, different convolutional neural networks are used for the emotion space formed by the temporal characteristics of speech and for the spectrogram emotion space, whose input characterizes both the utterance level and the frame level. Two types of network are obtained, a spectrum network and a time-series network; by comparison, the combined convolutional spectrum network that takes the spectrogram as input achieves the better recognition rate.

3. Studying the extraction of the key parts of speech emotion information. Not every part of an utterance reflects emotional information, and the important emotional cues are diverse, so how to extract and exploit these critical parts is a key research question. This thesis proposes a series of methods for extracting the importance of speech emotion in the time domain, the frequency domain and the high-level feature space: one class acts on the original spectrogram input, while the other learns and uses local importance in the high-level space of the network. Using traditional methods, a self-attention mechanism, non-random Dropout, channel attention and their combinations, the network model is optimized and the system accuracy is improved over the baseline network. The best result is obtained by combining the self-attention method applied to the original spectrogram input with attention in the high-level layers of the network (a channel-attention sketch also follows the abstract).
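The feature-extraction sketch referenced in item 1: a minimal Python example of building the two emotion spaces described above (a log-magnitude spectrogram as the global-feature space; F0, MFCCs and their statistics as the time-series space). It assumes librosa and a hypothetical 16 kHz mono file "utterance.wav"; the frame sizes, MFCC count and F0 search range are illustrative choices, not the exact configuration used in the thesis.

```python
# Sketch of the two speech emotion spaces (illustrative parameters).
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)  # hypothetical input file

# Global-feature emotion space: log-magnitude spectrogram of the utterance.
spec = librosa.amplitude_to_db(
    np.abs(librosa.stft(y, n_fft=512, hop_length=160)), ref=np.max
)                                                  # shape: (freq_bins, frames)

# Time-series emotion space: fundamental frequency, MFCCs and their statistics.
f0 = librosa.yin(y, fmin=50, fmax=500, sr=sr, frame_length=1024, hop_length=160)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=512, hop_length=160)

n = min(f0.shape[-1], mfcc.shape[-1])              # align frame counts defensively
frame_feats = np.vstack([f0[np.newaxis, :n], mfcc[:, :n]])   # per-frame sequence
stats = np.concatenate([frame_feats.mean(axis=1),  # utterance-level statistics
                        frame_feats.std(axis=1)])
```

The per-frame matrix would feed a time-series network, while the spectrogram would feed a convolutional spectrum network.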
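The channel-attention sketch referenced in item 3: a small PyTorch example of re-weighting the high-level feature maps of a spectrogram CNN with a squeeze-and-excitation-style gate. The network depth, channel counts and class count are placeholders; the thesis's actual architecture and its combination with self-attention on the input spectrogram are not reproduced here.

```python
# Minimal sketch of channel attention inside a spectrogram classifier (illustrative).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Re-weights convolutional channels with a learned, input-dependent gate."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),              # squeeze: global average per channel
            nn.Flatten(),
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                         # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(x).unsqueeze(-1).unsqueeze(-1)
        return x * w                              # emphasize emotion-relevant channels

class SpectrogramNet(nn.Module):
    """Toy spectrum network: two conv blocks, channel attention, global pooling."""
    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
            ChannelAttention(64),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_classes))

    def forward(self, spec: torch.Tensor) -> torch.Tensor:   # spec: (batch, 1, freq, time)
        return self.head(self.features(spec))

logits = SpectrogramNet()(torch.randn(2, 1, 257, 200))       # e.g. two log-spectrograms
```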
Keywords/Search Tags:speech emotion recognition, source-filter model, spectrogram networks, attention mechanism