Research And Implementation Of Speech Emotion Recognition Algorithm Based On Spectrogram

Posted on: 2022-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: L J Yang
Full Text: PDF
GTID: 2518306575966889
Subject: Computer technology
Abstract/Summary:
As a medium that carries both human language and emotion, the voice signal is an essential means for people to obtain and disseminate information and one of the key factors in human-computer interaction, and it provides essential data for speech emotion recognition research. At present, speech recognition technology that converts speech signals into text output is very mature and widely used in daily life. However, the commercial application of speech emotion recognition still faces limitations, which lie mainly in the effectiveness of data collection methods and in the characteristics of the data. The time-domain and frequency-domain characteristics of the spectrogram describe a segment of speech well. At the same time, deep learning methods can, to a certain extent, learn better features during model training than traditional machine learning methods. Therefore, this thesis uses the two-dimensional spectrogram as the data entry point for speech emotion recognition research, selects relevant algorithms for model testing, and improves the model structure. The modified models improve recognition accuracy. The main work completed for this topic is as follows:

1. The research significance and background of speech emotion recognition are introduced, together with the current state of research in China and abroad and the problems that remain in existing work. The main content of this thesis and its organization are then described in detail.

2. The algorithm takes the LeNet-5 network model as its basis, adds two convolution and pooling layers, modifies the size of the convolution kernels, and uses L2 regularization to prevent overfitting. Spectrograms serve as the experimental input data: the network learns features from them during training and finally performs speech emotion classification through the fully connected layer. Experimental results show that this network model achieves better classification results on public datasets, with an accuracy of 74.6%.

3. Although a convolutional neural network has strong feature-learning capability and high classification accuracy, the capacity of a single network to learn features is limited. Therefore, to improve the model's ability to represent emotional features, and building on the work in Chapter 3, this thesis proposes a network model that combines a convolutional neural network with an attention mechanism. Spectrograms are used as input data for model training and emotion recognition. The emotion classification task was carried out effectively on the EMO-DB and CASIA corpora, with a classification accuracy of 76.9%.
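The abstract does not state how the spectrograms are computed, so the following is only a minimal sketch of the general idea: a signal is cut into overlapping frames, each frame is windowed and transformed to the frequency domain, and the log power per frequency bin forms one row of a two-dimensional time-frequency array. The function names, the frame length of 64 samples, the hop of 32 samples, the Hann window, and the toy sine-wave input are all assumptions chosen for illustration, not values taken from the thesis.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame is a list of log power values.

    Rows index time (frames), columns index frequency bins
    0 .. frame_len // 2. frame_len and hop are illustrative defaults.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for readability; a real pipeline would use an FFT.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # Small constant avoids log(0) on silent bins.
            row.append(math.log(abs(acc) ** 2 + 1e-10))
        frames.append(row)
    return frames


# Toy input (hypothetical): a 100 Hz tone sampled at 800 Hz, 256 samples.
sample_rate = 800
tone_hz = 100
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = log_spectrogram(signal)
# Frequency bin with the most energy in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

For this toy tone the energy concentrates in bin 8, since 100 Hz × 64 / 800 = 8. An array of this kind, treated as a single-channel image, is the sort of input a convolutional network such as the ones described above would consume.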
Keywords/Search Tags: speech emotion recognition, spectrogram, CNN, attention