
Research on Speech Emotion Recognition Based on Fractional Fourier Transform

Posted on: 2024-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: L R Huang
GTID: 2568307172481374
Subject: Control Science and Engineering
Abstract/Summary:
Speech emotion recognition has applications in voice customer service, medicine, early childhood education, and intelligent driving, so it has good development prospects and practical value. A key step is using computers to analyze emotion and to extract speech emotion features that yield high recognition rates. Most current feature processing for speech emotion recognition is based on the Fourier transform (FT). Because speech signals are time-varying, they are usually assumed to be short-time stationary, which forces a trade-off between time resolution and frequency resolution. A time-frequency analysis tool better suited to speech signals is therefore needed. To address these problems, this thesis applies the fractional Fourier transform (FrFT) to speech emotion recognition. The innovations and main research work are as follows:
(1) Because speech signals are time-varying, this thesis introduces a time-frequency transform tool, the fractional Fourier transform (FrFT), which can transform a signal into any intermediate domain between the time domain and the frequency domain. This addresses the Fourier transform's inability to localize a signal in time and frequency simultaneously, making it possible to obtain the time information associated with a given frequency. The FrFT also has an order parameter, which makes the algorithm more flexible.
(2) The ambiguity function (AF) is proposed for finding the optimal order of the FrFT; it obtains the optimal order of each frame more accurately and with less computation.
(3) The FrFT is applied to the extraction of MFCC speech features, improving their performance. Using FrFT-MFCC features in an LSTM neural network improves the accuracy of speech emotion recognition.
(4) As traditional acoustic features, MFCCs can express only limited time information. The FrFT is therefore used to extract speech spectrograms, which are fed to a CNN-LSTM neural network to further improve the accuracy of speech emotion recognition.
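The role of the FrFT order described in point (1) can be illustrated with a minimal sketch. It assumes the standard continuous FrFT kernel with rotation angle alpha = p*pi/2 and approximates the integral by a direct sum on a dimensionless grid. The abstract does not say how the thesis discretizes the transform, so the function name `frft_direct`, the grid, and the test signals below are hypothetical, and this is not the thesis's AF-based order search.

```python
import numpy as np


def frft_direct(x, t, u, p):
    """Riemann-sum approximation of the continuous FrFT of order p.

    Kernel (alpha = p*pi/2, alpha not a multiple of pi):
      K(t, u) = sqrt((1 - 1j*cot(alpha)) / (2*pi))
                * exp(1j*((t**2 + u**2)/2*cot(alpha) - u*t/sin(alpha)))
    Cost is O(len(u) * len(t)); an illustration, not a fast algorithm.
    """
    alpha = p * np.pi / 2.0
    sin_a = np.sin(alpha)
    if abs(sin_a) < 1e-9:
        raise ValueError("this sketch does not handle even-integer orders")
    cot_a = np.cos(alpha) / sin_a
    amp = np.sqrt((1.0 - 1j * cot_a) / (2.0 * np.pi))
    tt = t[None, :]  # input samples along columns
    uu = u[:, None]  # output samples along rows
    phase = 0.5 * cot_a * (tt**2 + uu**2) - uu * tt / sin_a
    dt = t[1] - t[0]
    return (amp * np.exp(1j * phase)) @ x * dt


# Hypothetical dimensionless grid; 513 points so that t = 0 is a sample.
t = np.linspace(-16.0, 16.0, 513)

# Sanity check: exp(-t^2/2) is unchanged by the FrFT at any order.
gauss = np.exp(-t**2 / 2.0)
gauss_err = np.max(np.abs(frft_direct(gauss, t, t, 0.5) - gauss))

# A Gaussian-windowed linear chirp with (assumed) chirp rate mu.
mu, sigma = 1.0, 3.0
chirp = np.exp(-t**2 / (2.0 * sigma**2)) * np.exp(1j * mu * t**2 / 2.0)

# The kernel's quadratic phase cancels the chirp when cot(alpha) = -mu.
p_matched = 1.0 + (2.0 / np.pi) * np.arctan(mu)

peak_ft = np.max(np.abs(frft_direct(chirp, t, t, 1.0)))  # ordinary FT
peak_matched = np.max(np.abs(frft_direct(chirp, t, t, p_matched)))
print(gauss_err, peak_ft, peak_matched)
```

At p = 1 the kernel reduces to the ordinary unitary Fourier transform. At the matched order the chirp's energy collapses into a much taller, narrower peak than the Fourier transform gives, which is the property that motivates choosing an order for each frame. How the thesis derives that order from the ambiguity function is not described in the abstract, so it is not reproduced here.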
Keywords/Search Tags: feature extraction, fractional Fourier transform, ambiguity function, MFCC, speech spectrogram