Research On Texture Feature Extraction Of Spectrogram Image For Speech Emotion Recognition

Posted on:2019-05-09

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Liu

Full Text:PDF

GTID:2428330548485902

Subject:Electronic and communication engineering

Abstract/Summary:

At present, the commonly used speech emotion features are mainly voice quality features, prosodic features, and spectral features. All of these focus on either the time domain or the frequency domain of the speech signal and rarely consider time-frequency correlation, so the extracted features are incomplete. The spectrogram links the two domains and makes it possible to study the time-frequency correlation of speech. On this basis, this thesis studies methods for extracting texture features from speech spectrograms, in the following two aspects.

1) The original Complete Local Binary Pattern (CLBP) feature has high dimensionality and relies too heavily on the central pixel. To address this, the thesis builds the Uniform CLBP_Sign (UCLBP_S) and the Improved CLBP_Magnitude (ICLBP_M) features. In addition, because the classic decision-level weighted voting fusion method is ineffective when the classifiers have roughly the same recognition performance, the thesis proposes a power exponential weighted fusion method. First, the original speech sample is converted into a spectrogram, which is then processed with a multi-scale, multi-directional log-Gabor filter to enhance the detail information of the image. Next, the block histogram features of UCLBP_S and ICLBP_M are extracted and concatenated into a new fused feature called ICLBP_S_M. Finally, using SVM classifiers, these three features are fused by decision-level weighting to perform speech emotion recognition.

2) The More Direction Weber Local Descriptor (MDWLD) operator is constructed to remedy the weakness of the Weber Local Descriptor (WLD) operator in representing gradient change information in the diagonal directions. Likewise, the Complete Gradient Center-Symmetric Local Directional Pattern (CGCS-LDP) is constructed because the Gradient Center-Symmetric Local Directional Pattern (GCS-LDP) cannot represent magnitude information, that is, the change in the edge response values of the image gradient. Because a single texture feature cannot fully represent image texture information, the thesis extracts the ICLBP_S_M, MDWLD, and CGCS-LDP features from the log-Gabor-filtered spectrogram and fuses the three at the decision level with weighting to perform speech emotion recognition. The experimental results show that this method effectively improves the performance of speech emotion recognition.
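The abstract does not give the formula for the power exponential weighted fusion, so the following Python sketch only illustrates the general idea under a stated assumption: each classifier's weight is taken to be proportional to its validation accuracy raised to an exponent. The function names, the accuracies, and the score vectors are all hypothetical and are not taken from the thesis. With an exponent of 1 this reduces to classic accuracy-weighted voting, where classifiers of similar accuracy receive nearly equal weights; a larger exponent widens the gap.

```python
def power_weights(accuracies, exponent):
    """Map per-classifier accuracies to normalized fusion weights.

    Assumption (not from the thesis): weight_i is proportional to
    accuracy_i ** exponent. exponent = 1 gives plain accuracy weighting.
    """
    powered = [a ** exponent for a in accuracies]
    total = sum(powered)
    return [p / total for p in powered]


def fuse(class_scores, weights):
    """Weighted sum of per-classifier class-score vectors.

    class_scores[i][c] is the score classifier i assigns to class c.
    Returns (index of the winning class, fused score vector).
    """
    n_classes = len(class_scores[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, class_scores))
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: fused[c])
    return best, fused


# Hypothetical validation accuracies of three SVMs, one per texture feature.
accuracies = [0.80, 0.82, 0.85]

# Hypothetical class scores for one utterance (two emotion classes).
class_scores = [
    [0.60, 0.40],  # classifier 1
    [0.55, 0.45],  # classifier 2
    [0.40, 0.60],  # classifier 3 (the most accurate one)
]

linear_weights = power_weights(accuracies, 1)   # nearly uniform weights
sharp_weights = power_weights(accuracies, 10)   # gaps are amplified

linear_label, linear_fused = fuse(class_scores, linear_weights)
sharp_label, sharp_fused = fuse(class_scores, sharp_weights)

print("linear weights:", linear_weights, "-> class", linear_label)
print("power weights: ", sharp_weights, "-> class", sharp_label)
```

With these made-up numbers, plain accuracy weighting lets the two weaker classifiers outvote the strongest one, while the exponent of 10 gives the strongest classifier enough weight to change the fused decision. The exponent is a free parameter in this sketch and would need to be tuned on held-out data.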

Keywords/Search Tags:

Speech Emotion Recognition, Improved Complete Local Binary Pattern, Power Exponential Class Weighted Fusion, More Direction Weber Local Descriptor, Complete Gradient Center-Symmetric Local Directional Pattern

Related items

1	Face Recognition Research Based On Improved Local Directional Pattern
2	Based On Local Binary Pattern And Weber Local Descriptor Face Recognition
3	Research On Speech Emotion Recognition Technology Based On Nonlinear Feature And Spectral Feature Extraction
4	Study Of Facial Expression Recognition Based On Partial Occlusion
5	Face Recognition Based On Improved Center-symmetric Local Binary Pattern
6	Face Recognition Research Based On Local Binary And Direction Pattern
7	Emotion Recognition Research Based On Bimodal Information Fusion
8	Face Recognition Based On Intensity And Gradient Local Directional Pattern
9	Research On Facial Expression Recognition Method Based On Multiple Features
10	A Study Of Local Ternary Pattern In Illumination Face Recognition