A Study On Nonlinear Feature Extraction And Feature Optimization Of Emotional Speech

Posted on: 2019-07-11  Degree: Master  Type: Thesis
Country: China  Candidate: C X Song  Full Text: PDF
GTID: 2348330569479534  Subject: Information and Communication Engineering
Abstract/Summary:
Speech is one of the most easily accessible carriers of emotional information, and it conveys information at multiple levels. In daily communication, spoken language carries not only the semantic content of the words themselves but also an expression of the speaker's emotional state at that moment. Perceiving emotion is therefore important in social interaction for inferring the other party's emotional state and intentions. Emotional speech recognition is an effective means of affective computing. Based on the phonation principle of the speech signal, it extracts effective emotional feature parameters from the signal so that a computer can associate and map these parameters to emotional states in a way as close as possible to that of humans, and thereby judge the emotional state.

Starting from the research background and significance of emotional speech recognition, this thesis reviews in detail the research trends and shortcomings of emotional speech recognition and of nonlinear features. Because the feature parameters used in current emotional speech recognition are not comprehensive, it proposes nonlinear geometric features based on phase space reconstruction of emotional speech and a feature optimization method based on the emotional dimensional space. The main contents are as follows:

(1) The basic composition of an emotional speech recognition system is introduced. Preprocessing and the extraction of typical traditional emotional acoustic features (prosodic features, phonetic features and MFCC) are presented theoretically and simulated experimentally. An emotional speech recognition system based on acoustic features is built, with the EMO-DB emotional speech database as the experimental data and a support vector machine as the recognition model.

(2) Given the nonlinearity of the speech generation mechanism, nonlinear time series analysis is applied to emotional speech signals. Phase space reconstruction is used to map one-dimensional emotional speech signals, with the same semantic content but different emotional states, into a higher-dimensional space. The differences observed in this space verify the nonlinear generation mechanism of emotional speech and provide the experimental basis for extracting nonlinear features in the next step. To analyze the state variables of the emotional speech signal system from a macroscopic perspective, nonlinear attribute features based on phase space reconstruction are extracted. Their ability to distinguish emotional states is validated by how these feature parameters behave across different emotional states of the same semantic content.

(3) Starting from the nonlinear mechanism of emotional speech, the geometric indices of the attractor skeleton structure in phase space are analyzed from a microscopic perspective, and nonlinear geometric features based on phase space reconstruction (five kinds of trajectory-based descriptor contours) are extracted. A qualitative analysis of the relationship between the five trajectory-based descriptor contours and emotional states verifies that the descriptors can serve as effective new features for distinguishing emotional states. Five emotional speech recognition experiments are carried out, using prosodic, acoustic, MFCC, nonlinear attribute and nonlinear geometric features respectively, which verify the superiority of the nonlinear attribute and nonlinear geometric features in distinguishing emotional states.

(4) Based on the distribution of the nonlinear attribute and nonlinear geometric features in the emotional dimensional space, a feature optimization method based on the dimensional space model is proposed. First, the feasibility of the method is verified by a feature optimization pre-experiment based on nonlinear global features. Then three groups of emotional speech recognition experiments, based on feature-level fusion, feature selection and feature optimization, are carried out on the complete emotional feature set composed of prosodic, acoustic, MFCC, nonlinear attribute and nonlinear geometric features. The results show that the optimized feature parameters effectively improve the recognition performance of the network, which verifies the effectiveness and applicability of the method.
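
As an illustration of the phase space reconstruction step mentioned in the abstract, the following is a minimal sketch of time-delay embedding, the standard way to map a one-dimensional signal into a higher-dimensional space. It is not the thesis's implementation: the function name `delay_embed`, the embedding dimension `dim`, the delay `tau` and the toy sine-wave frame are all assumptions chosen for illustration, and the abstract does not state how these parameters were selected.

```python
import math


def delay_embed(signal, dim, tau):
    """Time-delay embedding of a 1-D sequence.

    Returns a list of delay vectors
    [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].
    `dim` and `tau` are illustrative parameters, not values from the thesis.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal is too short for this dim and tau")
    return [
        [signal[i + k * tau] for k in range(dim)]
        for i in range(n_vectors)
    ]


# Toy stand-in for one frame of a speech signal (an assumption, not real data).
frame = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]

# Each row is one point of the reconstructed trajectory in 3-D phase space.
trajectory = delay_embed(frame, dim=3, tau=5)
print(len(trajectory), len(trajectory[0]))  # 90 3
```

Attribute or geometric features of the kind the abstract describes would then be computed on `trajectory` rather than on the raw frame; the specific descriptors are not given in the abstract and are therefore not sketched here.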
Keywords/Search Tags:emotional speech recognition, phase space reconstruction, nonlinear attribute features, nonlinear geometric features, feature optimization