Speech emotion recognition is one of the key technologies of human-computer interaction and has been widely applied in the service industry, criminal security, communication, biomedicine, education, industry and many other fields. The central problems in speech emotion recognition are finding a classifier model that can effectively identify speech emotion and extracting feature parameters that can effectively represent it. Based on an analysis of traditional speech emotion recognition models, this paper proposes a fusion-based speech emotion recognition model, which achieves a higher average recognition rate than the traditional models. The fusion-based recognition system can also be embedded as a subsystem in the machine pet we have developed, which can help to prevent and treat psychological disorders such as depression.

The main contents of this paper are as follows:

1. We make an in-depth, objective and comprehensive review and comparison of typical emotional speech databases at home and abroad, and select the CASIA corpus, recorded by the Institute of Automation, Chinese Academy of Sciences, as the database best suited to the fusion algorithm. We further decide to study four basic emotional states in this paper: surprise, calm, sadness and anger.

2. We summarize common pre-processing methods, such as sampling, quantization, pre-emphasis, framing and windowing. Finally, we adopt a two-level discrimination endpoint detection algorithm to overcome the endpoint-detection problem of speech signals.

3. We summarize how short-term energy, short-term zero-crossing rate, fundamental frequency, MFCC and formants change across the four basic emotions of surprise, calm, sadness and anger. We then propose a fusion speech emotion recognition model based on eight characteristic parameters of the speech signal: first-formant change rate, first-formant maximum, first-formant average, number of local poles, pronunciation duration, average energy, maximum energy and average fundamental frequency. Experimental results show that the fusion model outperforms the single recognition models.
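The pre-processing steps named above (pre-emphasis, framing, windowing) can be sketched as follows. This is a minimal illustration, not code from the paper; the frame length, hop size and pre-emphasis coefficient are common defaults (25 ms / 10 ms at 16 kHz, alpha = 0.97) chosen here for illustration.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]: boosts high frequencies of the speech signal
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len=400, hop=160):
    # Split the signal into overlapping frames (400 samples = 25 ms,
    # 160 samples = 10 ms hop, at a 16 kHz sampling rate)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len]
                     for i in range(n_frames)])

def window_frames(frames):
    # Apply a Hamming window to each frame to reduce spectral leakage
    return frames * np.hamming(frames.shape[1])

# Demo on a synthetic 1-second, 16 kHz sine tone
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)
frames = window_frames(frame_signal(pre_emphasis(x)))
print(frames.shape)  # 98 frames of 400 samples each
```

Short-term features such as energy and zero-crossing rate are then computed per frame on the windowed output.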
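A two-level (double-threshold) endpoint detection scheme of the kind referred to in item 2 can be sketched as below: a high energy threshold first locates the clearly voiced region, then a low energy threshold together with the zero-crossing rate extends the boundaries outward to capture weaker, unvoiced speech. This is a generic sketch, not the paper's exact algorithm; the threshold ratios and function names are illustrative assumptions.

```python
import numpy as np

def short_time_energy(frames):
    # Per-frame energy: sum of squared samples
    return np.sum(frames.astype(float) ** 2, axis=1)

def zero_crossing_rate(frames):
    # Fraction of adjacent sample pairs whose sign changes
    return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

def detect_endpoints(frames, high_ratio=0.25, low_ratio=0.05, zcr_thresh=0.1):
    # Level 1: find frames whose energy exceeds a high threshold.
    # Level 2: extend the region outward while energy stays above a low
    #          threshold or the ZCR still indicates unvoiced speech.
    energy = short_time_energy(frames)
    zcr = zero_crossing_rate(frames)
    e_high = high_ratio * energy.max()
    e_low = low_ratio * energy.max()
    above = np.where(energy > e_high)[0]
    if len(above) == 0:
        return None
    start, end = above[0], above[-1]
    while start > 0 and (energy[start - 1] > e_low or zcr[start - 1] > zcr_thresh):
        start -= 1
    while end < len(energy) - 1 and (energy[end + 1] > e_low or zcr[end + 1] > zcr_thresh):
        end += 1
    return start, end

# Demo: 0.2 s silence + 0.3 s tone + 0.2 s silence at 8 kHz,
# non-overlapping 200-sample frames (28 frames total)
fs = 8000
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(int(0.3 * fs)) / fs)
signal = np.concatenate([np.zeros(1600), tone, np.zeros(1600)])
frames = signal.reshape(-1, 200)
start, end = detect_endpoints(frames)
print(start, end)  # speech spans frames 8..19
```

In practice the low thresholds are usually calibrated from leading silence frames rather than from the global maximum, but the two-level structure is the same.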