Iterating on and extending human intelligence is the fundamental goal of artificial intelligence (AI), and one of the most meaningful pursuits for humanity. On the "smart" side, decades of research have made machines more capable than humans in some respects; on the emotional side, however, a machine still cannot match a two-year-old child. Emotion is therefore a hallmark of the coming robot revolution, and the sticking point for robots entering society. In human-computer interaction, voice is undoubtedly the most natural modality; acquiring emotional information from speech and applying it in decision making are the core tasks of affective computing. Existing speech emotion recognition borrows its features and models directly from speech recognition or general pattern recognition, so its ability to handle multiple emotions is very limited and falls short of practical robot applications.

This dissertation enhances speech emotion recognition in terms of speech features, classification models, and optimization, and applies the results on a humanoid robot platform, giving the robot not only intelligence but also motion and a rudimentary mind. The main work of this dissertation is:

(1) Addressing the fact that traditional acoustic features characterize only the speech signal itself, new features that capture the properties of emotional speech are comprehensively integrated with classic features and with classification strategies from machine learning, and the integrated set is applied to emotion recognition in speech.

(2) Since the classical SVM (support vector machine) models multi-class problems weakly, this dissertation studies its multi-class modeling ability (not limited to speech signals). For the class-imbalance problem in multi-class classification, a two-stage classification scheme is proposed and shown to handle the imbalance well.
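The two-stage idea can be illustrated with a minimal, hypothetical sketch: a first stage separates the rare class from the rest, and a second stage discriminates only among the majority classes. Nearest-centroid classifiers and the toy data below are stand-ins for illustration, not the dissertation's actual SVM models or corpus:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced data: classes 0 and 1 are common, class 2 is rare.
X0 = rng.normal([0.0, 0.0], 0.3, (100, 2))
X1 = rng.normal([3.0, 0.0], 0.3, (100, 2))
X2 = rng.normal([1.5, 3.0], 0.3, (10, 2))   # minority class
X = np.vstack([X0, X1, X2])
y = np.array([0] * 100 + [1] * 100 + [2] * 10)

def centroid_fit(X, y):
    # One centroid per class; a placeholder for any trainable classifier.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def centroid_predict(model, Xq):
    classes = list(model)
    d = np.stack([np.linalg.norm(Xq - model[c], axis=1) for c in classes], axis=1)
    return np.array(classes)[d.argmin(axis=1)]

# Stage 1: minority-vs-rest binary problem.
stage1 = centroid_fit(X, (y == 2).astype(int))
# Stage 2: trained only on the majority classes.
maj = y != 2
stage2 = centroid_fit(X[maj], y[maj])

def predict(Xq):
    is_minority = centroid_predict(stage1, Xq) == 1
    out = centroid_predict(stage2, Xq)
    out[is_minority] = 2
    return out
```

Splitting the decision this way keeps the rare class from being swamped in a single multi-class model: stage 1 can be rebalanced or thresholded for the minority class without disturbing the majority-class boundary learned in stage 2.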
The final results are competitive with domestic and international studies.

(3) The studies in (1) and (2) rely heavily on prior experience, so an adaptive, self-learning model for extracting emotion features is needed. The optimal feature set obtained through extensive experiments is therefore integrated with a CNN (convolutional neural network) deep learning model, achieving the best recognition rate in the present study; finally, the ADADELTA algorithm is used to speed up convergence.
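ADADELTA (Zeiler, 2012) adapts each parameter's step size from running averages of squared gradients and squared updates, so no global learning rate has to be hand-tuned — which is why it is used above to speed up CNN training. A minimal sketch of the update rule on a one-dimensional toy quadratic (the objective is illustrative only, not from the dissertation):

```python
import numpy as np

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update; state = (E[g^2], E[dx^2])."""
    Eg2, Edx2 = state
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2                  # running avg of squared gradients
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad    # unit-corrected step, no learning rate
    Edx2 = rho * Edx2 + (1 - rho) * dx ** 2                  # running avg of squared updates
    return dx, (Eg2, Edx2)

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x = 0.0
state = (0.0, 0.0)
for _ in range(5000):
    g = 2 * (x - 3)
    dx, state = adadelta_step(g, state)
    x = x + dx
```

The ratio of the two running averages gives the step its units of the parameter itself, so early updates are small and grow automatically as progress accumulates — the property that accelerates convergence without learning-rate tuning.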