Research On Speech Emotion Recognition Based On The Fusion Of ANN And GMM

Posted on: 2017-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Yuan
Full Text: PDF
GTID: 2348330491962754
Subject: Information and Communication Engineering
Abstract/Summary:
Human-computer interaction is an information exchange process that aims to complete clearly defined tasks in a particular way. It not only demonstrates the intelligence of the computer but also allows the computer to serve people better. Emotion recognition from speech is crucial to the development of human-computer interaction. Speech emotion recognition is an interdisciplinary field spanning cognitive science, physiology, psychology, linguistics, and computer science, and it has attracted increasing attention from research institutions and researchers in China and abroad. The research in this thesis focuses on artificial neural networks and the Gaussian mixture model. To further improve the accuracy of speech emotion recognition, the thesis proposes algorithm-level improvements to the original model structures. Finally, a hybrid model combining a Gaussian mixture model and a neural network is proposed for speech emotion recognition. The main work and contributions of this thesis are as follows:
(1) The thesis outlines the research background and significance of speech emotion recognition, summarizes the current state of research in China and abroad, and describes the theoretical and technical problems that still require further study.
(2) The thesis outlines the basic knowledge associated with speech emotion recognition, including the definition of emotion and emotion classification. An emotional speech database containing four emotions (happiness, anger, surprise, and sadness) was established for the experiments, and the speech signals in the database were preprocessed. The thesis then briefly describes the methods used to extract emotional feature parameters and to normalize the emotional feature vectors.
(3) An Elman recurrent neural network optimized with the gravitational search algorithm (GSA) is proposed for speech emotion recognition. GSA optimizes the model parameters by applying the law of gravity to find the best particle position.
(4) The expectation-maximization (EM) algorithm used to optimize the Gaussian mixture model (GMM) has drawbacks, such as convergence to a local optimum. The thesis studies an improved algorithm for the GMM that sets an initial GMM and then uses iterative methods to modify the parameter M and the GMM.
(5) A hybrid of a GMM and a deep belief network (DBN) is applied to speech emotion recognition. The thesis constructs the DBN from restricted Boltzmann machines (RBMs), and the combination of a GMM with multidimensional output and the DBN is then used for speech emotion recognition.
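To make item (3) more concrete, the following is a minimal sketch of a gravitational search loop in its commonly described form, not the thesis's exact variant. The function name `gsa_minimize`, the constants `g0` and `alpha`, the search bounds, and the toy `sphere` objective are all assumptions made for illustration. In the thesis setting the objective would presumably be the training error of the Elman network as a function of its flattened weights.

```python
import math
import random


def gsa_minimize(objective, dim, n_agents=20, n_iters=100,
                 bounds=(-5.0, 5.0), g0=100.0, alpha=20.0, seed=0):
    """Minimize `objective` over a box with a basic gravitational search."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, float("inf")

    for t in range(n_iters):
        fit = [objective(x) for x in pos]
        # Remember the best position evaluated so far.
        i_best = min(range(n_agents), key=fit.__getitem__)
        if fit[i_best] < best_f:
            best_f, best_x = fit[i_best], list(pos[i_best])

        # Map fitness to normalized masses: lower cost gives a heavier agent.
        f_min, f_max = min(fit), max(fit)
        if f_max == f_min:
            mass = [1.0 / n_agents] * n_agents
        else:
            raw = [(f_max - f) / (f_max - f_min) for f in fit]
            total = sum(raw)
            mass = [m / total for m in raw]

        # The gravitational constant decays over time, and the set of
        # attracting agents shrinks from all agents to the single best one.
        g = g0 * math.exp(-alpha * t / n_iters)
        k_best = max(1, round(n_agents - (n_agents - 1) * t / n_iters))
        attractors = sorted(range(n_agents), key=fit.__getitem__)[:k_best]

        for i in range(n_agents):
            acc = [0.0] * dim
            for j in attractors:
                if j == i:
                    continue
                dist = math.dist(pos[i], pos[j])
                pull = rng.random() * g * mass[j] / (dist + 1e-12)
                for d in range(dim):
                    acc[d] += pull * (pos[j][d] - pos[i][d])
            for d in range(dim):
                # Randomly damped velocity plus acceleration, clipped to the box.
                vel[i][d] = rng.random() * vel[i][d] + acc[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

    return best_x, best_f


def sphere(x):
    """Toy objective standing in for a network's training error."""
    return sum(v * v for v in x)


best_x, best_f = gsa_minimize(sphere, dim=3)
print(best_x, best_f)
```

The sketch returns the best position it evaluated rather than the final swarm state, so the result never gets worse than the best initial agent.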
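For item (4), a small sketch of standard EM for a one-dimensional GMM may help show why the starting point matters: EM only climbs the likelihood from wherever it is initialized. This is the textbook procedure, not the improved algorithm studied in the thesis. The function names, the variance floor, and the synthetic two-cluster data are assumptions for illustration.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iters=50):
    """Run EM for a 1-D GMM starting from the given initial parameters.

    Returns (weights, means, variances, history), where history holds the
    data log-likelihood under the parameters at the start of each iteration.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    history = []
    for _ in range(n_iters):
        # E-step: responsibility of each component for each sample.
        resp, log_lik = [], 0.0
        for x in data:
            joint = [weights[c] * gauss_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)

        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / n
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / n_c
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / n_c
            variances[c] = max(var, 1e-6)  # assumed floor to avoid collapse
    return weights, means, variances, history


# Synthetic data: two well-separated clusters (illustrative only).
rng = random.Random(0)
data = ([rng.gauss(-2.0, 0.5) for _ in range(100)]
        + [rng.gauss(3.0, 0.5) for _ in range(100)])
weights, means, variances, history = em_gmm_1d(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
print(means, history[-1])
```

Running the same function from several different initial parameter sets and comparing the final entries of `history` is one simple way to observe that different starting points can end at different likelihood values.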
Keywords/Search Tags: speech emotion recognition, Elman recurrent neural network, gravitational search algorithm, Gaussian mixture model, deep belief network