Research And Implementation Of Speech Emotion Recognition In Home Environment

Posted on: 2021-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: K X Tian
Full Text: PDF
GTID: 2518306476452644
Subject: Pattern Recognition Theory and Applications
Abstract/Summary:
With the development of science and technology, artificial intelligence has become one of the most popular technological topics, and more and more artificial intelligence products have moved from theoretical research to practical application. Artificial intelligence now touches every aspect of family life: from mobile phones to TVs, from speakers to refrigerators, and from sweeping robots to service robots, people's lives have changed dramatically. As an important form of human-computer interaction, voice interaction is widely used in smart homes. For smart home products to achieve more natural and efficient human-computer interaction, they must be able to perceive and distinguish human emotions, so realizing speech emotion recognition in the home environment is of great significance. This thesis aims to make smart home products more humanized by studying speech emotion recognition in the home environment. It studies the key technologies of speech emotion recognition, analyzes the characteristics of speech in the home environment, and applies both traditional machine learning algorithms and deep learning algorithms to the task. The main research work is as follows.

First, the MMSE-LSA front-end speech enhancement algorithm is optimized. A noise-robust sub-band energy-entropy ratio is used to detect speech endpoints, which better separates speech segments from non-speech segments and establishes an initial noise model from the non-speech segments. A time-recursive averaging noise spectrum estimation algorithm is adopted in which, as an innovation, the posterior signal-to-noise ratio is used to calculate the probability that the speaker's voice is present, and different thresholds are used for harmonics in different frequency bands. The speech emotion classifier is a support vector machine (SVM), which performs well among traditional machine learning methods. Experiments show that the optimized MMSE-LSA algorithm improves both speech quality and the accuracy of speech emotion recognition overall.

Second, taking into account the strong feature extraction and classification capabilities of deep learning, a home environment speech emotion recognition model based on convolutional neural networks (CNNs) is proposed, and the impact of different data augmentation strategies on speech emotion recognition is discussed. Drawing on how CNNs are applied in the image domain, this thesis proposes three CNN models with different attention strategies. The results show that attention in the time dimension improves the accuracy of speech emotion recognition in the home environment.

Finally, to address the drop in accuracy in speaker-independent speech emotion recognition, this thesis introduces a speaker recognition module that converts speaker-independent speech emotion recognition into speaker-dependent speech emotion recognition, and it designs and implements a home environment speech emotion recognition software package. The software uses a simple client/server (C/S) architecture and is developed in Python, providing users with a simple and efficient graphical interface. Testing shows that the software implemented in this thesis has good interactivity.
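
The time-recursive averaging noise estimate mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the logistic mapping from posterior SNR to speech presence probability, the function names, and the parameter values (alpha, threshold, slope) are all assumptions chosen for illustration, since the abstract does not give the actual formula. The sketch only shows the general idea that the noise estimate is updated slowly where speech is likely and faster where it is not, with an optional per-bin threshold.

```python
import math


def speech_presence_probability(posterior_snr, threshold=3.0, slope=2.0):
    """Map posterior SNR to a value in (0, 1).

    Hypothetical logistic mapping; the thesis's actual rule is not
    given in the abstract.
    """
    return 1.0 / (1.0 + math.exp(-slope * (posterior_snr - threshold)))


def update_noise_psd(noise_psd, frame_power, alpha=0.9, thresholds=None):
    """Apply one time-recursive averaging step to every frequency bin.

    noise_psd   -- current noise power estimate per bin
    frame_power -- |Y(k)|^2 of the new noisy frame per bin
    alpha       -- base smoothing factor (assumed value)
    thresholds  -- optional per-bin posterior-SNR thresholds (assumed)
    """
    updated = []
    for k, (noise, power) in enumerate(zip(noise_psd, frame_power)):
        # Posterior SNR: noisy power relative to the current noise estimate.
        posterior_snr = power / max(noise, 1e-12)
        thr = thresholds[k] if thresholds is not None else 3.0
        p = speech_presence_probability(posterior_snr, threshold=thr)
        # The smoothing factor approaches 1 when speech is likely,
        # which nearly freezes the noise estimate in that bin.
        alpha_eff = alpha + (1.0 - alpha) * p
        updated.append(alpha_eff * noise + (1.0 - alpha_eff) * power)
    return updated
```

In use, the initial `noise_psd` would come from the non-speech segments found by endpoint detection, and `update_noise_psd` would be called once per frame. A bin with a very high posterior SNR leaves the estimate almost unchanged, while a bin close to the noise floor pulls the estimate toward the new frame's power.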
Keywords/Search Tags: Speech emotion recognition, Home environment, Speech enhancement, Attention-CNN, Speaker-dependent speech emotion recognition