Research On Robust Speech Recognition In Complex Environments

Posted on: 2019-12-11 | Degree: Master | Type: Thesis
Country: China | Candidate: T X Guo | Full Text: PDF
GTID: 2428330548967054 | Subject: digital media technology
Abstract/Summary:
Speech interaction is one of the most important natural methods of human-computer interaction, and it increasingly resembles interaction between people. It is ever more widely used in daily life, work, study, entertainment, and other fields. However, speech signals are sensitive to the environment, and real environments contain complex and diverse noise, which degrades speech recognition performance. Robust speech recognition technology therefore plays an important role in the practical application of speech interaction systems in complex environments, and improving the recognition rate of the speaker's speech content is its key research direction. Addressing the robustness of speech recognition in complex environments, this thesis builds on speech signal processing, feature extraction, and model matching in speech recognition, as well as on previous research, to explore robust speech recognition technology. Several robust speech recognition algorithms are constructed and verified in a dialogue system. The main work of the thesis is as follows.

First, the thesis reviews the development of robust speech recognition technology, classifies it, and analyzes it from three aspects: speech signal, feature, and model. Two families of robust speech recognition techniques are summarized: those based on a linear mapping hypothesis and those based on a nonlinear mapping hypothesis.

For the linear mapping hypothesis, a feature-mapping robust speech recognition algorithm based on mapping parameter estimation is proposed. Maximum likelihood estimation between the mapped features and the GMM of the training features determines the gain matrix and the offset matrix, from which the new mapped features are obtained. A general formula for feature-mapping robust speech recognition is derived from this method: the gain matrix and the offset matrix are abstracted into a single parameter matrix W, so that features can be mapped by estimating W. Experimental results show that the algorithm significantly improves the recognition rate.

Building on the robust speech recognition algorithm that uses KL-divergence-based feature mapping, a feature-mapping algorithm based on the Bhattacharyya distance is proposed. Prior information is introduced to model the observed features as a GMM, and the Bhattacharyya distance between this GMM and the GMM of the training features is minimized to estimate the parameter matrix W. Experiments show that the algorithm is an effective way to enhance the robustness of speech recognition.

For the nonlinear mapping hypothesis, robust speech recognition based on a deep regression network is explored and its basic framework is analyzed. The deep network serves as a regression model that automatically learns the complex relationship between the observed features and the clean features, and the features from complex environments are reconstructed to obtain approximately clean features.

In addition, a speech dialogue system for complex environments is designed and developed, incorporating the robust speech recognition techniques studied in this thesis. Operating results show that the system improves the recognition rate and makes speech interaction friendlier. Finally, the thesis summarizes the research work and points out its shortcomings and directions for further research.
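
The abstract does not give the thesis's formulas, so the following is only an illustrative sketch of two ingredients it mentions: mapping features through a single parameter matrix W that stacks a gain matrix and an offset vector, and the standard closed-form Bhattacharyya distance between two Gaussians (the single-component case of a GMM). The function names, the layout W = [A | b], and the toy data are assumptions made for this example; the estimation procedure for W used in the thesis is not reproduced here.

```python
import numpy as np


def apply_mapping(W, X):
    """Map features X of shape (n_frames, d) with W = [A | b] of shape (d, d + 1).

    Assumed layout (not taken from the thesis): the first d columns of W are
    the gain matrix A and the last column is the offset b.
    """
    # Append a constant 1 to every frame so that W @ [x; 1] equals A @ x + b.
    X_ext = np.hstack([X, np.ones((X.shape[0], 1))])
    return X_ext @ W.T


def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Closed-form Bhattacharyya distance between two multivariate Gaussians."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mean term: (1/8) * diff^T cov^{-1} diff.
    term_mean = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance term: (1/2) * ln(det(cov) / sqrt(det(cov1) * det(cov2))).
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term_cov = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return term_mean + term_cov


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(500, 2))   # toy stand-in for training features
    noisy = 2.0 * clean + 1.0           # toy affine distortion, not real noise

    # Hand-picked W that inverts the toy distortion (A = 0.5 * I, b = -0.5).
    W = np.hstack([0.5 * np.eye(2), -0.5 * np.ones((2, 1))])
    mapped = apply_mapping(W, noisy)

    def fit(X):
        # Fit a single Gaussian: sample mean and sample covariance.
        return X.mean(axis=0), np.cov(X, rowvar=False)

    print("distance before mapping:", bhattacharyya_gaussian(*fit(noisy), *fit(clean)))
    print("distance after mapping: ", bhattacharyya_gaussian(*fit(mapped), *fit(clean)))
```

In this toy setup W is chosen by hand; in a distance-minimizing scheme such as the one the abstract describes, W would instead be estimated by minimizing a distance of this kind between the model of the mapped features and the model of the training features.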
Keywords/Search Tags: speech interaction, robust speech recognition, feature mapping, KL divergence, Bhattacharyya distance, deep regression network