
Study On MFCC And Lasso Reverberation Suppression Of Feature Extraction Algorithm Of Speech Recognition

Posted on: 2016-08-21  Degree: Master  Type: Thesis
Country: China  Candidate: X W Zhang  Full Text: PDF
GTID: 2348330461480194  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Growing demand for communication between humans and computers has driven progress in speech recognition technology. Speech recognition is based on algorithmic and probabilistic analysis and does not require special knowledge of signal processing. It draws on many fields, including linguistics, signal processing, computer science, pattern recognition, communication and information theory, physics, psychology, machine learning, and deep neural networks. As a research focus in high technology, speech recognition has made great progress. The extraction of speech feature parameters plays an important role in the development of speech recognition technology and affects both the recognition rate and the robustness of a speech system. Existing feature extraction algorithms have problems: spectral leakage reduces the precision of the feature parameters, and reverberation reduces the precision of the signal received by microphones. Research on feature extraction algorithms for speech recognition is therefore particularly important. This thesis aims to optimize the feature parameters used in speech recognition. It proposes an MFCC algorithm based on an improved Kaiser window and a Lasso reverberation suppression algorithm based on a deep neural network acoustic model. Both effectively optimize the extracted feature parameters and improve the performance of the speech recognition system.

To address the problem that the spectral leakage of the Kaiser window reduces the precision of MFCC feature parameters, an MFCC feature extraction algorithm based on an improved Kaiser window is proposed. The improved Kaiser window introduces weighted impulse functions at the frequencies corresponding to the side-lobe peaks of the Kaiser window, which resolves the conflict between main-lobe width and side-lobe amplitude and reduces spectral leakage.
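The conflict between main-lobe width and side-lobe amplitude mentioned above can be seen with the standard Kaiser window alone. The following is a minimal sketch using NumPy's `np.kaiser`. It does not implement the improved window, whose weighting is not specified in this abstract. The frame length, FFT size, `beta` values, and helper names are assumptions chosen for illustration.

```python
import numpy as np

def window_spectrum_db(window, n_fft=8192):
    """Magnitude spectrum of a window in dB, normalised to 0 dB at DC."""
    spectrum = np.abs(np.fft.rfft(window, n_fft))
    return 20.0 * np.log10(spectrum / spectrum[0] + 1e-12)

def main_lobe_and_side_lobe(window, n_fft=8192):
    """Return (first-null bin index, peak side-lobe level in dB)."""
    mag_db = window_spectrum_db(window, n_fft)
    # The main lobe ends where the spectrum first stops falling.
    rising = np.where(np.diff(mag_db) > 0)[0]
    first_null = int(rising[0])
    return first_null, float(mag_db[first_null:].max())

frame_len = 400  # assumed frame length (25 ms at 16 kHz)
for beta in (4.0, 8.0):
    null_bin, side_lobe_db = main_lobe_and_side_lobe(np.kaiser(frame_len, beta))
    print(f"beta={beta}: first null at bin {null_bin}, "
          f"peak side lobe {side_lobe_db:.1f} dB")
```

With the standard window, raising `beta` lowers the side lobes but pushes the first null outward, so the main lobe widens. That is the trade-off the improved window is designed to avoid.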
Theoretical analysis shows that the main-lobe width of the Kaiser window remains the same while the side-lobe attenuation increases. Experimental results show that MFCC feature extraction based on the improved Kaiser window achieves a higher speech recognition rate than the original algorithm.

To address the problem that reverberation and background noise reduce the precision of the signal received by microphones, a Lasso reverberation suppression algorithm based on a deep neural network acoustic model is proposed. The algorithm uses the Lasso model to compute the parameters of a sparse linear predictor and estimate the late reverberation. The spectrum of the late reverberation is then subtracted from the spectrum of the reverberant signal to obtain the reverberation-suppressed signal. Spectral analysis shows that the spectrum after Lasso-based reverberation suppression is clearer than the spectrum of the reverberant signal. In the experiments, the Lasso algorithm is applied to a speech recognition system with a deep neural network acoustic model, and the word error rate after Lasso reverberation suppression is lower than the word error rate on the reverberant signal. The results show that, on both artificially simulated data and real data, recognition performance with the deep neural network acoustic model improves for all three variants of the method: element-based, frame-based, and utterance-based. The computation time of utterance-based Lasso reverberation suppression is about half that of the other two variants, which makes it suitable for speech recognition systems that operate on real data. The Lasso reverberation suppression method is suitable for both static and dynamic data.
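The two steps described above, sparse linear prediction of the late reverberation followed by spectral subtraction, can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation. It works on a magnitude spectrogram, predicts each frame from delayed past frames in the same frequency bin, and solves the Lasso problem with a plain ISTA loop. The function names and the parameters `delay`, `order`, `lam`, and `floor` are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Minimise 0.5*||A w - y||^2 + lam*||w||_1 with ISTA."""
    w = np.zeros(A.shape[1])
    lipschitz = np.linalg.norm(A, 2) ** 2  # largest eigenvalue of A^T A
    if lipschitz == 0.0:
        return w
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        gradient = A.T @ (A @ w - y)
        w = soft_threshold(w - step * gradient, step * lam)
    return w

def suppress_late_reverb(mag, delay=3, order=10, lam=0.1, floor=0.05):
    """Subtract a Lasso-predicted late-reverberation spectrum from `mag`.

    mag: non-negative magnitude spectrogram, shape (n_bins, n_frames).
    All parameter defaults are illustrative assumptions.
    """
    n_bins, n_frames = mag.shape
    start = delay + order - 1  # first frame with a full prediction history
    out = mag.copy()
    for b in range(n_bins):
        x = mag[b]
        # Row for frame t holds x[t-delay], ..., x[t-delay-order+1].
        A = np.stack([x[start - delay - k : n_frames - delay - k]
                      for k in range(order)], axis=1)
        w = lasso_ista(A, x[start:], lam)  # sparse prediction coefficients
        late = np.maximum(A @ w, 0.0)      # estimated late reverberation
        # Spectral subtraction with a floor to avoid negative magnitudes.
        out[b, start:] = np.maximum(x[start:] - late, floor * x[start:])
    return out
```

The `delay` keeps the most recent frames out of the predictor, so the prediction is driven mainly by the reverberant tail and not by the direct sound. The L1 penalty `lam` keeps only a few prediction coefficients non-zero.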
Keywords/Search Tags: Speech recognition, MFCC algorithm, Speech recognition rate, Sparse linear prediction, Lasso model, Reverberation suppression