
The Research Of Robust Speech Features Extraction Based On NPC And Improved MFCC

Posted on: 2012-03-30  Degree: Master  Type: Thesis
Country: China  Candidate: L Hu  Full Text: PDF
GTID: 2248330395985720  Subject: Computer Science and Technology
Abstract/Summary:
Speech recognition has achieved satisfactory results in the laboratory; however, when applied in the real world, its recognition rate often declines drastically. How to improve the robustness of speech recognition systems under different noise environments is one of the most important issues in speech recognition research. This thesis focuses on the robustness of the front-end processing in speech recognition, namely speech feature extraction. By analyzing existing features in both the time and frequency domains, and taking into account the characteristics of the human voice and of human hearing, two feature extraction methods with better robustness are proposed.

First, a new nonlinear feature extraction method is proposed that uses an accurate artificial neural network in place of the traditional linear prediction method. By applying the minimum mean squared error criterion used in linear prediction, the number of parameters to be estimated, which is very large in an artificial neural network, is greatly reduced, and higher robustness is achieved. Feature extraction experiments show that, at different noise levels, the new feature is more robust than Linear Predictive Coding (LPC) and Mel-Frequency Cepstral Coefficients (MFCC).

Second, because the discrete cosine transform (DCT), one stage of traditional MFCC feature extraction, is deficient in representing voice information, a new feature extraction method is proposed that replaces the DCT with independent component analysis (ICA), which has a strong ability to represent speech characteristics. The test results show that, at the same signal-to-noise ratio (SNR), the features extracted with the new method are more robust than those of the traditional MFCC method. Furthermore, to reduce insertion errors in speech recognition, the Relative Spectra (RASTA) filtering technique proposed by Hermansky is introduced. The test results show that, compared with traditional MFCC and with MFCC improved by the ICA transform, the new method, which combines the ICA transform with the RASTA technique, not only reduces more insertion errors but also has little impact on the word recognition rate. Compared with Perceptual Linear Prediction (PLP) and PLP improved by RASTA filtering, the new method achieves a higher word recognition rate with little decline in insertion errors.
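As a rough illustration of the two MFCC-side ideas above, the sketch below shows (a) that the DCT stage of standard MFCC is a fixed linear transform applied to one frame of log mel-band energies, so a different matrix can in principle be substituted for it, and (b) a RASTA-style band-pass filter applied to one band's log-energy trajectory over time. This is not the thesis's implementation. The function names are hypothetical, the ICA matrix itself is not shown because the abstract does not give it (it would have to be learned from training data), and the RASTA coefficients follow the commonly cited form 0.1 * (2 + z^-1 - z^-3 - 2z^-4) / (1 - 0.98 z^-1); the pole value is an assumption and varies between implementations.

```python
import math


def dct_matrix(n_ceps, n_bands):
    """Rows are DCT-II basis vectors: the fixed transform in standard MFCC."""
    return [
        [math.cos(math.pi * k * (n + 0.5) / n_bands) for n in range(n_bands)]
        for k in range(n_ceps)
    ]


def apply_transform(matrix, log_mel):
    """Project one frame of log mel-band energies onto the rows of `matrix`.

    With `dct_matrix(...)` this gives unnormalized cepstral coefficients.
    Passing a different (e.g. ICA-learned) matrix here is where the DCT
    stage would be swapped out; that matrix is assumed, not provided.
    """
    return [sum(w * x for w, x in zip(row, log_mel)) for row in matrix]


def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one band's log-energy values across frames.

    The numerator coefficients sum to zero, so a constant offset in the
    log domain (a fixed channel effect) decays away after a transient.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]
    out = []
    for t in range(len(trajectory)):
        # FIR part: weighted sum of the current and four previous inputs.
        acc = sum(b * trajectory[t - k] for k, b in enumerate(numer) if t - k >= 0)
        # IIR part: leaky integration of the previous output.
        if t > 0:
            acc += pole * out[t - 1]
        out.append(acc)
    return out
```

The filter runs along the time axis of each band separately, whereas `apply_transform` acts across bands within a single frame; a combined front end would apply both, in an order the abstract does not specify.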
Keywords/Search Tags: speech feature extraction, nonlinear prediction, BP artificial neural network, independent component analysis transform, relative spectra filtering