
The Research Of Speaker Recognition Under Noisy Environment

Posted on: 2015-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: G W Cheng
GTID: 2348330518470296
Subject: Detection Technology and Automation
Abstract/Summary:
Speech is a primary means of communication in daily life and a main source of information. It also carries information that is unique to the speaker, and making full and effective use of that information can make daily life considerably more convenient. On this basis, speaker recognition technology extracts and analyzes characteristic parameters of speech in order to identify the speaker.

First, this thesis addresses the digital acquisition of the speaker's voice and its preprocessing through quantization, pre-emphasis, framing and windowing. Endpoint detection of the acquired speech signals is then investigated so that the voiced segments can be located. To enhance the speaker's speech against background noise, discrete linear Kalman filtering is applied as an improvement, for better performance, on the traditional filtering methods based on linear prediction coefficients.

Second, to analyze and extract the features that characterize an individual speaker, several methods are used. In the time domain, the average magnitude difference function (AMDF) is used for pitch extraction, the cepstrum for extraction of the first, second and third formants, and linear prediction analysis for extraction of the linear prediction coefficients (LPC) and linear predictive cepstral coefficients (LPCC). In the frequency domain, the Mel-frequency cepstral coefficients (MFCC) and their first-order difference are extracted on the basis of human auditory characteristics. Some improvements are also made to these existing extraction algorithms.

Last, a Gaussian mixture model (GMM) built from single Gaussian density functions is created to perform speaker recognition. The initial parameters of the model are estimated with the K-means algorithm, and the initial cluster centers are determined by analysis of variance.
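The preprocessing chain named above (pre-emphasis, framing and windowing) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's Matlab code: the pre-emphasis coefficient 0.97, the Hamming window, and the frame length and hop of 256 and 128 samples are assumed values chosen for the example, since the abstract does not state them.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split into overlapping frames; samples that do not fill a frame are dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming_window(length):
    """Hamming window coefficients (window type is an assumption)."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        for i in range(length)
    ]


def preprocess(signal, frame_len=256, hop=128, alpha=0.97):
    """Pre-emphasize, frame, and window a list of samples."""
    emphasized = pre_emphasis(signal, alpha)
    window = hamming_window(frame_len)
    return [
        [sample * weight for sample, weight in zip(frame, window)]
        for frame in frame_signal(emphasized, frame_len, hop)
    ]


# Hypothetical usage with a synthetic 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
frames = preprocess(tone)
```

Each returned frame would then feed endpoint detection and feature extraction; quantization is omitted here because the samples are already digital.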
The triangle rule and the fixing of points that lie within cluster centers are adopted to accelerate the clustering, and the effect of isolated points on the clustering is also taken into account. Once the number of Gaussian mixtures has been determined, the parameters of the vector sequences are optimized with a high-variance method.

Finally, the experiments use a computer as the hardware platform and Windows 7 as the software platform. The Matlab processing comprises modules for voice acquisition, preprocessing, speech enhancement, feature parameter extraction, parameter training and recognition. The study examines how the choice of the various undetermined parameters affects the recognition rate and presents the results in graphs. These graphs show that a higher recognition rate is obtained when the test utterance lasts 5 s, MFCC, ΔMFCC and ΔLPCC are used together as identification parameters, the number of GMM mixtures is 32, and the minimum threshold value of the covariance is 0.1. The experimental results also show that the algorithm improvements made in this thesis increase the recognition rate.
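To make the recognition step concrete, the sketch below scores test feature vectors against per-speaker GMMs and picks the best-scoring speaker. It is an illustration under stated assumptions, not the thesis's implementation: it assumes diagonal covariances, interprets the minimum covariance threshold of 0.1 as a floor applied to each variance, and uses hypothetical function names and a hypothetical `models` layout of (weights, means, variances) per speaker. Training by K-means and parameter re-estimation is not shown.

```python
import math


def floor_variances(variances, min_var=0.1):
    """Clamp each variance to at least min_var (assumed meaning of the threshold)."""
    return [max(v, min_var) for v in variances]


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, weights, means, variances, min_var=0.1):
    """Average per-frame log-likelihood of the frames under one GMM."""
    if not frames:
        raise ValueError("at least one feature vector is required")
    floored = [floor_variances(v, min_var) for v in variances]
    total = 0.0
    for x in frames:
        terms = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, floored)
        ]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def identify(frames, models, min_var=0.1):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    return max(
        models,
        key=lambda name: gmm_log_likelihood(frames, *models[name], min_var),
    )


# Hypothetical two-speaker example with one mixture component each.
models = {
    "speaker_a": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "speaker_b": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
best = identify([[0.1, -0.2], [0.0, 0.3]], models)
```

With 32 mixtures per speaker, as in the reported configuration, each model would simply carry 32 weights, mean vectors and variance vectors; the scoring logic is unchanged.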
Keywords/Search Tags:Speaker Recognition, Speech Enhancement, Feature Extraction, Gaussian Mixture Model (GMM)