The Comparison And Analysis Of The Feature Extraction Algorithm Of Voiceprint Recognition System

Posted on: 2017-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2348330488463477
Subject: Signal and Information Processing
Abstract/Summary:
Voiceprint recognition, like iris, face, and fingerprint recognition, is a form of biometric recognition: each uses a person's biological characteristics for identification. Because it is economical and convenient, voiceprint recognition has broad prospects for development. Voiceprint recognition differs from speech recognition in that the former identifies the speaker from the characteristics of the voice, while the latter recognizes the content of what is said. Because voice characteristics differ from person to person, are unique, and are hard to imitate, the thesis argues that identifying a person by voiceprint can be more precise than other biometric methods. The recognition process captures a speech signal from the speaker, extracts speaker-specific characteristic parameters from it, and finally confirms the speaker's identity.
This thesis mainly discusses text-independent voiceprint recognition systems. Text-independent means that the speaker may say anything when the speech is captured, rather than a fixed speech sample. In view of the current state of research on voiceprint recognition, the thesis focuses on methods for extracting the characteristic parameters. Common vocal tract parameters include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), and Mel-frequency cepstral coefficients (MFCC). Taking the first difference of MFCC yields the MFCC_D parameter, which achieved a better recognition rate than the MFCC parameter in this research.
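As an illustration of the first-difference step, the following is a minimal sketch in Python, not the thesis's implementation. It assumes the MFCC vectors have already been computed, one list of coefficients per frame. The function names `first_difference` and `append_deltas` are hypothetical, and setting the difference of the first frame to zero is an assumption made so that the output has as many frames as the input.

```python
def first_difference(frames):
    """Return the frame-to-frame first difference of a list of feature vectors.

    frames[t] is the coefficient vector of frame t. The difference for
    frame 0 is set to zeros (an assumption) so the output keeps the same
    number of frames as the input.
    """
    if not frames:
        return []
    dim = len(frames[0])
    deltas = [[0.0] * dim]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append([c - p for c, p in zip(cur, prev)])
    return deltas


def append_deltas(frames):
    """Concatenate each static vector with its first difference."""
    return [list(f) + d for f, d in zip(frames, first_difference(frames))]


# Toy 2-coefficient "MFCC" vectors for three frames (made-up numbers).
mfcc = [[1.0, 2.0], [3.0, 5.0], [6.0, 5.0]]
print(append_deltas(mfcc))
# [[1.0, 2.0, 0.0, 0.0], [3.0, 5.0, 2.0, 3.0], [6.0, 5.0, 3.0, 0.0]]
```

Many toolkits compute dynamic features with a regression over several neighbouring frames rather than a plain difference; the plain difference is used here only because it matches the wording of the abstract.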
In studying the Mel filter bank, we found that its precision is lower at high frequencies. We therefore propose an MFCC extraction method based on a mixed Mel filter bank: combining the flipped Mel filter bank with the standard Mel filter bank gives a filter bank that is more precise at high frequencies and improves the recognition rate there. A speech signal is composed of the vocal tract response and the glottal characteristics. MFCC parameters reflect the vocal tract characteristics well, and pitch reflects the glottal characteristics well, so we propose a parameter that reflects both: pitch-based MFCC. There are many methods for extracting pitch, including the short-time autocorrelation function (ACF) and the short-time average magnitude difference function (AMDF). Building on the ACF and AMDF, we propose a weighted ACF method: the short-time AMDF is first inverted and squared, and the result is then used to weight the short-time ACF, where weighting means multiplication. Each of these calculation methods yields a pitch period. Different methods give different pitch periods, so combining each of them with the MFCC parameters gives a different feature parameter.
The thesis also studies speech pre-processing and the recognition method. Pre-processing includes sampling, quantization, pre-emphasis, windowing, framing, and endpoint detection, of which endpoint detection is especially important. Endpoint detection estimates the starting point of the speech signal and separates speech from non-speech. Effective endpoint detection not only finds the starting point quickly, which saves time, but also excludes unrelated sound, which improves recognition performance.
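The weighted ACF pitch estimate described above can be sketched as follows. This is one reading of the abstract under stated assumptions, not the thesis's code: the ACF at each lag is multiplied by the inverse square of the AMDF at that lag, and the lag with the largest weighted value is taken as the pitch period. The function names, the 60 to 500 Hz search range, and the small constant `eps` that guards against division by zero are all assumptions.

```python
import math


def short_time_acf(frame, lag):
    """Short-time autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def short_time_amdf(frame, lag):
    """Short-time average magnitude difference of one frame at the given lag."""
    count = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(count)) / count


def weighted_acf_pitch_lag(frame, sample_rate, f_min=60.0, f_max=500.0, eps=1e-8):
    """Return the lag (in samples) that maximizes ACF(lag) / (AMDF(lag)^2 + eps).

    f_min, f_max and eps are assumed values, not taken from the thesis.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Weighting is a multiplication by the inverse-squared AMDF.
        score = short_time_acf(frame, lag) / (short_time_amdf(frame, lag) ** 2 + eps)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic test frame: a 200 Hz sine sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
lag = weighted_acf_pitch_lag(frame, sample_rate)
print(lag, sample_rate / lag)
```

The AMDF has a deep valley where the ACF has a peak, so dividing by the squared AMDF sharpens the peak at the true period. Real speech frames would normally be windowed and checked for voicing before this step.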
The thesis adopts the double-threshold endpoint detection method, which combines short-time energy with the average zero-crossing rate to better estimate the starting point of the signal.
For recognition we use the Gaussian mixture model (GMM), a model used in text-independent voiceprint recognition systems. The thesis describes the estimation of the GMM parameters with the EM algorithm, the initialization of the GMM parameters with the K-means algorithm, and the GMM training process.
The thesis mainly studies methods for extracting the characteristic parameters, using the GMM to analyze and compare the different characteristic parameters experimentally. The tests show that the newly proposed extraction methods improve the recognition rate of the voiceprint recognition system to some extent. The thesis does not address voiceprint recognition in noisy environments; in future work we will study denoising in the voiceprint recognition system.
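The double-threshold endpoint detection mentioned above can be illustrated with a simplified sketch. It assumes the common three-step form of the method: frames above a high energy threshold are taken as certain speech, the segment is widened while energy stays above a low threshold, and it is widened a little further while the zero-crossing rate stays high. All function names, threshold values, and the `max_zcr_frames` limit are hypothetical; the abstract does not give the thesis's actual settings.

```python
import math


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(signal, frame_len, hop, high_energy, low_energy,
                     zcr_thresh, max_zcr_frames=10):
    """Return (first_frame, last_frame) of the detected speech, or None."""
    frames = split_frames(signal, frame_len, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]

    # Step 1: frames above the high energy threshold are certainly speech.
    strong = [i for i, e in enumerate(energy) if e > high_energy]
    if not strong:
        return None
    start, end = strong[0], strong[-1]

    # Step 2: widen the segment while energy stays above the low threshold.
    while start > 0 and energy[start - 1] > low_energy:
        start -= 1
    while end < len(frames) - 1 and energy[end + 1] > low_energy:
        end += 1

    # Step 3: widen a few more frames while the zero-crossing rate is high,
    # which helps keep weak unvoiced sounds at the edges.
    steps = 0
    while start > 0 and steps < max_zcr_frames and zcr[start - 1] > zcr_thresh:
        start, steps = start - 1, steps + 1
    steps = 0
    while (end < len(frames) - 1 and steps < max_zcr_frames
           and zcr[end + 1] > zcr_thresh):
        end, steps = end + 1, steps + 1
    return start, end


# Synthetic signal: silence, a 200 Hz tone, silence (8 kHz sampling).
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
signal = silence + tone + silence
print(detect_endpoints(signal, frame_len=200, hop=100,
                       high_energy=80.0, low_energy=10.0, zcr_thresh=0.1))
```

The returned frame indices can be converted to sample positions with `start * hop` and `end * hop + frame_len`. In practice the thresholds are usually derived from the noise level of the first few frames rather than fixed in advance.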
Keywords: voiceprint recognition, feature extraction, MFCC, pitch-based MFCC, improved MFCC