Research On Voiceprint Recognition Robust Technology And Its Application

Posted on: 2016-10-20  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Zhang
GTID: 1108330482955265  Subject: Control theory and control engineering
Abstract/Summary:
Voiceprint recognition compares a speaker's voiceprint with those enrolled in a database to verify and identify the user and to determine whether the speaker is the target person. As an economical, reliable, convenient and safe means of identification, voiceprint recognition offers security that can rival other biometric technologies (fingerprint, palm and iris). It needs no special equipment beyond a phone or a microphone, and data acquisition is convenient and low in cost, so the technology has great development prospects. However, a low recognition rate and limited real-time performance in environments with heavy background noise restrict its practical application, so the key to voiceprint recognition is improving system robustness and real-time performance.
A voiceprint recognition system mainly consists of speech signal preprocessing, endpoint detection, feature parameter extraction, and the training and matching of the voiceprint model. To improve the robustness and real-time performance of the system, the paper studies each of these techniques and algorithms in depth and further tests the ideas in two application systems. The experimental results show that the proposed algorithms are effective.
The paper is divided into six chapters. The first chapter is an overview of voiceprint recognition that describes the current research status of its implementation techniques and algorithms.
Starting from the second chapter, the main research content unfolds in five parts: chapters two, three and four study the improvement of system robustness; chapter five studies the improvement of real-time performance; and chapter six presents two application systems developed by the author, in which the techniques from the paper are applied and the robustness and real-time performance of the systems are verified.
The second chapter studies speech preprocessing and endpoint detection in background noise. Because speech and noise look visibly different in a spectrogram, the paper proposes an endpoint detection method based on the spectrogram. The technical difficulty of spectrogram-based endpoint detection is how to express this visual difference quantitatively. Based on the properties of autocorrelation, the paper chooses the autocorrelation function for this purpose; the column autocorrelation function of the spectrogram describes the difference clearly. From the distributions of the autocorrelation function for speech and noise spectra, a cut-off point separating speech from noise can be found and used as the endpoint detection threshold for noisy speech. Because the paper uses a wideband spectrogram, frequency resolution is somewhat reduced, so after row autocorrelation detection on the spectrogram the speech columns still contain residual noise. To further remove noise at different frequencies, the paper uses the multi-resolution property of empirical mode decomposition to decompose the noisy speech into different frequency scales before the column autocorrelation analysis; experiments show that the results are satisfactory.
The third chapter studies the extraction of speech feature parameters.
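As an illustration of the second chapter's idea, the following minimal Python sketch scores each spectrogram column by its autocorrelation and thresholds the result. It is not the author's algorithm: the mean removal, the normalization, the single fixed lag, the hand-picked threshold and the names `column_autocorr`, `speech_flags` and `endpoints` are all assumptions made for illustration, and the abstract derives its threshold from the autocorrelation distributions of speech and noise rather than fixing it by hand.

```python
from typing import List, Optional, Tuple


def column_autocorr(column: List[float], lag: int) -> float:
    """Normalized autocorrelation of one spectrogram column at one lag.

    Assumption: the column is mean-removed and the result is divided by the
    column's energy, so the value lies in [-1, 1]. Flat columns return 0.0.
    """
    n = len(column)
    if n == 0 or lag >= n:
        return 0.0
    mean = sum(column) / n
    dev = [x - mean for x in column]
    energy = sum(d * d for d in dev)
    if energy == 0.0:
        return 0.0
    return sum(dev[i] * dev[i + lag] for i in range(n - lag)) / energy


def speech_flags(spectrogram: List[List[float]], lag: int,
                 threshold: float) -> List[bool]:
    """Mark a column as speech when its autocorrelation exceeds the threshold."""
    return [column_autocorr(col, lag) > threshold for col in spectrogram]


def endpoints(flags: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (first, last) speech column index, or None if nothing was flagged."""
    hits = [i for i, flag in enumerate(flags) if flag]
    return (hits[0], hits[-1]) if hits else None


# Toy columns: a regularly structured spectrum versus an irregular one.
structured = [1.0, 0.0, 0.0] * 4
irregular = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
toy_spectrogram = [irregular, structured, structured, irregular]
toy_flags = speech_flags(toy_spectrogram, lag=3, threshold=0.5)
toy_endpoints = endpoints(toy_flags)
```

In this toy input only the two regularly structured columns exceed the threshold, so the detected segment spans those columns. A real system would first compute the spectrogram from framed audio and would likely examine several lags.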
For a voiceprint recognition system, the ideal speech feature parameters reflect the characteristics of the speaker rather than semantic information, and have a small data volume. Experiments show that speech is an excitation signal from the sound source that resonates in the vocal tract and is radiated from the mouth and nose. Speaker identity information is therefore reflected in the characteristics of both the glottis and the vocal tract, so the paper proposes combining vocal tract and glottal characteristics to distinguish speakers well. After a comparative analysis of common vocal tract and glottal features, the Mel-frequency cepstral coefficients (MFCC) were selected to represent the vocal tract and the pitch to represent the glottis, and the two were combined as follows: the center frequency of each filter in the MFCC triangular Mel filter bank is no longer fixed but is set according to the pitch frequency at the corresponding point in the actual frequency domain, and the number of filters is also dynamic. This feature is called the MFCC feature parameter based on the pitch period. To further improve the recognition rate, Delta features are introduced to capture the time-varying relationship between speech frames, and the MFCC feature parameters are extended with them. Although the extended feature parameters are more expressive, they increase the subsequent computation time, so the paper proposes a dimension reduction algorithm based on mapping block combination. Experiments show that these feature parameters and processing methods help improve the robustness of the system.
The fourth chapter studies the voiceprint recognition model.
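The Delta extension mentioned for the third chapter can be illustrated with a short Python sketch. The abstract does not give the formula, so this assumes the commonly used regression form d_t = sum_n n (c_{t+n} - c_{t-n}) / (2 sum_n n^2) with edge frames repeated at the boundaries; the function names `delta` and `extend_with_delta` and the default `width=2` are hypothetical.

```python
from typing import List


def delta(frames: List[List[float]], width: int = 2) -> List[List[float]]:
    """Regression-based Delta coefficients for a sequence of feature frames.

    Assumption: indices past either end are clamped, which repeats the first
    or last frame instead of padding with zeros.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def extend_with_delta(frames: List[List[float]], width: int = 2) -> List[List[float]]:
    """Append each frame's Delta coefficients to its static coefficients."""
    return [static + dyn for static, dyn in zip(frames, delta(frames, width))]


# A one-dimensional feature that rises by 1.0 per frame.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_delta = delta(ramp)
ramp_extended = extend_with_delta(ramp)
```

For the linear ramp, the interior frame gets a Delta of 1.0, matching the slope, while the clamped edges give smaller values. Appending the Deltas doubles the feature dimension, which is the growth in computation that motivates a dimension reduction step.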
For the voiceprint recognition system described in the paper, the Hidden Markov Model (HMM) is mainly studied, including solutions to the problems that arise in its implementation and an analysis of system robustness. For text-independent voiceprint recognition, the Gaussian Mixture Model (GMM) is mainly studied and improved in both the training and recognition stages. In the training stage, a k-means algorithm based on adjacent rules is presented for choosing the GMM initial values, which overcomes the tendency of the traditional method to focus too much on a few indicators; by simplifying the derivation of the expectation-maximization (EM) algorithm and adding a correction coefficient, the training speed and recognition rate of the system are improved. In the recognition stage, to keep bad frames from influencing the decision, an entropy-based weighted frame scoring algorithm is proposed, which improves the robustness of the system.
The fifth chapter studies methods for improving the efficiency of the voiceprint recognition system. Based on the idea of model clustering, a fast GMM recognition method based on model growth clustering is proposed, together with a fast HMM recognition method based on statistical grouping of feature parameters. The growth clustering algorithm grows multiple classes from one initial class to cluster the speaker models; its core components are a symmetric grouping strategy based on the concept of a density set, a similarity criterion based on relative entropy, and the class GMM.
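To give a feel for entropy-based frame weighting in the recognition stage, here is a hedged Python sketch. The abstract does not specify the weighting rule, so the scheme here is an assumption: each frame's log-likelihoods under the enrolled models are turned into a posterior, the posterior's entropy measures how ambiguous the frame is, and ambiguous frames get weights near zero. The names `frame_weights` and `weighted_scores` are hypothetical, and the per-frame log-likelihoods are taken as given rather than computed from a GMM.

```python
import math
from typing import List


def frame_weights(loglik: List[List[float]]) -> List[float]:
    """Weight each frame by 1 - H / log(S), where H is the entropy of the
    frame's posterior over the S speaker models (assumed scheme)."""
    weights = []
    for row in loglik:
        if len(row) < 2:
            raise ValueError("need at least two speaker models per frame")
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]  # stable softmax
        total = sum(exps)
        post = [e / total for e in exps]
        entropy = -sum(p * math.log(p) for p in post if p > 0.0)
        weights.append(max(0.0, 1.0 - entropy / math.log(len(row))))
    return weights


def weighted_scores(loglik: List[List[float]]) -> List[float]:
    """Weighted average log-likelihood per speaker model."""
    weights = frame_weights(loglik)
    total = sum(weights)
    if total <= 1e-12:
        # Every frame is maximally ambiguous: fall back to a plain average.
        weights = [1.0] * len(loglik)
        total = float(len(loglik))
    n_speakers = len(loglik[0])
    return [
        sum(weights[t] * loglik[t][s] for t in range(len(loglik))) / total
        for s in range(n_speakers)
    ]


# Rows are frames, columns are two speaker models; the middle frame is ambiguous.
toy_loglik = [[-1.0, -9.0], [-5.0, -5.0], [-2.0, -8.0]]
toy_weights = frame_weights(toy_loglik)
toy_scores = weighted_scores(toy_loglik)
```

In the toy matrix the middle frame fits both models equally well, so it receives a weight close to zero and barely affects the final scores. Other weighting rules are possible, and the one the paper uses may differ.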
For the voiceprint recognition system implemented with the HMM, given the structural differences between the HMM and the GMM, the strategy is to cluster and group the feature parameter sequences and then to place the HMMs trained from speech feature parameters in the same group into one group. This groups the model library while avoiding the difficulty of clustering the models directly, which arises from the HMM structure. The core algorithms are the k-means algorithm based on adjacent rules, a secondary smooth grouping algorithm, a similarity criterion based on dynamic time warping (DTW), and the class choice of feature parameters.
In the sixth chapter, two voiceprint recognition application systems are presented: a mobile terminal voiceprint sign-in system based on the HMM and a mobile phone voiceprint lock system based on the GMM. Both systems use the real-time and robustness techniques and algorithms described in the preceding chapters. The development of the systems is also described, and the robustness and real-time performance of the two systems were tested separately; the results confirm the validity of the techniques and algorithms studied.
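The DTW-based similarity criterion named for the fifth chapter can be illustrated with the textbook dynamic programming recurrence. This Python sketch is not the paper's grouping algorithm: the Euclidean local distance, the unconstrained warping path and the helper `closest_group`, which assigns a sequence to the nearest reference sequence, are assumptions for illustration.

```python
import math
from typing import List, Sequence


def dtw_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Classic DTW cost between two feature sequences of possibly different length."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between frame i of a and frame j of b.
            local = math.sqrt(sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])))
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def closest_group(seq: Sequence[Sequence[float]],
                  references: List[Sequence[Sequence[float]]]) -> int:
    """Index of the reference sequence with the smallest DTW cost (hypothetical helper)."""
    costs = [dtw_distance(seq, ref) for ref in references]
    return costs.index(min(costs))


# Sequence b repeats one frame of a, so warping aligns them at zero cost.
seq_a = [[0.0], [1.0], [2.0]]
seq_b = [[0.0], [1.0], [1.0], [2.0]]
seq_c = [[5.0], [6.0], [7.0]]
cost_ab = dtw_distance(seq_a, seq_b)
cost_ac = dtw_distance(seq_a, seq_c)
group_of_b = closest_group(seq_b, [seq_c, seq_a])
```

Because DTW tolerates differences in speaking rate, it can compare feature sequences of unequal length, which suits grouping utterances before the corresponding HMMs are searched.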
Keywords/Search Tags: voiceprint recognition, multi-resolution spectrum, characteristic parameter optimization, characteristic parameter dimension reduction, GMM model, HMM model, clustering growth, density set, characteristic parameter grouping, robustness