
Research On Feature Transformation And Robust Technology With Speaker Identification

Posted on: 2009-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L M Xu
GTID: 1118360245979309
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
This dissertation studies a transformation-based Gaussian mixture model, a weighted features compensation transformation, and adaptive histogram equalization, with the aim of improving the performance of speaker identification and its robustness in practical application environments. The main contributions are:

1. A multi-step clustering algorithm for a transformation-based, diagonal-covariance Gaussian mixture model (GMM) is proposed. To simplify computation, Gaussian mixture density functions usually use diagonal covariance matrices. However, this also lowers the likelihood of the data, which can in turn affect the classification decision. The multi-step clustering algorithm is proposed to compensate for this loss of likelihood. In this algorithm, an embedded linear transformation integrates the transformation and the diagonal-covariance Gaussian mixture into a unified framework, and a multi-step clustering algorithm is integrated into the GMM estimation process to search for the appropriate number of mixtures. The estimation frequency is markedly reduced. Compared with the traditional clustering expectation-maximization (EM) algorithm, the proposed method saves 50% of the time and lowers the error rate by 1.4% on average on the same database. Compared with the transformation-embedded GMM, experiments on two databases indicate that the improved method reaches the saturation point directly with the appropriate number of mixtures.

2. A weighted features compensation transformation method based on GMM is presented for robust speaker verification. In this method, the feature scores are weighted by frame SNR, while the frame likelihood probabilities are transformed according to the acoustic characteristics of the speaker recognition system. In stationary and non-stationary noise environments at different SNRs, the proposed method raises the average recognition rate by 2.74% and 2.82% over the features weighted algorithm, and by 3.56% and 1.34% over the normalized compensation transformation method, on the same database. On another open database, the gains are 3.02% and 2.56% over the features weighted algorithm, and 3.9% and 1.14% over the normalized compensation transformation method.

3. Based on the statistical characteristics of speaker features and the particular requirements of histogram equalization in speaker recognition, an adaptive histogram equalization (AHEQ) method for speaker recognition is presented. In this method, the cumulative histogram function is first built over wide intervals; then, according to the increment in the frequency of feature values within each interval, the method decides whether the interval needs further subdivision and to what level. This approach not only reduces the amount of computation but also brings the transformed feature values more in line with the actual distribution of the feature space, making it possible to further improve the recognition rate and robustness of the speaker identification system in noisy environments. On the same database, with two classic noise types (white and babble), AHEQ raises the average recognition rate by 3% and 2.9% compared with ordinary histogram equalization. In another comparison test, the adaptive histogram equalization method shows a similar improvement.
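For orientation, the ordinary histogram equalization that AHEQ is compared against can be sketched roughly as follows. This is a minimal illustration, not the dissertation's method: the function name `histogram_equalize`, the fixed number of equal-width bins, and the standard normal reference distribution are all assumptions made here, and the adaptive interval subdivision that defines AHEQ is not reproduced because the abstract does not specify it.

```python
from statistics import NormalDist


def histogram_equalize(values, n_bins=32):
    """Map one feature dimension onto a standard normal reference.

    Hypothetical helper: fixed equal-width bins and a Gaussian reference
    are assumptions for illustration, not details taken from the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no distribution to equalize.
        return [0.0 for _ in values]

    width = (hi - lo) / n_bins

    def bin_index(v):
        # Clamp so that the maximum value falls into the last bin.
        return min(int((v - lo) / width), n_bins - 1)

    # Histogram of the observed feature values.
    counts = [0] * n_bins
    for v in values:
        counts[bin_index(v)] += 1

    # Cumulative histogram evaluated at each bin's midpoint. The half-count
    # offset keeps every occupied bin strictly inside (0, 1).
    total = len(values)
    cdf = []
    running = 0
    for c in counts:
        cdf.append((running + 0.5 * c) / total)
        running += c

    # Push each value through the inverse CDF of the reference distribution.
    reference = NormalDist()
    return [reference.inv_cdf(cdf[bin_index(v)]) for v in values]
```

Because the mapping goes through a cumulative histogram, it is monotonic: the ordering of the feature values is preserved while their distribution is pulled toward the reference. With fixed bins, the resolution of that mapping is the same everywhere, which is the limitation an adaptive subdivision scheme would address.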
Keywords/Search Tags: Speaker identification, Feature transformation, Multi-step clustering, Weighted features compensation transformation, Adaptive histogram equalization, Noise robustness