Reducing Computation in Speaker Recognition Systems Using a Tree-Structured Universal Background Model

Posted on: 2015-07-23
Degree: Ph.D.
Type: Dissertation
University: New Mexico State University
Candidate: McClanahan, Richard Daniel
Full Text: PDF
GTID: 1478390017993276
Subject: Electrical engineering

Abstract/Summary:

State-of-the-art speaker recognition systems use speaker models derived from an adapted universal background model (UBM) in the form of a Gaussian mixture model (GMM). This is true of GMM-supervector systems, joint factor analysis systems, and, most recently, i-vector systems. In all of these systems, the calculation of posterior probabilities and of the sufficient statistics for the weight, mean, and covariance parameters is a computational bottleneck in both enrollment and testing. In this dissertation, we develop a method that uses a lower-resolution GMM hash, built from clusters of GMM-UBM mixture component densities, to reduce the computational load. In the adaptation step, we score the feature vectors against the hash, then calculate the a posteriori probabilities and update the statistics only for the mixture components belonging to the appropriate clusters.

Each cluster is a grouping of multivariate normal distributions and is modeled by a single multivariate normal distribution. The set of distributions representing the clusters therefore itself forms a GMM, referred to as the hash GMM, which can be considered a lower-resolution representation of the GMM-UBM. The mapping that associates each component of the hash GMM with components of the original GMM-UBM is referred to as a shortlist.

This research investigates various methods of clustering the components of the GMM-UBM to form hash GMMs.
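The shortlist-based adaptation step described above can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes diagonal-covariance components, and all names (`shortlisted_posteriors`, `hash_gmm`, `shortlist`, the dictionary fields) are hypothetical.

```python
import numpy as np

def log_gauss(x, means, log_dets, inv_vars):
    # Diagonal-covariance Gaussian log-densities for one frame x,
    # evaluated against all components at once (vectorized).
    d = means.shape[1]
    diff = x - means
    return -0.5 * (d * np.log(2 * np.pi) + log_dets
                   + np.sum(diff * diff * inv_vars, axis=1))

def shortlisted_posteriors(x, hash_gmm, ubm, shortlist, top_c=1):
    """Score frame x against the small hash GMM, keep the top_c
    winning hash components, and evaluate only the UBM components
    that the shortlist maps to them (illustrative sketch)."""
    h_ll = np.log(hash_gmm["w"]) + log_gauss(
        x, hash_gmm["mu"], hash_gmm["log_det"], hash_gmm["inv_var"])
    best = np.argsort(h_ll)[-top_c:]                 # winning clusters
    idx = np.unique(np.concatenate([shortlist[b] for b in best]))
    u_ll = np.log(ubm["w"][idx]) + log_gauss(
        x, ubm["mu"][idx], ubm["log_det"][idx], ubm["inv_var"][idx])
    post = np.exp(u_ll - u_ll.max())
    post /= post.sum()                               # normalize over the shortlist only
    return idx, post
```

The saving comes from scoring each frame against the few hash components plus one cluster's worth of UBM components, rather than against every component of the full GMM-UBM.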
Of the five methods presented, one, Gaussian mixture reduction, outperforms the others in reducing computation while preserving recognition accuracy. This method iteratively reduces the size of a GMM by successively merging pairs of component densities according to a metric based on the Kullback-Leibler divergence.

Evaluated with a Gaussian mean-supervector SVM system and a single-layer hash, our method achieves a factor of 2.77 reduction in a posteriori probability calculations with no loss in recognition accuracy when using a 250-component GMM-UBM. With a 1024-component UBM, clustering yields a 5x reduction in computation with no loss in accuracy, and a 10x reduction with less than 2.4% relative degradation in equal error rate (EER).

This hash system is then extended with a tree-structured GMM-UBM that applies Runnalls' Gaussian mixture reduction technique at multiple hierarchical layers to further reduce the number of probabilistic alignment calculations. With this tree-structured hash, computation is reduced by a factor of 14x with less than 5% relative EER degradation in a state-of-the-art i-vector system.

Keywords/Search Tags: Systems, Recognition, Speaker, GMM, Computation, Tree-structured, Using, Hash
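The Gaussian mixture reduction the abstract describes, repeatedly merging the pair of components whose merge is cheapest under a KL-based cost, can be sketched as follows. This is a simplified diagonal-covariance version using Runnalls' moment-preserving merge and log-determinant cost bound; the function names and the brute-force pair search are illustrative, not the dissertation's code.

```python
import numpy as np

def merge_pair(w1, mu1, var1, w2, mu2, var2):
    """Moment-preserving merge of two weighted diagonal Gaussians:
    the result preserves the pair's total weight, mean, and covariance."""
    w = w1 + w2
    a, b = w1 / w, w2 / w
    mu = a * mu1 + b * mu2
    diff = mu1 - mu2
    var = a * var1 + b * var2 + a * b * diff * diff
    return w, mu, var

def runnalls_cost(w1, var1, w2, var2, var_m):
    # Runnalls' upper bound on the KL discrepancy introduced by a merge;
    # in the diagonal case the log-determinant is the sum of log variances.
    return 0.5 * ((w1 + w2) * np.sum(np.log(var_m))
                  - w1 * np.sum(np.log(var1))
                  - w2 * np.sum(np.log(var2)))

def reduce_gmm(weights, means, variances, target):
    """Greedily merge the cheapest pair until `target` components remain.
    The O(M^3) pair search is fine for a sketch, not for production."""
    comps = list(zip(weights, means, variances))
    while len(comps) > target:
        best = None
        for i in range(len(comps)):
            for j in range(i + 1, len(comps)):
                w, mu, var = merge_pair(*comps[i], *comps[j])
                c = runnalls_cost(comps[i][0], comps[i][2],
                                  comps[j][0], comps[j][2], var)
                if best is None or c < best[0]:
                    best = (c, i, j, (w, mu, var))
        _, i, j, merged = best
        comps = [c for k, c in enumerate(comps) if k not in (i, j)]
        comps.append(merged)
    return comps
```

Applying such a reduction repeatedly, at successively coarser resolutions, is the idea behind the multi-layer tree-structured hash: each layer is a reduced GMM whose components shortlist the components of the finer layer below.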