A nonlinear mixture autoregressive model for speaker verification

Posted on:2012-02-18

Degree:Ph.D.

Type:Dissertation

University:Mississippi State University

Candidate:Srinivasan, Sundararajan

GTID:1458390011956213

Subject:Statistics

Abstract/Summary:

In this work, we apply a nonlinear mixture autoregressive (MixAR) model to supplant the Gaussian mixture model (GMM) for speaker verification. MixAR is a statistical model formed as a probabilistically weighted combination of components, each of which consists of an autoregressive filter in addition to a mean. The probabilistic mixing and the data-dependent weights are responsible for the nonlinear nature of the model. Our experiments with synthetic data as well as real speech data from standard speech corpora show that the MixAR model outperforms the GMM, especially under unseen noisy conditions. Moreover, MixAR did not require delta features and used 2.5x fewer parameters to achieve performance comparable to or better than that of a GMM using both static and delta features. MixAR also suffered less from over-fitting than the GMM when training data was sparse. However, MixAR performance deteriorated more quickly than that of the GMM as the duration of the evaluation data was reduced. This could impose a minimum on the amount of evaluation data required when using the MixAR model for speaker verification.
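The model structure described in the abstract can be illustrated with a minimal sketch for a scalar time series. The abstract does not give the functional form of the data-dependent weights, so the softmax gate over past samples used here is an assumption, as are the function names `log_sum_exp` and `mixar_log_likelihood` and all parameter names; this is not the dissertation's implementation.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def mixar_log_likelihood(x, means, ar_coefs, sigmas, gate_w, gate_b):
    """Log-likelihood of a scalar sequence under an illustrative MixAR model.

    Each component i predicts x[n] as
        means[i] + sum_j ar_coefs[i][j-1] * x[n-j]
    with Gaussian residual standard deviation sigmas[i].
    ASSUMPTION: the data-dependent mixing weights are a softmax of
        gate_b[i] + sum_j gate_w[i][j-1] * x[n-j].
    """
    order = len(ar_coefs[0])
    total = 0.0
    for n in range(order, len(x)):
        # Past samples, most recent first: x[n-1], ..., x[n-order].
        past = [x[n - j] for j in range(1, order + 1)]
        # Gate logits depend on the past samples (assumed form).
        logits = [
            b + sum(w * s for w, s in zip(ws, past))
            for ws, b in zip(gate_w, gate_b)
        ]
        log_norm = log_sum_exp(logits)
        terms = []
        for mean, coefs, sigma, logit in zip(means, ar_coefs, sigmas, logits):
            # Component prediction: mean plus autoregressive filter output.
            pred = mean + sum(a * s for a, s in zip(coefs, past))
            z = (x[n] - pred) / sigma
            log_density = -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * z * z
            # log(weight_i) + log(Gaussian density of the residual).
            terms.append(logit - log_norm + log_density)
        total += log_sum_exp(terms)
    return total


# Hypothetical two-component, first-order example with made-up parameters.
example = mixar_log_likelihood(
    x=[0.5, -0.2, 0.1, 0.4],
    means=[0.0, 0.3],
    ar_coefs=[[0.6], [-0.4]],
    sigmas=[1.0, 0.5],
    gate_w=[[0.3], [-0.7]],
    gate_b=[0.1, 0.2],
)
```

With all autoregressive and gate coefficients set to zero, the sketch reduces to an ordinary Gaussian mixture with fixed weights, which is one way to see MixAR as a generalization of the GMM.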

Keywords/Search Tags:

Model for speaker verification, Nonlinear mixture autoregressive, MixAR model, Evaluation data

Related items

1	Research On Text-Independent Speaker Verification System
2	Speaker Verification Based On Sorted GMM
3	Speaker Verification Based On Limited Speech Data
4	Speaker Verification In Multi-Channel Condition
5	Research On Channel And Duration Mismatch Compensation For Speaker Verification
6	Speaker Recognition Based On Adaptive Gaussian Mixture Model
7	Studies On Speaker Recognition Based On SVM And GMM
8	Study On Speaker Verification Technology Related To Text And Applications
9	Text-Dependent Speaker Verification System
10	Research Of Speaker Verification In The Channel Mismatch Conditions