Research On Robustness Of Speaker Verification Under Complicated Environments

Posted on: 2016-11-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Wang
GTID: 1108330503456257
Subject: Computer Science and Technology
Abstract/Summary:
This dissertation addresses the robustness of speaker verification in complex environments. The major contributions are:

1. Proposed a channel-robust feature based on the F-ratio criterion, and studied its generalization capability in the i-vector framework. The new feature reweights Fbank channels according to their discriminative information, as measured by the F-ratio criterion. The discrepancy between the supervised learning of the F-ratio parameters and the unsupervised learning of the i-vector model was analyzed, and an LDA transform was proposed to recover the discriminative potential of the F-ratio approach. The method was compared across recognition frameworks (GMM-UBM and i-vector) and across databases. Experimental results on the NIST SRE08 core test showed that the new feature outperformed the baseline MFCC feature by 12.2%.

2. Proposed a discriminative training approach based on deep neural networks (DNN) to improve i-vector-based speaker recognition. This approach casts speaker verification as a binary classification problem in which a pair of i-vectors is classified as spoken either by the same speaker or by different speakers. A DNN was employed to perform the classification, with the dimension-wise distances between the two i-vectors used as the discriminative features. Experimental results on the NIST SRE08 core test showed that the DNN-based method, when combined with PLDA scores, outperformed the baseline PLDA-based approach by 11.8%.

3. Proposed a sequential GMM-UBM adaptation approach based on MAP and feature MAP linear regression (fMAPLR). This method addresses the serious performance degradation caused by time-varying acoustic channels. With this method, the UBM and speaker models are continuously adjusted to learn the changed speaker and channel information. In addition, a new feature-space sequential adaptation approach based on fMAPLR was proposed to update features sequentially. Experiments on the CSLT-Chronos database demonstrated that the proposed approach yields significant EER reductions of 25.0% and 39.0%, respectively, in two mismatched conditions.
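The Fbank reweighting in the first contribution can be illustrated with a minimal sketch. It assumes the common definition of the F-ratio (variance of the per-speaker means divided by the average within-speaker variance, computed per channel). The function name `f_ratio_weights`, the normalization of the weights to sum to 1, the multiplicative application of the weights, and the toy data are all assumptions made for illustration; the dissertation's actual procedure may differ.

```python
import random
from statistics import fmean, pvariance


def f_ratio_weights(frames, speaker_ids):
    """Return one weight per Fbank channel, proportional to its F-ratio.

    frames      : list of frames, each a list of per-channel (log) energies
    speaker_ids : list of speaker labels, one per frame
    The sum-to-one normalization is an illustrative assumption.
    """
    n_channels = len(frames[0])

    # Group frames by speaker label.
    by_speaker = {}
    for frame, spk in zip(frames, speaker_ids):
        by_speaker.setdefault(spk, []).append(frame)

    ratios = []
    for k in range(n_channels):
        per_speaker = [[f[k] for f in fs] for fs in by_speaker.values()]
        # Between-speaker variance: spread of the per-speaker means.
        between = pvariance([fmean(v) for v in per_speaker])
        # Within-speaker variance: average spread inside each speaker.
        within = fmean([pvariance(v) for v in per_speaker])
        ratios.append(between / (within + 1e-12))

    total = sum(ratios)
    return [r / total for r in ratios]


# Toy data (hypothetical): 4 speakers, 200 frames each, 3 channels.
# Only channel 0 carries speaker-dependent information.
rng = random.Random(0)
frames, speaker_ids = [], []
for spk in range(4):
    for _ in range(200):
        frame = [rng.gauss(0.0, 1.0) for _ in range(3)]
        frame[0] += 3.0 * spk  # speaker-dependent shift on channel 0
        frames.append(frame)
        speaker_ids.append(spk)

weights = f_ratio_weights(frames, speaker_ids)

# Apply the weights channel by channel (one possible way to reweight).
weighted = [[w * x for w, x in zip(weights, frame)] for frame in frames]
```

In this toy setup the weight for channel 0 dominates, because it is the only channel whose mean differs between speakers. Note that the F-ratio needs speaker labels, which is the supervised side of the discrepancy with unsupervised i-vector training mentioned in the abstract.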
Keywords/Search Tags:Speaker verification, Robustness, Fbank weighting, Discriminative training, Sequential adaptation