
A Study of Speaker Recognition Based on Mixed Features and the Gaussian Mixture Model

Posted on: 2010-11-28    Degree: Master    Type: Thesis
Country: China    Candidate: Y Liao
GTID: 2208330332976818    Subject: System theory
Abstract/Summary:
Speech is the most direct and convenient means of communication between people, and natural spoken communication between people and computers has long been a goal that researchers have worked hard to reach. Speaker recognition is an important branch of speech signal research, with both theoretical significance and broad application prospects. In essence, a speaker recognition system consists of two parts: feature extraction and the recognition model. Most open problems in speaker recognition can therefore be traced, in some sense, to the limitations of one of these two parts. A large body of research indicates that the current difficulties come mainly from feature extraction. Two questions remain pressing in the field: how to find new speech features that capture a speaker's individual characteristics more expressively and more robustly, and how to improve existing systems by optimizing existing features through selection, fusion, compensation and similar methods.

This thesis studies a speaker recognition system based on the Gaussian Mixture Model (GMM) and on mixed features constructed with the Fisher rule. The main work is as follows:

(1) We introduce the related concepts, basic principles and general steps of a speaker recognition system.

(2) In a traditional speaker recognition system, features are usually extracted directly from the preprocessed voice fragments, which is a flaw. Voice fragments generally contain silent sections. If these sections are not removed before feature extraction, their features end up in the extracted feature set, interfere with speaker recognition and lower the recognition accuracy. To solve this problem, we add an endpoint detection step between preprocessing and feature extraction. Endpoint detection removes the silent sections of a voice segment, which both reduces computation time and excludes the interference of the silent sections, and so improves recognition accuracy. We describe in detail the endpoint detection process, several common feature extraction methods, and their MATLAB implementations. After feature extraction, traditional systems use either a single kind of feature or a simple combination of several features. Such a feature set may contain redundant information, or information that interferes with recognition, and this degrades recognition performance. We therefore apply the Fisher rule to select feature parameters from the combination of LPCC, MFCC and their delta features. This keeps the parameters with the best class separability while discarding redundant or interfering ones, which reduces dimensionality and improves recognition performance. We also give the MATLAB implementation of this selection.

(3) We describe in detail the basic concepts, the EM algorithm, parameter initialization and the decision criteria of the GMM-based speaker recognition system, together with their MATLAB implementations.

(4) To obtain the best performance from the speaker recognition system, we first determine, through experiments and analysis, the best basic settings of the system: the orders of LPCC and MFCC, the order of the GMM, and the length of the test utterance. We then study how feature selection with the Fisher rule, applied either to a single kind of feature or to a combination of several, affects recognition performance. The experiments show that selecting feature parameters with the Fisher rule improves recognition performance, and they identify the combination of feature parameters that gives the best recognition performance.
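
The abstract does not give the exact form of the Fisher rule used in the thesis, and the thesis itself was implemented in MATLAB. As a rough illustration only, the sketch below uses one common form of the Fisher ratio in Python: for each feature dimension, the variance of the per-speaker means divided by the average within-speaker variance. The function names, the toy data and the choice to keep a fixed number of dimensions are all assumptions made for this example.

```python
from statistics import fmean, pvariance


def fisher_ratios(features_by_speaker):
    """Return one Fisher ratio per feature dimension.

    features_by_speaker maps a speaker label to a list of feature vectors,
    each vector being a list of floats of equal length (for example LPCC,
    MFCC and delta coefficients concatenated frame by frame).
    """
    speakers = list(features_by_speaker.values())
    n_dims = len(speakers[0][0])
    ratios = []
    for k in range(n_dims):
        # Values of dimension k, grouped per speaker.
        columns = [[vec[k] for vec in vectors] for vectors in speakers]
        class_means = [fmean(col) for col in columns]
        # Between-class scatter: spread of the per-speaker means.
        between = pvariance(class_means)
        # Within-class scatter: average variance inside each speaker.
        within = fmean([pvariance(col) for col in columns])
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


def select_top_dims(ratios, n_keep):
    """Indices of the n_keep dimensions with the largest Fisher ratio."""
    ranked = sorted(range(len(ratios)), key=lambda k: ratios[k], reverse=True)
    return sorted(ranked[:n_keep])


# Toy data, invented for illustration: dimension 0 separates the two
# speakers, dimension 1 does not.
toy = {
    "spk_a": [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0]],
    "spk_b": [[3.0, 4.0], [3.2, 5.0], [2.8, 3.0]],
}
ratios = fisher_ratios(toy)
selected = select_top_dims(ratios, 1)
```

On the toy data, dimension 0 receives the larger ratio and is the one kept. In a system like the one described above, the selected dimensions would then be the only ones passed to GMM training and scoring.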
Keywords/Search Tags:speaker recognition, LPCC, MFCC, GMM, Fisher rule