Research On Feature Extraction Algorithm In Speaker Recognition

Posted on:2017-08-05

Degree:Master

Type:Thesis

Country:China

Candidate:W J Huang

Full Text:PDF

GTID:2358330512467948

Subject:Signal and Information Processing

Abstract/Summary:

Speaker recognition discriminates between speakers by exploiting differences in their voice features, and draws on physiology, acoustics, and phonetics. Compared with iris and fingerprint recognition, it is simpler and more convenient. The key problem in a speaker recognition system is extracting the speaker's feature parameters accurately. This thesis uses the Mel Frequency Cepstral Coefficient (MFCC), which is derived from the human auditory mechanism and reflects how the ear actually perceives sound. For the recognition model, the support vector machine (SVM) was selected because of its strong advantages in small-sample and nonlinear pattern recognition problems. Following the MFCC extraction process and the nonlinear character of speech, the thesis investigates the following:

(1) Because the window function used in the Mel filter bank may affect recognition performance, the triangular, Hanning, and Hamming windows were each used to design the Mel filter bank. Simulation experiments showed that designing the filter bank with the Hamming window gives better recognition performance.

(2) Compared with the Fourier transform, wavelet analysis has a clear advantage in processing nonlinear and non-stationary signals. Based on the principle of the wavelet transform and on the correspondence between the nodes of the wavelet packet decomposition tree and the frequency bands of the signal, a new feature parameter, the wavelet packet transform coefficient (WPTC), was obtained. Simulation experiments indicated that WPTC gives much better recognition performance than MFCC.

(3) The traditional MFCC feature parameter does not reflect the nonlinear characteristics of the speech signal. Empirical mode decomposition (EMD) was used to isolate the high-frequency part of the speech signal, and the fractal dimension (FD) was used to express the nonlinear characteristics of that high-frequency information, yielding the feature parameter EMD-FD. Fusing the traditional MFCC with EMD-FD forms a higher-dimensional feature space. Simulation experiments indicated that the fused feature, which incorporates the nonlinear information, improved the average recognition rate by about 2% compared with MFCC alone.
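
The abstract does not say how the alternative windows were applied to the Mel filter bank, so the following is only a minimal sketch of one plausible reading: each filter keeps its usual Mel-spaced left, center, and right edges, and the linear slopes of the triangular filter are replaced by the rising half of a Hanning or Hamming window. The function name `mel_filter_bank`, the `SHAPES` table, and the defaults (24 filters, 512-point FFT, 16 kHz sampling rate) are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to the Mel scale (common 2595*log10 form)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


# Rising half of each window as a function of u in [0, 1].
# u = 0 is the filter edge, u = 1 is the filter center.
# Assumption: this is how the window shapes the filter slopes.
SHAPES = {
    "triangle": lambda u: u,
    "hanning": lambda u: 0.5 - 0.5 * np.cos(np.pi * u),
    "hamming": lambda u: 0.54 - 0.46 * np.cos(np.pi * u),
}


def mel_filter_bank(n_filters=24, n_fft=512, sample_rate=16000, shape="triangle"):
    """Return an (n_filters, n_fft // 2 + 1) matrix of Mel filter weights."""
    rise = SHAPES[shape]
    # Center frequency of every FFT bin from 0 Hz to the Nyquist frequency.
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    edges = mel_to_hz(mel_edges)

    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        up = (freqs >= left) & (freqs <= center)
        down = (freqs > center) & (freqs <= right)
        # Rising slope from left edge to center, mirrored falling slope after it.
        bank[i, up] = rise((freqs[up] - left) / (center - left))
        bank[i, down] = rise((right - freqs[down]) / (right - center))
    return bank


# Filter-bank energies for one frame's power spectrum would then be
# bank @ power_spectrum, followed by the usual log and DCT steps of MFCC.
hamming_bank = mel_filter_bank(shape="hamming")
```

Under this assumption the Hamming-shaped filters never fall below 0.08 inside their support, so each filter passes slightly more energy near its edges than the triangular one. Whether that accounts for the performance difference reported in the abstract is not something this sketch can show.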

Keywords/Search Tags:

speaker recognition, Mel Frequency Cepstral Coefficient, wavelet packet transform, Empirical Mode Decomposition, Fractal Dimension

Related items

1	Research On Some Key Issues Of Speaker Recognition In Noisy Environment
2	Empirical Mode Decomposition Theory Of Ship Radiated Noise Line Spectrum Analysis
3	Speaker Recognition Based On Wavelet Packet And The Theory Of Chaos
4	Hilbert-Huang Transform And Wavelet Methods For Time-frequency Analysis
5	Speaker Recognition Research In Noisy Environment
6	Research On Empirical Mode Decomposition Algorithm And Its Application In Electromagnetic Imaging
7	Research On Theoretical Algorithm Of Empirical Wavelet Transform And Its Application In Speech Signal Processing
8	The Research Of Speaker Recognition Under Noisy Environment
9	Design And Implementation Of A Speaker Recognition System
10	Study On Time-Frequency Analysis Method Based On Empirical Mode Decomposition