Research On MFCC Characteristic Parameters And Kernel Function Selection Based On Support Vector Machine

Posted on:2016-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:H J Zhao

Full Text:PDF

GTID:2208330473462282

Subject:Signal and Information Processing

Abstract/Summary:

Speaker recognition identifies a person by the voice, an inherent physiological and behavioral characteristic, and is simpler, more economical, and more convenient than other biometric methods such as fingerprint, face, and iris recognition. It has grown out of theories and technologies including signal detection and processing, pattern recognition, artificial intelligence, and machine learning, and it is an interdisciplinary field involving physiology, psychology, acoustics, and phonetics.

The performance of a speaker recognition system depends on two main factors. The first is the feature parameters. Many parameters are available, but the Mel-frequency cepstral coefficient (MFCC) makes no assumptions about and places no restrictions on the input signal, does not depend on the nature of the signal, and draws on research results from auditory models, so it conforms to how sound is actually perceived. It therefore offers good performance and robustness when used in place of the human ear to analyze voice, and MFCC is chosen in this paper. The second is the recognition model. There are various recognition models, but the support vector machine (SVM) has clear advantages in small-sample, nonlinear, local-minimum, and high-dimensional pattern recognition problems and adapts well to new samples, so SVM is used in this paper.

This paper studies the processing of MFCC feature parameters and the optimization of the SVM kernel function, covering the following four aspects:

(1) This paper analyzes how the choice of speech framing, pre-emphasis coefficient, sampling frequency, and number of Mel filters in speech pre-treatment affects the speech classification rate. Each factor is varied in turn while the others are held constant. The experiment shows that the MFCC parameters obtained are more robust with frame length N=512, frame shift M=170, pre-emphasis coefficient a=0.91, sampling frequency f=16 kHz, and m=24 Mel filters.

(2) Many experiments have shown that the first several dimensions of the MFCC parameters strongly influence classification performance, but the effect of using only the first several groups of MFCC parameters on the speech classification rate has not been considered. This is studied here, and the results show that using all 200 groups yields the highest classification rate.

(3) This paper analyzes how the type of kernel function and the choice of kernel parameters affect the classification performance of SVM. The experiment shows that the RBF kernel function yields the highest classification rate, and that training the SVM with the optimal kernel parameters found by Grid Search and K-fold Cross Validation yields a better classification rate than training it with parameters chosen by experience.

(4) A large amount of redundant information exists among the dimensions of the MFCC. The mean impact value (MIV) method is introduced to select MFCC features and remove the redundant ones. In the past, the MIV method added to and subtracted from the original MFCC parameters a floating value of only 10%. Here the floating value is raised to 30%, 50%, 70%, and 90%. The experiment shows that adding and subtracting 90% of the original MFCC parameters yields the best speech classification rate, and that training the SVM with the 10 highest-ranked MFCC dimensions yields a better speech classification rate and a shorter running time than using all 16 dimensions.

This study shows that the MFCC parameters are more robust after a series of pre-treatment steps and after MIV is used to reduce their dimensions. It also shows that the best SVM classification performance can be obtained only by selecting a suitable kernel function and appropriate kernel parameters.
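
The pre-treatment settings reported in aspect (1) can be illustrated with a minimal sketch. It assumes the standard first-order pre-emphasis filter y[n] = x[n] - a*x[n-1] and simple overlapping framing; the abstract does not give the thesis's own implementation. The function names `pre_emphasize` and `split_frames` are hypothetical, and dropping a trailing partial frame is an assumption.

```python
def pre_emphasize(signal, a=0.91):
    """Standard first-order pre-emphasis (assumed form): y[n] = x[n] - a * x[n-1].

    The first sample has no predecessor and is passed through unchanged.
    """
    if not signal:
        return []
    return [signal[0]] + [signal[n] - a * signal[n - 1]
                          for n in range(1, len(signal))]


def split_frames(signal, frame_len=512, frame_shift=170):
    """Cut a signal into overlapping frames of frame_len samples.

    Consecutive frames start frame_shift samples apart. Trailing samples
    that do not fill a whole frame are dropped (an assumption, not a
    detail taken from the thesis).
    """
    if len(signal) < frame_len:
        return []
    count = 1 + (len(signal) - frame_len) // frame_shift
    return [signal[i * frame_shift: i * frame_shift + frame_len]
            for i in range(count)]


# One second of (silent) audio at the reported sampling rate of 16 kHz.
one_second = [0.0] * 16000
frames = split_frames(pre_emphasize(one_second))
print(len(frames), len(frames[0]))
```

With N=512 and M=170, adjacent frames overlap by 342 samples, roughly two thirds of a frame.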
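
The MIV selection in aspect (4) can be sketched as follows, based on how the method is commonly described: each feature is scaled up and down by a floating fraction, the change in a trained model's output is averaged over all samples, and features are ranked by the absolute value of that mean. This is an illustration under those assumptions, not the thesis's implementation. The names `mean_impact_values`, `top_features`, and `toy_model` are hypothetical, and the toy linear model only stands in for a trained SVM.

```python
def mean_impact_values(predict, samples, delta=0.1):
    """Return one mean impact value per feature.

    predict: callable mapping a feature vector to a numeric output
             (hypothetical stand-in for a trained model).
    samples: list of equal-length feature vectors.
    delta:   floating fraction added to / subtracted from each feature.
    """
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for row in samples:
            up = list(row)
            down = list(row)
            up[j] = row[j] * (1 + delta)    # feature j increased by delta
            down[j] = row[j] * (1 - delta)  # feature j decreased by delta
            total += predict(up) - predict(down)
        mivs.append(total / len(samples))
    return mivs


def top_features(mivs, k):
    """Indices of the k features with the largest |MIV|, in ascending order."""
    order = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
    return sorted(order[:k])


def toy_model(x):
    # Toy linear model: feature 1 has no influence on the output.
    return 3.0 * x[0] + 0.0 * x[1] - 1.0 * x[2]


data = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
mivs = mean_impact_values(toy_model, data, delta=0.9)
print(top_features(mivs, 2))
```

In this toy case the feature with no influence receives an MIV of zero and is dropped, which mirrors the idea of keeping the highest-ranked MFCC dimensions and discarding the redundant ones.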

Keywords/Search Tags:

speaker recognition, voice pre-treatment, support vector machine, Mel-frequency cepstral coefficient, mean impact value

Related items

1	Discrimination Based On Support Vector Machine Speaker
2	Speaker Recognition Based On Support Vector Machine
3	Study On The Technologies Of Text-independent Short-duration Speaker Recognition
4	The Research Of Speaker Recognition Under Noisy Environment
5	Speaker Recognition Research In Noisy Environment
6	Research On Voice Endpoint Detection Method In Noisy Environment
7	Research On Speaker Separation And Recognition Of Conference Voice Based On Voiceprint Recognition
8	Research On Feature Extraction Algorithm In Speaker Recognition
9	Research On Speech Recognition Feature Extraction Algorithm And Soundprint Attendance System Realization
10	The Research Of Speaker Recognition Based On Vector Quantization