Speaker Recognition Research Based On Improved Mel Feature Extraction Algorithm

Posted on: 2018-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Ni
GTID: 2348330569486504
Subject: Control engineering
Abstract/Summary:
Speaker recognition is a branch of speech recognition whose purpose is to identify a speaker from the speaker's voice information. In recent years, as identity authentication and mobile Internet technologies have become increasingly widespread, speaker recognition has been moving from the laboratory into practical application. This thesis therefore studies state-of-the-art speaker recognition technology, a topic of both theoretical significance and practical value.

First, the thesis surveys the basic theory of speech signals. During pre-emphasis of the speech signal, the traditional signal subtraction method suppresses noise poorly at low signal-to-noise ratio (SNR). A speech enhancement method based on auditory perception is therefore proposed; compared with the typical method, it effectively improves the noise immunity of the signal at low SNR.

Second, to address the degraded quality of endpoint detection across different SNRs, an endpoint detection method combining fuzzy entropy with an improved relevance vector machine (RVM) is proposed. The fuzzy entropy of each frame is extracted and used as the input vector of the RVM. Because a single kernel function gives poor robustness in predictive classification, an adaptive multi-kernel combination of different kernel functions is built to fuse their properties, which substantially improves classification accuracy and robustness. Experimental results show that this method detects speech endpoints more effectively, even in adverse SNR conditions.

Third, two characteristic parameters commonly used in speech feature extraction are introduced: the linear predictive cepstral coefficient (LPCC) and the Mel frequency cepstral coefficient (MFCC). To counter the performance degradation of MFCC in noisy environments, the Gammachirp filter cepstral coefficient (GCFCC) is proposed. The pitch frequency of each frame is also extracted and merged with the GCFCC, and kernel principal component analysis (KPCA) is then applied for dimensionality reduction. Experimental results show that the improved feature extraction algorithm offers some improvement in both recognition rate and computational complexity.

Finally, the thesis studies the construction and analysis of the acoustic model for speaker recognition. Maximum likelihood estimation and the expectation-maximization (EM) algorithm are used to estimate the model parameters. Because EM may converge to a local solution depending on its initial parameters, K-means clustering is used to initialize the model. A speaker recognition system based on the improved Mel feature is then constructed, and experiments verify the feasibility of the scheme.
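
The abstract describes initializing the acoustic model with K-means and then refining it with the EM algorithm. The idea can be illustrated with a minimal sketch on a toy problem. This is not the thesis's implementation: it fits a one-dimensional Gaussian mixture to scalar data rather than to GCFCC feature vectors, it does not build a GMM-UBM, and the function names (`kmeans_1d`, `fit_gmm_1d`), the quantile-based starting centers, and the variance floor are all assumptions made for illustration.

```python
import math
import random


def kmeans_1d(data, k, iters=20):
    """Lloyd's algorithm on scalars (illustrative helper, not from the thesis).

    Starting centers are evenly spaced quantiles of the data (an assumption).
    """
    s = sorted(data)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            clusters[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers, clusters


def gauss(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, em_iters=50, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture: K-means initialization, then EM."""
    n = len(data)
    means, clusters = kmeans_1d(data, k)
    counts = [max(len(c), 1) for c in clusters]
    weights = [cnt / sum(counts) for cnt in counts]
    variances = [
        max(sum((x - m) ** 2 for x in c) / len(c), var_floor) if c else 1.0
        for m, c in zip(means, clusters)
    ]
    for _ in range(em_iters):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: maximum likelihood re-estimates given the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave a dead component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                var_floor,
            )
    return weights, means, variances


# Toy usage: two synthetic clusters standing in for per-frame features.
rng = random.Random(0)
samples = [rng.gauss(0.0, 0.5) for _ in range(200)]
samples += [rng.gauss(5.0, 0.5) for _ in range(200)]
print(fit_gmm_1d(samples, 2))
```

Starting EM from K-means centers places each component near a dense region of the data, which is the usual motivation for this initialization. A practical system would work on multi-dimensional feature vectors and would typically compute in the log domain for numerical stability.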
Keywords/Search Tags: speaker recognition, speech enhancement, fuzzy entropy, GCFCC, GMM-UBM