Speaker Identification Based On Independent Component Analysis And Genetic Algorithm

Posted on: 2006-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
GTID: 2168360155452865
Subject: Control theory and control engineering
Abstract/Summary:
With the development of information technology, the means of communication have become more varied, and the requirements for identity authentication have risen as a result. Biometrics, which authenticates a person's identity by physiological and/or behavioral traits, shows its advantages in the area of identity authentication. Speaker identification, one kind of biometrics, refers to recognizing a speaker from his/her voice or speech samples. It has attracted attention in the last few years for its convenience, economy and accuracy, and has become an important and widespread means of secure authentication in daily life. However, many problems remain to be solved. How to improve performance, especially the performance of text-independent systems, is one problem. Dimension reduction, to lower the computational and memory load, is another. In this paper, based on a study of recent advances in speaker identification, we try to solve these two problems by using a feature transform and an efficient codebook design algorithm.

The main work of this paper includes the following:

(1) Detection of Voiced Regions

The collected speech samples include unvoiced, voiceless and voiced regions. Only voiced regions carry information about the speaker, so detecting voiced regions in speech is an important step in speaker identification. Classical signal processing methods detect voiced regions under the assumption that the speech signal is stationary. However, the speech signal is non-stationary, so the classical methods are sometimes ineffective. The wavelet transform is a new signal processing tool suited to non-stationary signals, and the Teager energy operator can efficiently detect the 'energy' of a signal. In this paper, a new algorithm based on the wavelet transform and the Teager energy operator is presented to detect voiced regions.
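As an illustration of the Teager energy operator mentioned above, the following minimal Python sketch uses the standard discrete form psi[n] = x[n]^2 - x[n-1]*x[n+1]. The wavelet stage of the proposed algorithm is omitted, and the frame length, the threshold, and the function names `teager_energy`, `frame_means` and `detect_voiced` are assumptions made for this example, not the detection rule used in the thesis.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def frame_means(values, frame_len):
    """Mean of each non-overlapping frame; a trailing partial frame is dropped."""
    return [sum(values[i:i + frame_len]) / frame_len
            for i in range(0, len(values) - frame_len + 1, frame_len)]

def detect_voiced(x, frame_len, threshold):
    """Flag frames whose mean Teager energy exceeds a threshold.

    The fixed threshold is a hypothetical decision rule chosen for illustration.
    """
    return [e > threshold for e in frame_means(teager_energy(x), frame_len)]

# Toy signal: 100 samples of silence followed by a 100-sample sinusoid.
tone = [2.0 * math.sin(0.3 * k) for k in range(100)]
signal = [0.0] * 100 + tone
print(detect_voiced(signal, frame_len=50, threshold=0.1))
```

For a pure sinusoid of amplitude A and digital frequency Ω, the operator returns the constant A² sin²(Ω), so it responds to both the amplitude and the frequency of the signal.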
Experiments were performed to evaluate the accuracy and robustness of the proposed algorithm, using clean speech and speech contaminated with white Gaussian noise at different SNRs. The results show that the proposed algorithm detects voiced regions accurately and is robust to white Gaussian noise.

(2) Feature Selection and Feature Transform

Vector quantization (VQ) based classification algorithms play an important role in speaker identification systems. The LBG algorithm is the most popular codebook design algorithm. We use "16 Mel-frequency cepstral coefficients" and "16 Mel-frequency coefficients + 14 Mel-frequency differences" as the feature vectors. The speech samples used for training and testing were recorded from 20 different speakers, each of whom spoke 50 Chinese phrases. The first 40 samples were used to generate the trained speaker classes and were also used for testing in the text-dependent system. The other 10 samples were used for testing in the text-independent system. The highest correct rates of the text-dependent and text-independent systems are 91.250% and 72.50%, respectively, when "16 Mel-frequency cepstral coefficients" is used as the feature. When "16 Mel-frequency coefficients + 14 Mel-frequency differences" is used as the feature, the highest correct rates of the text-dependent and text-independent systems are 91.625% and 73.00%, respectively. Because the samples are limited, the performance of a system with a high-dimensional feature space is reduced. Therefore, "16 Mel-frequency cepstral coefficients" is used as the feature in the following experiments.

To improve system performance and to reduce the dimension of the feature space, speaker identification based on principal component analysis (PCA) is used, with the LBG algorithm designing the codebook. The correct rates of the text-dependent and text-independent systems are 91.625% and 73.00%, respectively. The correct rates are not increased compared with the LBG algorithm alone; moreover, when the dimensions are reduced, system performance deteriorates markedly.
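The VQ classification scheme used in these experiments can be sketched as follows: one LBG codebook is trained per speaker, and a test utterance is assigned to the speaker whose codebook gives the lowest average distortion. This is a minimal sketch under stated assumptions, not the thesis implementation: it assumes squared Euclidean distortion, an additive split perturbation `eps`, codebook sizes that are powers of two, and hypothetical function names (`lbg`, `identify`).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def avg_distortion(codebook, vectors):
    """Mean distance from each vector to its nearest codevector."""
    return sum(min(sq_dist(cv, v) for cv in codebook) for v in vectors) / len(vectors)

def lbg(vectors, size, eps=0.01, tol=1e-9, max_iter=100):
    """Grow a codebook by repeated splitting (size assumed to be a power of two)."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codevector into a perturbed pair (assumed additive perturbation).
        codebook = [[c + s * eps for c in cv] for cv in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign each training vector to its nearest codevector (Voronoi cells).
            cells = [[] for _ in codebook]
            for v in vectors:
                i = min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], v))
                cells[i].append(v)
            # Move each codevector to its cell centroid; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else cv
                        for cell, cv in zip(cells, codebook)]
            dist = avg_distortion(codebook, vectors)
            if prev - dist < tol:
                break
            prev = dist
    return codebook

def identify(codebooks, vectors):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(codebooks[name], vectors))

# Toy 2-D "features" for two hypothetical speakers.
speaker_a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
speaker_b = [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [6.0, 6.0], [6.1, 6.0], [6.0, 6.1]]
codebooks = {"A": lbg(speaker_a, 2), "B": lbg(speaker_b, 2)}
print(identify(codebooks, [[0.05, 0.0], [1.0, 1.05]]))
```

In a real system the toy 2-D points would be replaced by the per-frame feature vectors of each speaker, such as the 16 Mel-frequency cepstral coefficients named above.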
Therefore PCA_LBG cannot solve the two problems in speaker identification.

PCA produces mutually uncorrelated features, while independent component analysis (ICA) makes features mutually independent. Independence is a stronger condition than uncorrelatedness. When ICA and LBG are applied to the system, performance improves markedly for small codebooks, which indicates that ICA can raise the correct rates. However, the LBG algorithm often splits a Voronoi cell that has only one codevector. This phenomenon often appears when the codebook size is large, and it causes the performance of large codebooks to drop noticeably.

(3) Genetic Algorithm Applied to Codebook Design

Genetic algorithm (GA) provides high quality codebooks for vector...
Keywords/Search Tags: Speaker identification, Wavelet transform, Teager energy operator, Mel-frequency cepstral coefficients, Vector quantization, Principal component analysis, Independent component analysis, Genetic algorithm