
Speaker Identification Of Whispered Speech Based On Joint Factor Analysis

Posted on: 2015-02-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C H Gong
Full Text: PDF
GTID: 1268330428998168
Subject: Signal and Information Processing
Abstract/Summary:
Speaker Identification (SI), an important part of biometric identification technology, is widely used in public safety, the judicial system, biomedical engineering, etc. It has made great progress with the rapid development of computer science and network technology. Nowadays, studies of whispered speech address not only fundamental research but also applications. Speaker identification of whispered speech is an interesting yet challenging task. Many issues remain to be resolved because of its particular articulation.

This dissertation pays special attention to text-independent speaker identification of whispered speech. The difficulties are as follows. First, whispered-speech databases are inadequate, unlike those for voiced speech, for which NIST provides databases for SI research. Secondly, owing to the characteristics of whisper, some parameters are not available, and some are more difficult to extract. What's more, because the excitation of whispered speech is exhaled air, it is more easily affected by noise. Meanwhile, because whispered speech is often encountered in mobile communication, it is often influenced by the channel. Finally, when whispering, the speaker might be restricted by the surroundings, which leads to changes in speaking mode or psychological factors. Hence, whispered speech is more likely to be affected by the state of the speaker. In short, the obstacles to SI for whispered speech are the difficulty of obtaining the parameters and the effects of the channel and of the speaker's state.

The contributions of this dissertation to speaker identification of whispered speech are as follows:

1. Algorithms for extracting the parameters that represent the characteristics of whispering speakers are proposed. Because there is no fundamental frequency in whispered speech, reliable formant extraction is essential. Formant estimation of whispered speech based on spectral segmentation is proposed. This algorithm dynamically segments the spectrum and obtains the parameters of the inverse filters by linear prediction. It can handle merged and shifted formants, which are often encountered in whispered speech. In addition, the SFMB and SCB are defined to represent speakers' traits in whispered speech. They are based on the property that the center and the flatness can characterize the stability of a signal.

2. Speaker identification of whispered speech based on feature mapping and on speaker model synthesis (SMS) is proposed. Both are effective ways to resolve the mismatch between the training set and the test set caused by the speakers' state. Because whispered speech is weaker than voiced speech at conveying emotion, the classification of A and V factors for whispers is proposed in this dissertation. It can also serve as a pre-processing step for SI of whispers. The experimental results show that the algorithms based on feature mapping and SMS are effective for SI of whispered speech with perceptible mood.

3. Speaker identification based on latent factor analysis of whispered speech with perceptible mood is proposed. It makes compensation for the speakers' state possible. Factor analysis is not concerned with the physical meaning of each factor; it is only a mathematical way to find representative factors among a large number of variables. By increasing or decreasing the number of factors, the complexity of the algorithm can be adjusted. According to latent factor theory, the supervectors of whispered speech can be decomposed into speaker supervectors and speaker-state supervectors. Balanced data are needed to train the spaces of the vectors mentioned above. In the test stage, the speaker's supervector should be estimated from each session. The latent-factor-based algorithm achieves better recognition by avoiding the classification of the speakers' state.

4. Speaker identification of whispered speech based on joint factor analysis (JFA) is proposed. It is a compensation algorithm for SI of whispers with perceptible mood and different channels. According to JFA theory, the supervectors of a speech signal can be decomposed into speaker, speaker-state and channel supervectors. Because the training set is not large enough, it is not feasible to estimate the spaces of the supervectors mentioned above. Hence, the procedure is: estimate the UBM, calculate the Baum-Welch statistics, obtain the speaker space, and obtain the speaker-state space and the channel space at the same time. When testing, the extracted parameters should be transformed by eliminating both the channel factor and the speaker-state factor. The experimental results show the superiority of this algorithm over other compensation methods.
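The Baum-Welch statistics step named in contribution 4 can be illustrated with a minimal sketch. It assumes the UBM is a diagonal-covariance Gaussian mixture, which the abstract does not state, and it uses toy two-dimensional frames in place of real whispered-speech features. The function names `log_gauss_diag` and `baum_welch_stats` are hypothetical and are not taken from the dissertation. For each UBM component, the zeroth-order statistic is the sum of the frame posteriors and the first-order statistic is the posterior-weighted sum of the frames.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def baum_welch_stats(frames, weights, means, variances):
    """Zeroth-order (N) and first-order (F) statistics per UBM component.

    Assumption: the UBM is a diagonal-covariance GMM given by
    (weights, means, variances); this is an illustrative choice only.
    """
    n_comp = len(weights)
    dim = len(means[0])
    N = [0.0] * n_comp
    F = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        # Log of weight * likelihood for every component.
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp normalisation for numerical stability.
        mx = max(log_p)
        log_norm = mx + math.log(sum(math.exp(lp - mx) for lp in log_p))
        for c in range(n_comp):
            gamma = math.exp(log_p[c] - log_norm)  # posterior of component c
            N[c] += gamma
            for d in range(dim):
                F[c][d] += gamma * x[d]
    return N, F


# Toy UBM with two well-separated components (hypothetical values).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [10.0, 10.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.2], [9.8, 10.1], [-0.3, 0.2]]

N, F = baum_welch_stats(frames, weights, means, variances)
```

Because the posteriors of each frame sum to one, the entries of `N` sum to the number of frames. In a factor-analysis pipeline these statistics would feed the estimation of the speaker, speaker-state and channel spaces, which this sketch does not attempt.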
Keywords/Search Tags: Speaker Identification of whispered speech, Feature Mapping, Speaker Model Synthesis, Joint Factor Analysis, the state of speaker