Speaker Identification In Chinese Whispered Speech Based On Simplified Joint Factor Analysis

Posted on: 2011-11-06 | Degree: Master | Type: Thesis
Country: China | Candidate: Y L Wang | Full Text: PDF
GTID: 2178360305976673 | Subject: Communication and Information System
Abstract/Summary:
Whispered speech is a mode of speech in which the speaker talks softly, without vibration of the vocal cords, to avoid being overheard. Whispered speaker recognition can be applied in several fields, such as private speech communication in public places and forensic work.

Because speaker recognition for whispered speech is still at an early stage of research, many models developed for normal speech are still used. However, most of them are not well suited to whispered speech because of its characteristics. At present, the available adaptive compensation methods make no distinction between speaker-related factors (health and psychological state) and channel-environment factors, which affects the recognition results for whispered speech.

Because the vocal cords do not vibrate, whispered speech always has a low SNR. The locations and energy of the formants, as well as the auditory model, differ from those of normal speech. When whispering, the speaker's mental state is variable and easily affected. Hence, speaker recognition for whispered speech is more difficult than for normal speech. The main concerns are how to reduce the influence of the speaking environment, especially variation in the speech channel, and how to remove mental or emotional effects.

To address these characteristics of whispered speech, this thesis presents a new approach to speaker identification in Chinese whispered speech, called simplified joint factor analysis. The main idea of the proposed technique is to estimate the speaker space and the channel space in a decoupled manner, which removes the need to label databases by channel, simplifies the training procedure, and reduces both the computation and the amount of data required.

Experiments were carried out on our own database. The corpus consists of 100 target speakers (80 male and 20 female), each recorded over 8 typical channels. Compared with other recognition methods, such as MAP, Feature Mapping + MAP and SMS, the proposed JFA technique provides superior performance and a significant speedup in speaker identification of Chinese whispered speech. In particular, it greatly improves recognition accuracy when the enrollment and test conditions are mismatched.

A study of the number of speaker factors and channel factors shows that a suitable increase in the number of factors effectively improves recognition accuracy, but the gain saturates: continuing to increase the number of factors does not further improve the performance of the whispered speaker recognition system.
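The abstract does not give the estimation procedure, so the following is only a rough illustration of what "decoupled" estimation of the speaker and channel spaces can look like. In the usual joint factor analysis formulation, a supervector is modeled as m + Vy + Ux, where y holds the speaker factors and x the channel factors (any residual term is omitted here). The sketch assumes one supervector per recording and uses plain PCA in place of the thesis's actual training algorithm; the function names `top_directions` and `decoupled_subspaces` are hypothetical. Note that only speaker labels are needed, not channel labels.

```python
import numpy as np


def top_directions(rows, rank):
    """Return the `rank` leading principal directions of `rows` as columns."""
    centered = rows - rows.mean(axis=0)
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def decoupled_subspaces(supervectors, speaker_ids, n_speaker_factors, n_channel_factors):
    """Estimate a speaker subspace V and a channel subspace U separately.

    Assumption (not from the thesis): PCA stands in for the real estimator.
    """
    supervectors = np.asarray(supervectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    speakers = np.unique(speaker_ids)

    # Step 1: averaging each speaker's recordings suppresses session and
    # channel effects; the spread of these means spans the speaker space.
    means = np.stack([supervectors[speaker_ids == s].mean(axis=0) for s in speakers])
    m = supervectors.mean(axis=0)
    V = top_directions(means, n_speaker_factors)

    # Step 2: what remains after removing each speaker's mean is treated as
    # channel/session variability; no channel labels are required.
    index = np.searchsorted(speakers, speaker_ids)
    residuals = supervectors - means[index]
    U = top_directions(residuals, n_channel_factors)
    return m, V, U
```

With, say, 100 speakers recorded over 8 channels each, `supervectors` would have 800 rows, and the two rank arguments correspond to the numbers of speaker factors and channel factors whose effect on accuracy is studied above.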
Keywords/Search Tags: Whispered Speech, Speaker Recognition, Joint Factor Analysis, Speaker Factors, Channel Factors