Study On Reconstruction Of Chinese Whispered Speech

Posted on: 2014-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Huang
Full Text: PDF
GTID: 2248330398979910
Subject: Computer application technology
Abstract/Summary:
Whispered speech is a special mode of human communication. It differs from normal speech in several ways: the vocal cords do not vibrate, there is no pitch, and its energy is lower than that of phonated speech. These properties reduce both intelligibility and speech quality, so reconstructing Chinese whispered speech matters both in theory and in application.

This thesis systematically studies the reconstruction of continuous Chinese whispered speech into normal speech. The major contributions are as follows.

A model for converting continuous whispered speech to phonated speech is built on the Mixed Excitation Linear Prediction (MELP) model. The MELP parameter values of whispered speech and of the corresponding voiced speech are analyzed.

A new endpoint detection method is proposed, based on information entropy in the time-frequency domain obtained by the Gabor transform. Because the low energy of whispered speech can cause voiced and unvoiced parts to be misclassified, the zero-crossing rate method is used when the entropy of a speech frame is close to the entropy threshold.

An algorithm for splitting initials and finals is presented, based on the symmetric relative entropy in the Gabor time-frequency domain. The method splits the initials and the finals according to energy concentration, formant structure, and differences in spectral distribution.

To modify the formants of whispered speech, a line spectrum pair (LSP) based method is implemented. Formant amplitude and bandwidth are modified through the LSP parameter values in the proposed MELP model.

To obtain pitch information for reconstruction, the tone of the whispered speech is classified based on the relation between the speech tone and the Bark frequency band power spectrum of the whispered speech. The five-degree tone model is then used to add pitch to the corresponding speech frames.

Finally, the voiced version of the whispered speech is reconstructed using the MELP model with the modified parameter values. Experimental results show that the synthesized voiced speech is fluent, which supports the effectiveness of the proposed methods.
Keywords/Search Tags: Mixed Excitation Linear Prediction Model, Endpoint Detecting, Splitting of Initials and Finals, Formant Modification, Adding Pitch, Reconstruction of Continuous Whisper Speech
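The endpoint detection step in the abstract (an entropy decision with a zero-crossing-rate fallback near the threshold) can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes a Gaussian-windowed short-time Fourier transform as a stand-in for the Gabor transform, and it assumes that speech frames have lower spectral entropy than background frames. The function names, `entropy_thr`, `margin`, `zcr_thr`, the frame sizes, and the direction of the zero-crossing rule are all hypothetical choices made for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectral_entropy(frames, sigma_ratio=0.15):
    """Shannon entropy of each frame's normalized power spectrum.

    A Gaussian analysis window is used, so each row is one time slice of a
    discrete Gabor-style transform (an assumption of this sketch).
    """
    n = frames.shape[1]
    t = np.arange(n) - (n - 1) / 2
    window = np.exp(-0.5 * (t / (sigma_ratio * n)) ** 2)
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    p = power / (power.sum(axis=1, keepdims=True) + 1e-12)
    return -(p * np.log(p + 1e-12)).sum(axis=1)


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return (signs[:, 1:] != signs[:, :-1]).mean(axis=1)


def detect_speech(x, entropy_thr, margin=0.3, zcr_thr=0.1,
                  frame_len=256, hop=128):
    """Return one boolean per frame: True where the frame is labeled speech."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    entropy = spectral_entropy(frames)
    zcr = zero_crossing_rate(frames)

    # Assumed rule: a structured spectrum (low entropy) indicates speech.
    is_speech = entropy < entropy_thr

    # Frames whose entropy is close to the threshold are re-decided by the
    # zero-crossing rate. The direction of this rule is a placeholder.
    ambiguous = np.abs(entropy - entropy_thr) < margin
    is_speech[ambiguous] = zcr[ambiguous] > zcr_thr
    return is_speech
```

Calling `detect_speech(x, entropy_thr=3.0)` on a sampled signal `x` returns a per-frame mask; the first and last `True` entries give rough endpoints. In practice the thresholds would need tuning on real whispered recordings, since whispered speech is itself noise-like and the separation from background noise is weaker than in a synthetic example.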