Characteristic Analysis And Its Application Of Chinese Whispered Speech

Posted on: 2008-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Pan
Full Text: PDF
GTID: 2178360218950474
Subject: Signal and Information Processing
Abstract/Summary:
Whisper is a special mode of human phonation that differs from normal speech. It is produced in a single configuration: the front part of the glottis is closed while the back part leaves a triangular gap. The air stream passing through this open part of the glottis generates friction noise. Whisper therefore has no vocal fold vibration and no pitch, and its energy is much lower than that of normal speech. Whisper is a common form of spoken communication that people use for a variety of reasons. For example, now that mobile telephones are widely used, people often need to whisper in public to avoid disturbing others and to keep their conversation private. Whisper is also useful in some professional fields: linguists may study human perceptual mechanisms through whisper, normal speech can be reconstructed for patients whose larynx has been removed, and speakers can be identified from their whispers.
Because of these characteristics and the influence of the environment, whisper is hard to understand and its quality tends to be poor. The range of applications of whisper would expand if normal speech could be reconstructed from it. This thesis focuses on the pre-processing of whispers and has three main parts: endpoint detection, initial/final segmentation, and tone perception of whisper.
Accurate endpoint detection of whispers makes the subsequent work much easier. Based on the typical chaotic behaviour of whisper, a fractal-based endpoint detection algorithm is proposed. In terms of the energy of the intrinsic mode functions (IMFs), a fitted characteristic derived from empirical mode decomposition (EMD) is also introduced into the detection. Both methods perform well.
Mandarin is a tonal language in which the speaker's meaning and emotion are expressed through tone. Pitch must therefore be added to the final segment before speech reconstruction, and so the initials and finals must be separated once the whisper has been detected.
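As a rough illustration of what a fractal-based endpoint detector might look like, the sketch below computes a fractal dimension for each frame and flags frames whose value departs from a background estimate. The abstract does not say which fractal measure the thesis uses, so the Higuchi estimator, the function names `higuchi_fd` and `detect_endpoints`, the frame length, the threshold, and the synthetic test signal are all assumptions made for this example.

```python
import math
import random


def higuchi_fd(frame, k_max=8):
    """Estimate the Higuchi fractal dimension of one frame of samples."""
    n = len(frame)
    if n < 2 * k_max:
        raise ValueError("frame is too short for the chosen k_max")
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            steps = (n - 1 - m) // k  # number of k-spaced steps from offset m
            path = sum(abs(frame[m + i * k] - frame[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Higuchi's normalisation of the curve length at scale k
            lengths.append(path * (n - 1) / (steps * k) / k)
        mean_len = sum(lengths) / len(lengths)
        if mean_len == 0.0:
            return 1.0  # constant frame: treat it as a smooth line
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(mean_len))
    # The least-squares slope of log L(k) against log(1/k) is the estimate
    mx = sum(log_inv_k) / len(log_inv_k)
    my = sum(log_len) / len(log_len)
    num = sum((x - mx) * (y - my) for x, y in zip(log_inv_k, log_len))
    den = sum((x - mx) ** 2 for x in log_inv_k)
    return num / den


def detect_endpoints(signal, frame_len=256, n_background=2, threshold=0.3):
    """Return (first_frame, last_frame) of the active region, or None.

    Assumes the first n_background frames contain background only.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    dims = [higuchi_fd(f) for f in frames]
    background = sum(dims[:n_background]) / n_background
    active = [i for i, d in enumerate(dims) if abs(d - background) > threshold]
    return (active[0], active[-1]) if active else None


# Toy signal (not real speech): smooth background, a low-level noise burst
# standing in for a whisper, then smooth background again.
rng = random.Random(0)
smooth = [math.sin(2 * math.pi * i / 128) for i in range(4 * 256)]
noisy = [0.05 * rng.uniform(-1, 1) for _ in range(4 * 256)]
signal = smooth + noisy + smooth
print(detect_endpoints(signal))
```

The dimension estimate does not depend on amplitude, which is one reason a measure of this kind could suit a low-energy signal such as whisper. With real recordings the background is itself noisy, so the threshold and the background estimate would need to be tuned on data.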
An improved EMD algorithm is used to detect the voice onset time (VOT). Two features are proposed to separate the initial and final segments: the detail-approximation energy ratio (DAER), based on wavelet decomposition, and the instantaneous frequency of the IMFs. A new carrier of whisper tone is identified: the fitted curve of the energy proportion of the diffused Bark spectrum. Using this new carrier, tone recognition reaches a correct rate of 78%. The tone information is then passed to the next step, normal speech synthesis.
This work not only provides the parameters needed for speech reconstruction but also contributes to the study of the characteristics of human pronunciation and perception in the field of digital signal processing.
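The detail-approximation energy ratio can be pictured with a small sketch. The abstract does not give the exact definition, the wavelet, or the decomposition depth, so the code below assumes a single-level Haar decomposition and takes the ratio of detail-coefficient energy to approximation-coefficient energy. The names `haar_dwt` and `daer` and the `eps` guard are hypothetical.

```python
import math


def haar_dwt(frame):
    """One level of the orthonormal Haar wavelet transform.

    The frame length should be even; a trailing odd sample is ignored.
    """
    s = math.sqrt(2.0)
    pairs = range(0, len(frame) - 1, 2)
    approx = [(frame[i] + frame[i + 1]) / s for i in pairs]  # low-pass half
    detail = [(frame[i] - frame[i + 1]) / s for i in pairs]  # high-pass half
    return approx, detail


def daer(frame, eps=1e-12):
    """Detail-to-approximation energy ratio of one frame (assumed definition)."""
    approx, detail = haar_dwt(frame)
    e_approx = sum(a * a for a in approx)
    e_detail = sum(d * d for d in detail)
    return e_detail / (e_approx + eps)  # eps avoids division by zero


# Toy frames: a slowly varying frame and a rapidly alternating frame.
slow = [1.0] * 8
fast = [1.0, -1.0] * 4
print(daer(slow), daer(fast))
```

Under this assumed definition, a frame dominated by high-frequency content gives a large ratio and a frame dominated by low-frequency content gives a small one. A segmentation rule would then track how the ratio changes from frame to frame.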
Keywords/Search Tags: Whisper, Fractal, Empirical mode decomposition (EMD), Instant frequency, Diffused Bark spectrum