Characteristic Analysis And Its Application Of Chinese Whispered Speech

Posted on:2008-01-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Pan

Full Text:PDF

GTID:2178360218950474

Subject:Signal and Information Processing

Abstract/Summary:

Whispering is a special mode of human speech production, a distinct phonation type that differs from normal speech. During a whisper, the front part of the glottis is closed while the back part forms a triangular opening, and the air stream passing through this opening produces friction noise. Whispered speech therefore involves no vocal fold vibration, carries no pitch, and has much lower energy than normal speech. Whispering is a common form of spoken communication that people use for a variety of reasons. For example, with mobile phones in wide use, people often need to whisper during calls in public to avoid disturbing others and to keep the conversation private. Whisper is also useful in several professional fields: linguists can study human perceptual mechanisms through whisper, normal speech can be reconstructed for patients whose larynx has been removed, and speakers can be identified from their whispers.

Because of these characteristics and the influence of the environment, whispered speech is hard to understand and its quality is poor. If normal speech could be reconstructed from whisper, the range of applications of whisper would expand. This thesis focuses on the pre-processing of whispered speech and has three main parts: endpoint detection, initial/final segmentation, and tone perception of whisper.

Accurate endpoint detection makes the subsequent processing much easier. Based on the chaotic nature typical of whisper, a fractal-based endpoint detection algorithm is proposed. In addition, using the energy of the intrinsic mode functions (IMFs), a fitted characteristic of empirical mode decomposition (EMD) is introduced into the detection. Both methods perform well.

Mandarin is a tonal language in which the speaker's meaning and emotion are expressed through tone. Pitch must therefore be added to the final segment before speech reconstruction, which in turn requires the detected whisper to be divided into its initial and final segments.
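As a rough illustration of the fractal-based endpoint detection idea mentioned above, the following sketch estimates a fractal dimension for each frame and flags frames whose value departs from a background baseline. The abstract does not say which fractal measure the thesis uses, so the Higuchi estimator, the function names `higuchi_fd` and `detect_speech_frames`, the frame length, the number of baseline frames, and the threshold are all assumptions made for this example.

```python
import math


def higuchi_fd(frame, k_max=8):
    """Estimate the Higuchi fractal dimension of one frame of samples.

    The choice of the Higuchi estimator is an assumption of this sketch,
    not something stated in the abstract.
    """
    n = len(frame)
    if k_max < 2 or n < 2 * k_max:
        raise ValueError("need k_max >= 2 and at least 2 * k_max samples")
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            steps = (n - 1 - m) // k
            dist = sum(abs(frame[m + i * k] - frame[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Normalise so sub-series of different lengths are comparable.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        mean_len = sum(lengths) / len(lengths)
        if mean_len == 0.0:
            return 1.0  # flat frame (digital silence): treat as a smooth curve
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(mean_len))
    # The least-squares slope of log L(k) against log(1/k) is the estimate.
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_len) / len(log_len)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_inv_k, log_len))
    den = sum((x - mean_x) ** 2 for x in log_inv_k)
    return num / den


def detect_speech_frames(signal, frame_len=256, noise_frames=2, threshold=0.3):
    """Flag frames whose fractal dimension deviates from the background.

    The first `noise_frames` frames are assumed to contain background only;
    this assumption and the threshold value are hypothetical.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    dims = [higuchi_fd(f) for f in frames]
    baseline = sum(dims[:noise_frames]) / noise_frames
    return [abs(d - baseline) > threshold for d in dims]
```

In practice the threshold would need tuning on recorded whisper data, and the first and last flagged frames would be taken as the candidate endpoints.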
A voice onset time (VOT) detection method is provided using an improved EMD algorithm. Two features are proposed to separate the initial and final segments: the detail-approximation energy ratio (DAER), based on wavelet decomposition, and the instantaneous frequency of the IMFs. A new carrier of whisper tone is identified: the fitted curve of the energy proportion of the diffused Bark spectrum. Tone recognition using this new carrier achieves a correct rate of 78%. The tone information is then passed to the next step, normal speech synthesis.

This work not only provides the parameters needed for speech reconstruction, but also contributes to the study of the characteristics of human speech production and perception in the field of digital signal processing.
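The detail-approximation energy ratio (DAER) named above can be illustrated with a minimal sketch. The abstract does not specify the wavelet, the number of decomposition levels, or the decision rule, so this example assumes a single-level Haar decomposition and a fixed threshold, and it assumes that the initial has a higher share of detail (high-frequency) energy than the final. The names `haar_step`, `daer`, and `find_boundary` are hypothetical.

```python
import math


def haar_step(samples):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    pairs = range(0, len(samples) - 1, 2)
    approx = [(samples[i] + samples[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(samples[i] - samples[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail


def daer(frame, eps=1e-12):
    """Ratio of detail energy to approximation energy for one frame.

    `eps` avoids division by zero when the approximation energy vanishes.
    """
    approx, detail = haar_step(frame)
    energy_a = sum(a * a for a in approx)
    energy_d = sum(d * d for d in detail)
    return energy_d / (energy_a + eps)


def find_boundary(signal, frame_len=128, threshold=1.0):
    """Return the start sample of the first frame with DAER below threshold.

    Assumption of this sketch: that frame marks the start of the final.
    Returns None if no frame falls below the threshold.
    """
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        if daer(signal[start:start + frame_len]) < threshold:
            return start
    return None
```

A multi-level decomposition with a smoother wavelet, and a threshold fitted to data, would likely be needed for real whispered Mandarin syllables.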

Keywords/Search Tags:

Whisper, Fractal, Empirical mode decomposition (EMD), Instantaneous frequency, Diffused Bark spectrum

Related items

1	Empirical Mode Decomposition Theory Of Ship Radiated Noise Line Spectrum Analysis
2	Study On Time-Frequency Analysis Method Based On Empirical Mode Decomposition
3	Theoretical Study And Application Of Time-Frequency Analysis Method Based On Empirical Mode Decomposition
4	Research On Empirical Mode Decomposition Algorithm And Its Application In Electromagnetic Imaging
5	Research On The Key Problems Of Bidimensional Empirical Mode Decomposition In Digital Image Processing
6	Research And Application Of Empirical Mode Decomposition And The Instantaneous Frequency Filtering Algorithm
7	Research And Application Of Time-Frequency Analysis Method Based On Empirical Mode Decomposition
8	Improved Algorithm And Application Investigation Of Empirical Mode Decomposition
9	Based On Hilbert-Huang Power Spectrum To Extract The Blood Vessel Wall Pulsation Displacement Improvement
10	The Research On Empirical Mode Decomposition And Its Application In Image Segmentation