
Research On Conversion Algorithm From Whispers To Normal Speech Based On Extended Bilinear Transform

Posted on: 2011-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: X D Tan
Full Text: PDF
GTID: 2178360305976310
Subject: Detection Technology and Automation
Abstract/Summary:
Whispering is a common mode of speech communication, used for a variety of reasons. Research on algorithms for converting Chinese whispered speech to normal speech is interdisciplinary, drawing on signal processing, artificial intelligence, pattern recognition, acoustics, and related fields, so it has both theoretical and practical value.

This thesis is part of the National Science Foundation of China project "Research on study of whispered speech enhancement and whispered-to-normal-speech conversion". It reviews existing approaches, including the Linear Prediction method, the Homomorphic Signal Processing method, the Line Spectral Pair constant-shifting method, and the Radial Basis Function neural network method, and proposes a whispered-to-normal-speech conversion algorithm that uses an extended bilinear transform function to perform non-linear frequency shifting. The contents of the thesis are as follows.

The performance of whispered-to-normal-speech conversion drops if endpoint detection and Initial/Final (I/F) segmentation are not accurate enough. Based on the non-linear characteristics of speech and the Hilbert-Huang Transform (HHT), this thesis uses the instantaneous energy frequency value of the HHT to detect the endpoints and I/F points of whispered speech, and uses the Empirical Mode Decomposition (EMD) entropy value to detect the endpoints. Both achieve higher detection rates.

Formants in different frequency bands of whispered speech are shifted by different amounts, so this thesis segments the spectrum of whispered speech to make the mapping function more precise. When whispered speech is converted to normal speech, the frequencies in different bands have to be shifted non-linearly. This thesis improves the bilinear transform by introducing an extension factor, which strengthens its non-linear frequency shifting and makes the shift better match the requirements of actual whispered-to-normal-speech conversion. It also reduces the spectral distance between converted speech and normal speech.

Finally, a whispered speech conversion system is designed. Experimental results and analysis show that the system converts whispered speech successfully, with high quality and intelligibility. The thesis closes by identifying the shortcomings of the method and the problems that remain unsolved, and by outlining directions for further study and improvement.
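
The abstract does not give the form of the extended bilinear transform or its extension factor. As a point of reference only, the sketch below shows the standard bilinear (first-order all-pass) frequency warping that such a method would build on. The function names, the warping parameter `alpha`, and the sample rate used in the usage note are illustrative assumptions and are not taken from the thesis.

```python
import math


def bilinear_warp(omega, alpha):
    """Standard bilinear (all-pass) frequency warping.

    omega: normalized angular frequency in radians, in [0, pi].
    alpha: warping parameter with |alpha| < 1 (hypothetical name).
           alpha = 0 leaves frequencies unchanged, alpha > 0 shifts them
           upward, alpha < 0 shifts them downward.
    This is NOT the thesis's extended transform, only the baseline mapping.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy |alpha| < 1")
    # Phase response of a first-order all-pass filter; the denominator
    # 1 - alpha*cos(omega) is always positive when |alpha| < 1.
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warp_hz(freq_hz, alpha, sample_rate_hz):
    """Apply bilinear_warp to a frequency given in Hz."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    return bilinear_warp(omega, alpha) * sample_rate_hz / (2.0 * math.pi)
```

For example, `warp_hz(1000.0, -0.1, 16000.0)` returns a frequency below 1000 Hz, while 0 Hz and the Nyquist frequency stay fixed. The shift is non-linear across the band, which is the property the abstract relies on.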
Keywords/Search Tags: conversion from whispers to normal speech, extended bilinear transform, instantaneous energy frequency value, EMD entropy value, endpoint detection, initial/final segmentation