
Study On The Conversion Of Whispered Speech Into Normal Speech By Feature Mapping

Posted on: 2019-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Dou
Full Text: PDF
GTID: 2348330542497627
Subject: Computer application technology
Abstract/Summary:
Whispering is a special mode of human communication. The vocal cords do not vibrate during whispering; the excitation source of whispered speech is noise-like turbulent airflow from the lungs, so whispered speech lacks a fundamental frequency. In addition, the energy of whispered speech is generally 20 dB lower than that of normal voiced speech. Whisper-to-normal speech conversion is a technology widely used in mobile communication, medical equipment, security monitoring, and crime identification. This thesis focuses on selecting phonetic features that are effective and efficient for whisper-to-normal conversion. The main research contents are as follows.

First, a whisper-to-normal conversion method based on Mel-frequency cepstral coefficient (MFCC) features is proposed. In recent years, a growing number of researchers have exploited the statistical distribution of speech signals, using probabilistic models such as the Gaussian mixture model (GMM) to map feature vectors from the source speech to the target speech. However, to our knowledge, there are few reports of whisper-to-normal speech conversion based on GMM and MFCC. Unlike existing whisper-to-normal conversion methods, the proposed method does not estimate the fundamental frequency. To better characterize the sparsity of whispered speech, we propose an L1/2 algorithm for reconstructing normal speech from the estimated MFCC features.

Second, a novel whisper-to-normal conversion method based on low-dimensional feature mapping is proposed. Compared with MFCC features, the speech spectral envelope contains more speech information. However, modeling the relationship between the envelope of whispered speech and that of its normal counterpart is computationally complex. To tackle this problem, we use an autoencoder (AE) to obtain low-dimensional representations of the whispered and normal speech envelopes. Neural networks are then used to model two relationships: one between the low-dimensional features of whispered and normal speech, and one between the fundamental frequency of normal speech and the low-dimensional envelope features of the whisper. In the conversion stage, the estimated low-dimensional spectral envelope features are decoded by the AE to restore the spectral envelope.

The experimental results show that the naturalness and intelligibility of the converted speech improve compared with traditional methods.
Keywords/Search Tags: whisper to normal conversion, spectral envelope, autoencoder, neural networks, MFCC feature
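
The abstract mentions GMM-based mapping of feature vectors from source to target speech but does not give the formulation used in the thesis. As a rough illustration only, the sketch below shows one common formulation from the voice conversion literature: a joint GMM over paired source and target features, with the converted value taken as the conditional mean E[y | x]. Everything here is an assumption made for readability: the features are scalars rather than MFCC vectors, the function names and dictionary keys are hypothetical, and the toy parameters are invented rather than trained.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_map(x, components):
    """Map one whispered feature value x to an estimated normal-speech value.

    Each component is a dict with hypothetical keys:
      w         mixture weight
      mx, my    means of the source (whisper) and target (normal) feature
      sxx, sxy  source variance and source-target covariance
    The estimate is the conditional mean E[y | x] under the joint GMM.
    """
    # Posterior probability of each component given the source feature.
    likelihoods = [c["w"] * gaussian_pdf(x, c["mx"], c["sxx"]) for c in components]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]
    # Posterior-weighted sum of the per-component linear regressions.
    return sum(
        p * (c["my"] + c["sxy"] / c["sxx"] * (x - c["mx"]))
        for p, c in zip(posteriors, components)
    )


# Toy parameters, invented for illustration only (not taken from the thesis).
toy_gmm = [
    {"w": 0.5, "mx": -1.0, "my": 2.0, "sxx": 1.0, "sxy": 0.5},
    {"w": 0.5, "mx": 1.0, "my": 4.0, "sxx": 1.0, "sxy": 0.5},
]
estimate = gmm_map(0.0, toy_gmm)
```

With real MFCC vectors the same idea applies with mean vectors and covariance matrices in place of the scalars, and the joint GMM would first be fitted on time-aligned whisper and normal-speech frames.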