Study On The Conversion Of Whispered Speech Into Normal Speech By Feature Mapping

Posted on:2019-03-29

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Dou

GTID:2348330542497627

Subject:Computer application technology

Abstract/Summary:

Whispering is a special mode of human communication in which the vocal cords do not vibrate. The excitation source of whispered speech is noise-like turbulent airflow from the lungs, so whispered speech lacks a fundamental frequency. In addition, the energy of whispered speech is generally about 20 dB lower than that of normal voiced speech. Whisper-to-normal speech conversion is widely used in mobile communication, medical equipment, security monitoring, and crime identification. This thesis focuses on selecting phonetic features that are effective and efficient for whisper-to-normal conversion. The main research contents are as follows.

First, a whisper-to-normal conversion method based on Mel-frequency cepstral coefficient (MFCC) features is proposed. In recent years, a growing number of researchers have exploited the statistical distribution of speech signals, using probabilistic models such as the Gaussian mixture model (GMM) to map feature vectors from the source speech to the target speech. However, to the best of our knowledge, there are few reports on whisper-to-normal speech conversion based on GMM and MFCC. Unlike existing whisper-to-normal conversion methods, the proposed method does not estimate the fundamental frequency. To better characterize the sparse property of whispered speech, we propose an L1/2 algorithm for reconstructing normal speech from the estimated MFCC features.

Second, a novel method for whisper-to-normal speech conversion based on low-dimensional feature mapping is proposed. Compared with MFCC features, the speech spectral envelope contains more speech information. However, directly modeling the envelope relationship between whispered speech and its normal counterpart is computationally expensive. To tackle this problem, we use an auto-encoder (AE) to obtain low-dimensional representations of the whispered and normal speech envelopes. Neural networks are then used to model two relationships: the one between the low-dimensional envelope features of whispered and normal speech, and the one between the fundamental frequency of normal speech and the low-dimensional envelope features of the whisper. In the conversion stage, the estimated low-dimensional spectral envelope features are decoded by the AE to restore the spectral envelope. The experimental results show that the naturalness and intelligibility of the converted speech improve compared with traditional methods.
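
The data flow of the second method (encode the whispered envelope, map it in the low-dimensional space, predict the fundamental frequency, decode to a normal-speech envelope) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a linear auto-encoder fitted by SVD (equivalent to PCA) in place of the AE, least-squares affine maps in place of the neural networks, and synthetic frames in place of real spectral envelopes. All names and dimensions (`fit_linear_autoencoder`, `fit_affine_map`, `convert`, `D`, `K`) are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(frames, k):
    """Fit a linear auto-encoder with a k-dimensional bottleneck (PCA via SVD).

    Assumption: a stand-in for the auto-encoder named in the abstract.
    """
    mean = frames.mean(axis=0)
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    weights = vt[:k].T  # shape (D, k), shared by encoder and decoder

    def encode(x):
        return (x - mean) @ weights

    def decode(z):
        return z @ weights.T + mean

    return encode, decode


def fit_affine_map(codes, targets):
    """Least-squares affine map (stand-in for a neural network regressor)."""
    design = np.hstack([codes, np.ones((len(codes), 1))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)

    def predict(z):
        return np.hstack([z, np.ones((len(z), 1))]) @ coef

    return predict


# Synthetic parallel data (an assumption): both envelopes and F0 are driven
# by the same K hidden factors, so a low-dimensional mapping exists.
rng = np.random.default_rng(0)
N, D, K = 500, 64, 8
hidden = rng.standard_normal((N, K))
whisper_env = hidden @ rng.standard_normal((K, D)) + 0.01 * rng.standard_normal((N, D))
normal_env = hidden @ rng.standard_normal((K, D)) + 0.01 * rng.standard_normal((N, D))
normal_f0 = 120.0 + 10.0 * hidden @ rng.standard_normal((K, 1))

# Training stage: one auto-encoder per speech type, then two mappings
# from the whisper code (to the normal code and to the normal F0).
enc_w, _ = fit_linear_autoencoder(whisper_env, K)
enc_n, dec_n = fit_linear_autoencoder(normal_env, K)
map_env = fit_affine_map(enc_w(whisper_env), enc_n(normal_env))
map_f0 = fit_affine_map(enc_w(whisper_env), normal_f0)


def convert(whisper_frames):
    """Conversion stage: encode, map in the low-dimensional space, decode."""
    z_w = enc_w(whisper_frames)
    env_hat = dec_n(map_env(z_w))  # estimated normal-speech envelope
    f0_hat = map_f0(z_w)           # estimated fundamental frequency
    return env_hat, f0_hat


env_hat, f0_hat = convert(whisper_env)
rel_err = np.linalg.norm(env_hat - normal_env) / np.linalg.norm(normal_env)
print(f"relative envelope error: {rel_err:.4f}")
```

In this sketch both mappings operate on `K`-dimensional codes rather than `D`-dimensional envelopes, which is the computational saving the abstract attributes to the low-dimensional representation. A real system would also need a vocoder to synthesize a waveform from the estimated envelope and fundamental frequency, which is outside the scope of this example.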

Keywords/Search Tags:

whisper-to-normal conversion, spectral envelope, auto-encoder, neural networks, MFCC feature

Related items

1	Research On Whisper To Normal Speech Conversion Based On Deep Neural Networks
2	Research On Whisper To Normal Speech Conversion Based On Convolutional Neural Network
3	Research On Whisper-to-Normal Speech Conversion Based On Generative Adversarial Network
4	Studies On Key Techniques For Voice Conversion
5	Whisper To Speech Conversion And Whisper Recognition Modeling Method
6	Research On Modelling And Conversion Of Segmental Feature
7	Voice Conversion Research Based On Spectral Envelope And Super-segmental Prosody
8	Study On The Neural Network Modelling Method For Voice Conversion
9	High-quality Voice Conversion From Non-parallel Corpora Based On Variational Auto-encoder And Bottleneck Feature
10	Voice Conversion Based On ANN