
Research On Microphone Adaptation Algorithm For Robust Speech Synthesis

Posted on: 2014-05-12
Degree: Master
Type: Thesis
Country: China
Candidate: N Li
GTID: 2268330401984054
Subject: Signal and Information Processing
Abstract/Summary:
At present, unit selection with waveform concatenation, based on a large-scale corpus, is the most popular method in speech synthesis. However, unless a high-quality speech corpus is used, speech synthesized by this method cannot reach an ideal quality. To improve the synthesized speech, the corpus must cover more features, and the speech data must come from a highly controlled environment. The corpus therefore has to be enlarged, and the cost of constructing it rises because of the strict requirements on the recording environment.

With the rapid development of the internet, data resources have become easier to obtain, such as domestic and foreign news broadcasts whose corresponding text is also very accurate. If these resources could be used to construct a speech corpus automatically, the cost would be greatly reduced. However, most of these easily found voice resources are not very clean and may contain many interfering factors: the recording conditions are inconsistent and variable, and the recording backgrounds are not perfectly quiet. In particular, the speech quality is inevitably affected by the microphone type and its placement. If the unit selection method is still adopted, many problems arise because some essential units needed for waveform concatenation may not exist. This thesis therefore investigates robust synthesis methods. Drawing on the noise-robustness techniques currently popular in speech recognition, we systematically analyzed the noise problem and proposed that a microphone adaptation algorithm can enhance speech quality efficiently. The main work and results are as follows:

1) We analyzed how the noise arises, classified the noise types present in the internet speech corpus, and proposed that noise caused by the microphone can be treated by feature normalization in the cepstral domain. We found that the statistical parametric method based on HMM (Hidden Markov Model) is more robust, and discussed in detail why this synthesis system is better than the unit selection method. We evaluated the performance of the synthesis system with the MOS (mean opinion score) method and analyzed the effect of noise on the system using MCD (mel-cepstral distortion) results.

2) We applied HRTF processing to clean voice data to simulate the microphone effect present in voices collected from the internet. Because this method keeps the length and content of the clean corpus and the noisy corpus completely consistent, the speech synthesized from the two corpora can be compared to analyze the effect of noise on the voice. Two noisy corpora were constructed, differing from each other in the microphone effect parameters. We found that increasing the microphone factors decreases the naturalness of the voice.

3) We proposed that a microphone adaptation algorithm can remedy the reduced quality of speech synthesized from a noisy corpus. Our experiments confirmed that the MOS and MCD results for voices synthesized by the HTS system are basically consistent. After treatment with the adaptation algorithm, the naturalness and intelligibility of speech synthesized from noisy data are clearly enhanced. Meanwhile, the MVN (mean and variance normalization) method performs better than the CMN (cepstral mean normalization) and RASTA methods in treating noise problems.
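The cepstral-domain normalization and the MCD measure named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which this abstract does not give. It assumes a time-invariant microphone channel, which shows up as a constant additive offset on the cepstral coefficients, and it uses random numbers in place of real mel-cepstra. The function names `cmn`, `mvn` and `mcd` and all array shapes are hypothetical; RASTA filtering is not shown.

```python
import numpy as np

def cmn(cep):
    """Cepstral mean normalization: subtract each coefficient's per-utterance mean."""
    return cep - cep.mean(axis=0, keepdims=True)

def mvn(cep, eps=1e-8):
    """Mean and variance normalization: zero mean, unit variance per coefficient."""
    mu = cep.mean(axis=0, keepdims=True)
    sigma = cep.std(axis=0, keepdims=True)
    return (cep - mu) / (sigma + eps)  # eps guards against a constant coefficient

def mcd(ref, test):
    """Mel-cepstral distortion in dB, averaged over time-aligned frames."""
    diff = ref - test
    per_frame = np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return (10.0 / np.log(10.0)) * per_frame.mean()

# Synthetic stand-in data (assumption): 200 frames x 13 cepstral coefficients.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# Assumed microphone effect: one fixed offset per coefficient, shared by all frames.
channel = rng.normal(scale=0.5, size=(1, 13))
noisy = clean + channel

mcd_raw = mcd(clean, noisy)            # distortion introduced by the channel
mcd_cmn = mcd(cmn(clean), cmn(noisy))  # the constant offset cancels after CMN
noisy_mvn = mvn(noisy)                 # each coefficient rescaled to unit variance

print(f"MCD before normalization: {mcd_raw:.3f} dB")
print(f"MCD after CMN:            {mcd_cmn:.3e} dB")
```

Under this idealized offset model CMN removes the channel entirely. MVN additionally equalizes the variance of each coefficient, which matters when the distortion also changes the dynamic range of the features. The sketch does not reproduce the thesis's finding that MVN outperforms CMN and RASTA.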
Keywords: robust speech synthesis, noisy speech corpus, microphone factors, microphone adaptation algorithm