Research on Improving Naturalness in Speech Synthesis

Posted on: 2017-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Ge
GTID: 2308330488982280
Subject: Signal and Information Processing
Abstract/Summary:
Speech synthesis converts text into acoustic information. It is widely used in fields such as car navigation, assistance for the blind, and information inquiry. As one of the core technologies of human-computer interaction, speech synthesis has become a hot topic in natural language processing in recent years. This thesis is based on HMM-based statistical parametric speech synthesis, and its main object of study is the improvement of the acoustic model. The specific contents are as follows:

(1) In the traditional speech synthesis algorithm, a postfilter with fixed parameters is used to alleviate spectral over-smoothing. To overcome the inaccuracy that results from fixed postfilter parameters, an improved speech synthesis method with adaptive postfilter parameters is proposed. The relation between the postfilter parameters and the spectral flatness, which represents the degree of spectral distortion, is fitted. At the synthesis stage, the postfilter parameters are adapted to variations in the spectral flatness obtained from the speech, so as to enhance the formants of the spectrum. Simulation results demonstrate that the method alleviates spectral over-smoothing, and subjective tests show that speech naturalness is improved.

(2) The excitation used in the traditional speech synthesis algorithm is a pulse train during voiced segments and white Gaussian noise during unvoiced segments, which makes the speech sound buzzy. An improved speech synthesis algorithm with a harmonic plus noise excitation model is proposed to enhance speech quality. After inverse filtering, the harmonic signal in the glottal flow is extracted and modeled by LSP coefficients. The LSP coefficients are fed into HMM training as the harmonic feature. At the synthesis stage, the harmonic part and the noise part are reconstructed from the newly generated coefficients and mixed together to form the excitation of the speech signal. Simulation results demonstrate that the excitation generated by this method is more accurate and that speech naturalness is improved.

(3) Speech synthesis algorithms based on the sinusoidal model use the amplitude information as a feature but discard the phase information, because phase is difficult to quantize. A phase representation for the sinusoidal model in speech synthesis is proposed. The RCC is estimated for amplitude modeling and the RPS is estimated for phase modeling. Because the RPS is hard to model statistically in its raw form, it is unwrapped and converted into spectral envelope parameters, which serve as the phase features. The phase information can be conveniently recovered from these phase features at the synthesis stage. Simulation results demonstrate that using the phase features improves the naturalness of the synthesized speech.
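
As an illustration of the spectral flatness measure mentioned in item (1), the following is a minimal sketch using the common definition of spectral flatness: the ratio of the geometric mean to the arithmetic mean of a frame's power spectrum. The abstract does not state which definition the thesis uses, nor the fitted relation between flatness and the postfilter parameters, so the function name `spectral_flatness`, the `eps` floor, and the test signals are assumptions made for this example only.

```python
import numpy as np


def spectral_flatness(frame, eps=1e-12):
    """Ratio of geometric to arithmetic mean of one frame's power spectrum.

    Values near 1 indicate a flat (noise-like or over-smoothed) spectrum;
    values near 0 indicate a peaky spectrum with pronounced structure.
    `eps` is an assumed floor that keeps the logarithm finite.
    """
    power = np.abs(np.fft.rfft(frame)) ** 2 + eps
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return geometric_mean / arithmetic_mean


# Illustrative signals (not from the thesis): an impulse has a perfectly
# flat spectrum, while a pure tone concentrates its energy in one bin.
n = 512
impulse = np.zeros(n)
impulse[0] = 1.0
tone = np.sin(2 * np.pi * 8 * np.arange(n) / n)

flatness_impulse = spectral_flatness(impulse)
flatness_tone = spectral_flatness(tone)
```

In an adaptive scheme of the kind the abstract describes, a per-frame value like this would be the input to the fitted mapping that selects the postfilter parameters; that mapping is not reproduced here because the abstract does not give it.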
Keywords/Search Tags:speech synthesis, naturalness, postfilter, harmonic plus noise model, phase modeling