HMM-based Pronunciation People Switching System

Posted on: 2014-10-14  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Xiong  Full Text: PDF
GTID: 2268330401953927  Subject: Electronics and Communications Engineering
Abstract/Summary:
Voice conversion is a relatively new technology in speech signal processing. Its goal is to modify the personality traits of one speaker's voice so that the speech carries the voice personality traits of another speaker. Pronunciation conversion underlies voice conversion applications such as multimedia entertainment and voice camouflage. At the same time, advances in speech synthesis have significantly improved the sound quality and naturalness of synthesized speech, and users now ask for more: multiple speakers, a variety of speaking styles, multiple emotions, and multilingual synthesis. This thesis studies a pronunciation conversion system built on HMM-based speech synthesis, speech recognition technology, and the voice data of four speakers. The main work is as follows:

(1) It reviews the research background and current state of pronunciation conversion and voice conversion, introduces the training of HMM-based speech synthesis and the synthesis technique built on it, and on this basis arrives at an experimental scheme for pronunciation conversion.

(2) Training the HMM speech synthesizer requires time marks for the initials and finals of each training sentence, together with prosody (rhythm) marks. Synthesizing a speech waveform with the HMM synthesizer requires the initial/final sequence and the prosody marks of the sentence to be synthesized. As the first part of the pronunciation conversion system, speech recognition technology is used to tag the time boundaries of initials and finals automatically. Prosody marking rules are then derived from a statistical analysis of initial and final durations, and a program applies them to tag prosody automatically.

(3) It presents the scheme and procedure for building the pronunciation conversion system on HMM-based speech synthesis, including data preparation, the design of context attributes and the question set, training of the HMM synthesizer, and pronunciation conversion.

Experimental results show that prosody tagged automatically from the initial/final time marks with the proposed rules reaches an acceptance rate that meets the requirements of HMM speech synthesis. In a preliminary subjective MOS evaluation, the pronunciation conversion system scores 4.2 on in-set sentences and 3.9 on out-of-set sentences, so the naturalness of the converted synthesized speech basically reaches an acceptable level.
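The rule-based prosody tagging in item (2) can be illustrated with a minimal sketch. It assumes the rules map the pause between consecutive initial/final segments to a prosodic boundary level. The abstract does not give the actual rules or thresholds, so the `Segment` type, the `PAUSE_THRESHOLDS` values, and the function names below are hypothetical and only show the general shape of such a tagger.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    label: str    # an initial or final (e.g. "h", "ao"), or "sil" for silence
    start: float  # start time in seconds, from the recognizer's alignment
    end: float    # end time in seconds


# Hypothetical (minimum pause in seconds, boundary level) pairs, longest first.
# The thesis derives its rules from duration statistics; these values are
# placeholders, not the author's.
PAUSE_THRESHOLDS = [(0.30, 3), (0.10, 2), (0.03, 1)]


def boundary_level(pause: float) -> int:
    """Map a pause duration to a boundary level (0 means no boundary)."""
    for min_pause, level in PAUSE_THRESHOLDS:
        if pause >= min_pause:
            return level
    return 0


def tag_prosody(segments: List[Segment]) -> List[Tuple[str, int]]:
    """Return (label, boundary level following it) for each speech segment."""
    speech = [s for s in segments if s.label != "sil"]
    tagged = []
    for cur, nxt in zip(speech, speech[1:]):
        # Silence segments were dropped, so the gap to the next speech
        # segment is the pause that follows the current one.
        pause = max(0.0, nxt.start - cur.end)
        tagged.append((cur.label, boundary_level(pause)))
    if speech:
        # Assumption: the end of the utterance always gets the highest level.
        tagged.append((speech[-1].label, 3))
    return tagged
```

Keeping the thresholds in one table means that values estimated from real duration statistics can replace the placeholders without changing the tagging logic.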
Keywords/Search Tags: Hidden Markov Model, pronunciation conversion, prosody (rhythm) marking, speech synthesis