Research Chinese Speech Based On Speech Recognition And Speech Synthesis Conversion

Posted on:2014-02-16

Degree:Master

Type:Thesis

Country:China

Candidate:B He


GTID:2268330401953153

Subject:Signal and Information Processing

Abstract/Summary:

Voice conversion is a relatively new technology in the field of speech signal processing. Its goal is to change one speaker's voice so that it sounds like another speaker's voice. The technology combines a variety of speech signal processing techniques, such as speech signal analysis, speech recognition, speech synthesis and speech enhancement. In this thesis, with the aim of developing a Chinese voice conversion system, we use HMM-based speech recognition and speech synthesis methods to study Chinese voice conversion. In line with the characteristics of Chinese, we choose initials and finals as the basic units for both speech recognition and speech synthesis. A complete voice conversion system consists of three parts: speech recognition, parameter conversion and speech synthesis. The main work of this thesis is as follows:

1. We describe the framework of the voice conversion system and the preparation of the experimental data. This includes selecting a 1000-utterance recording corpus with the coverage of consonants, vowels and syllables taken into account, recording a speech database with four speakers, converting the recording format, proofreading the recordings, running speech recognition on the database, and extracting the timing information of consonants and vowels from the recognition results.

2. We manually proofread and adjust the speech recognition results, produce prosodic labels based on duration statistics of the initials, generate monophone and triphone training label files, design the context attributes and question sets for HMM synthesizer training, and train the HMM synthesizer on the HTS-2.0 platform.

3. Using the above method, we obtain HMM models for two speakers. The label files of the sentences to be converted are passed through both models to generate acoustic parameters, and interpolation between the two parameter sets is used to generate a third, "virtual" speaker.

4. The generated "virtual" parameters are passed through the STRAIGHT vocoder to produce the speech waveform. Both conventionally synthesized sentences and sentences after parameter conversion are evaluated with MOS and ABX tests.

The naturalness of the speech synthesizer and the parameter conversion algorithm are the determinants of conversion quality. The experimental results show that: (1) the synthesizer achieves an average score of 4.2 on the closed set and 3.9 on the open set, so the naturalness of the synthesized speech has basically reached an acceptable level; (2) with voice conversion implemented by acoustic parameter interpolation, the ABX subjective evaluation shows that the system achieves the voice conversion function, that the converted voice can be steered toward either of the two source speakers, and that it can blend the individual characteristics of the two source speakers.
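The interpolation step in item 3 can be illustrated with a minimal sketch. The abstract does not say which acoustic parameters are interpolated or what form the interpolation takes, so everything here is an assumption for illustration: a single linear weight `alpha`, two parameter sequences that are already frame-aligned (same number of frames and same dimension), and the hypothetical function name `interpolate_parameters`. In practice, parameters such as F0 may be blended in a different domain, and durations would also need handling.

```python
from typing import List

Frame = List[float]  # one frame of acoustic parameters (illustrative type)


def interpolate_parameters(
    params_a: List[Frame], params_b: List[Frame], alpha: float
) -> List[Frame]:
    """Linearly blend two frame-aligned acoustic parameter sequences.

    alpha = 0.0 reproduces speaker A's parameters, alpha = 1.0 reproduces
    speaker B's, and values in between give a "virtual" speaker.
    This linear form is an assumption, not the thesis's stated method.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(params_a) != len(params_b):
        raise ValueError("sequences must have the same number of frames")

    virtual = []
    for frame_a, frame_b in zip(params_a, params_b):
        if len(frame_a) != len(frame_b):
            raise ValueError("frames must have the same dimension")
        # Weighted average of the two speakers, dimension by dimension.
        virtual.append(
            [(1.0 - alpha) * a + alpha * b for a, b in zip(frame_a, frame_b)]
        )
    return virtual


# Toy two-frame, two-dimensional parameter sequences (made-up numbers).
speaker_a = [[1.0, 2.0], [3.0, 4.0]]
speaker_b = [[3.0, 6.0], [5.0, 8.0]]

virtual = interpolate_parameters(speaker_a, speaker_b, alpha=0.5)
```

Moving `alpha` toward 0 or 1 corresponds to steering the converted voice toward one of the two source speakers, as described in result (2).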

Keywords/Search Tags:

voice conversion, speech recognition, speech synthesis, HMM, parameter interpolation

Related items

1	Design And Realization Of One-Shot Vehicular Voice-User Interface System
2	Research On Speech Recognition Using Voice Conversion Approach
3	Research On Embedded Speech Synthesis Technology
4	Research And Implementation Of Speech Synthesis Method For Helping Old Robots
5	Research On Detection Algorithm Of Speech Spoofing And Its System Implementation
6	Research Of Personalized Speech Generation
7	The Research Of Personalized Speech Synthesis Based On Generative Adversarial Network
8	Voice Conversion Based On AHOcoder And GMM Model
9	The Application Of HMM In Parameter-Based Text-To-Speech System
10	Speech synthesis algorithms for voice conversion