Auto-constructing Speech Corpus With The Limited Text~2

Posted on:2011-07-29

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Liu

Full Text:PDF

GTID:2178330332463805

Subject:Computer application technology

Abstract/Summary:


Speech synthesis is one of the more important technologies in human-computer interaction research and has been applied in many areas of social life. As speech synthesis has developed, the construction of speech corpora has become an important research topic that has drawn researchers' attention, and the ability to construct a speech synthesis corpus quickly is of great significance. At present, the mature solution to this problem is the traditional method of corpus construction. In a traditional speech synthesis system, a large amount of text material is collected first, and then text is selected from the collection for recording and labeling. For an English speech synthesis system, the researcher's pronunciation affects the naturalness and expressiveness of the synthesized speech. Because the speakers available in the laboratory are non-native, and the laboratory has neither professional recording equipment nor a quiet recording environment, recordings made there cannot meet the requirements of speech synthesis. Considering that plenty of electronic text with corresponding speech recordings (in MP3 format) is available on the Internet, we can build a new speech synthesis system. Our work is one component of that system: the automatic construction of a speech corpus with limited text. First, texts and their corresponding speech recordings are automatically downloaded from a language teaching website; we also download the texts and recordings of one specific female speaker. We chose the Voice of America website www.51voa.com. Because the recording files contain music, speech and music must be separated by audio classification.
The usual approach to constructing a speech synthesis corpus uses sentences or phrases as units, so the text must be segmented into sentences. After audio classification and sentence segmentation, both of which were carried out by other students, we select the texts that accurately match their recordings as the initial text for constructing the speech corpus. Every text must have a corresponding speech recording, which is what distinguishes this approach from the traditional method; we call it corpus construction with limited text. This thesis introduces and studies the method of corpus construction. Text selection is solved with a greedy algorithm combined with information retrieval methods, and the corpus is built using the open-source speech recognition toolkit HTK and the speech synthesis system Festival. With this method, the best diphone coverage rate achieved is 93.52%, which shows that the method is effective. The main work and innovations are: (1) The work is one part of a new speech synthesis system and embodies a new idea. (2) The information retrieval method is combined with the greedy algorithm to construct the speech corpus. The results show that combining information retrieval with the greedy algorithm for automatic corpus construction can provide substantial coverage of speech units and makes automatic corpus construction achievable.
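
The greedy text selection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes each candidate sentence has already been converted to a phone sequence (the abstract does not say how this is done), it leaves out the information retrieval step, and it measures coverage against the diphones present in the candidate pool rather than a full diphone inventory. The function names, the `budget` parameter, and the toy phone labels are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phones: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phone pairs (diphones) in one sentence."""
    return set(zip(phones, phones[1:]))


def greedy_select(sentences: Dict[str, List[str]],
                  budget: int) -> Tuple[List[str], float]:
    """Pick up to `budget` sentences, each time taking the one that adds
    the most diphones not yet covered.

    Returns the chosen sentence ids and the coverage rate, defined here
    (an assumption) as covered diphones / diphones in the whole pool.
    """
    units = {sid: diphones(p) for sid, p in sentences.items()}

    # Target set: every diphone that occurs anywhere in the candidate pool.
    target: Set[Diphone] = set()
    for u in units.values():
        target |= u

    covered: Set[Diphone] = set()
    chosen: List[str] = []
    remaining = sorted(units)  # sorted ids make tie-breaking deterministic

    while remaining and len(chosen) < budget:
        # Greedy step: the sentence with the largest number of new diphones.
        best = max(remaining, key=lambda sid: len(units[sid] - covered))
        gain = units[best] - covered
        if not gain:
            break  # no remaining sentence adds anything new
        chosen.append(best)
        covered |= gain
        remaining.remove(best)

    rate = len(covered) / len(target) if target else 0.0
    return chosen, rate


# Toy candidate pool with illustrative phone labels.
toy = {
    "s1": ["HH", "AH", "L", "OW"],
    "s2": ["HH", "AH", "L"],
    "s3": ["L", "OW", "K", "AE", "T"],
}
chosen, rate = greedy_select(toy, budget=2)
print(chosen, rate)
```

A greedy pass of this kind does not guarantee the smallest possible sentence set, but it is simple and is the standard heuristic for set-cover style selection problems such as this one.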

Keywords/Search Tags:

speech synthesis, corpus, greedy algorithm, information retrieval


Related items

1	Design And Implementation Of Speech Corpus
2	Studies On Techniques For Chinese Speech Simulation System Based On Corpus
3	Corpus Supported English Text To Speech Synthesis Engine
4	The Research And Realization Of Corpus Based Speech Synthesis System For Uyghur
5	Robust Speech Synthesis Based On Small Amount Of Corpus
6	Research On Statistical Parametric Emotional Speech Synthesis
7	Create An Emotional Speech Synthesis Corpus
8	Indonesian Text Analysis And Processing For Speech Synthesis
9	Recognition Of Prosodic Phrases Based On An Unlabeled Corpus And "Adhesion" Culling Strategy
10	Research On Automatic Construction Of Speech Corpus And Speech Minimized Labeling