Design And Implementation Of Speech Corpus

Posted on:2013-12-03

Degree:Master

Type:Thesis

Country:China

Candidate:F X Zou

GTID:2248330371988845

Subject:Computer application technology

Abstract/Summary:

One of the ideals of human society is to give machines the human abilities of "listening" and "speaking". With the development of computer science and the information society, this ideal is becoming a reality. Speech recognition technology is concerned with whether a machine can understand what people are saying, and speech synthesis technology gives the machine the ability to speak. Both speech synthesis research and speech recognition research depend to some extent on a high-quality back-end speech corpus. This thesis presents a speech corpus that expands the sources of speech corpus material and improves the efficiency of building speech recognition and synthesis systems.

First, Chapter 1 describes the state of the art in speech synthesis and speech recognition and then introduces the relationship between the speech corpus and these two technologies. Chapter 2 introduces the requirements that speech synthesis and speech recognition place on corpus design and then proposes the corpus design goals. The triphone is chosen as the basic speech unit, and the original data are collected. The chapter also introduces the application of the greedy algorithm to corpus selection. Finally, a greedy algorithm based on high-frequency words and triphones is used to select sentences from the original corpus, yielding the designed corpus text.

Second, in Chapter 3, the speech acquisition system is given a modular design according to actual requirements. Using the selected development tools and platform (.NET technology, a SQL Server 2005 database, and a client/server (C/S) architecture), we give a detailed design for each module in accordance with the design plan and analyze the required data objects, the access programs, and the structure. We also complete the speech corpus design, revising it during development until it is satisfactory, and conduct some recording tests.

Finally, Chapter 4 introduces ToBI, the widely used annotation system for English, and C-ToBI, the annotation system for Mandarin speech. Based on our current situation, we divide the annotation of speech data into segmental annotation (text, pinyin with tone, initials and finals) and prosodic annotation (prosodic boundaries). The final step is to generate the unaligned label files automatically and then align them manually with the wave files.
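The greedy corpus selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual scoring function, so this example assumes the simplest variant: each candidate sentence is already transcribed as a list of phones, and at every step the sentence that adds the most not-yet-covered triphones is chosen. The names `triphones` and `greedy_select`, and the toy phone symbols, are hypothetical and not taken from the thesis; weighting by high-frequency words is omitted.

```python
from typing import Dict, List, Set, Tuple

Triphone = Tuple[str, ...]


def triphones(phones: List[str]) -> Set[Triphone]:
    """Return the set of consecutive phone triples in one sentence."""
    return {tuple(phones[i:i + 3]) for i in range(len(phones) - 2)}


def greedy_select(candidates: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Pick sentence ids greedily by number of newly covered triphones.

    Assumption: the score is the plain count of new triphones. The thesis
    may use a different score (for example, one that also rewards
    high-frequency words).
    """
    units = {sid: triphones(phones) for sid, phones in candidates.items()}
    covered: Set[Triphone] = set()
    selected: List[str] = []

    while len(selected) < max_sentences:
        best_id, best_gain = None, 0
        # Iterate in sorted order so that ties are broken deterministically.
        for sid in sorted(units):
            if sid in selected:
                continue
            gain = len(units[sid] - covered)
            if gain > best_gain:
                best_id, best_gain = sid, gain
        if best_id is None:
            # No remaining sentence adds a new triphone; stop early.
            break
        selected.append(best_id)
        covered |= units[best_id]

    return selected


if __name__ == "__main__":
    # Toy candidates with made-up phone symbols.
    toy = {
        "s1": ["a", "b", "c", "d"],
        "s2": ["a", "b", "c"],
        "s3": ["x", "y", "z"],
    }
    print(greedy_select(toy, max_sentences=3))
```

In the toy data, `s2` is never selected because its only triphone is already covered by `s1`. This is the property that keeps a greedily selected recording script short relative to the original text collection.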

Keywords/Search Tags:

corpus, greedy algorithm, corpus design, speech recording, speech annotation

Related items

1	Auto-constructing Speech Corpus With The Limited Text~2
2	Studies On Techniques For Chinese Speech Simulation System Based On Corpus
3	Research And Implementation Of Uyghur Speech Corpus Management Platform
4	The Establishment And Application Of Uyghur Speech Corpus Based On Online
5	Research On The Construction Method Of Burmese Part-of-speech Tagging Corpus
6	Corpus Supported English Text To Speech Synthesis Engine
7	The Research And Realization Of Corpus Based Speech Synthesis System For Uyghur
8	Research On Active Learning Based Automatic Corpus Annotation
9	Research On Several Key Technologies In Cross-corpus Speech Emotion Recognition
10	Speech Emotion Recognition Based On Deep Separable Convolution And Cross Corpus