
Design And Implementation Of Speech Corpus

Posted on: 2013-12-03
Degree: Master
Type: Thesis
Country: China
Candidate: F X Zou
GTID: 2248330371988845
Subject: Computer application technology

Abstract/Summary:
One long-standing goal of human society is to give machines the human abilities of "listening" and "speaking". With the development of computer science and the information society, this goal is becoming a reality. Speech recognition technology enables a machine to understand what people are saying, and speech synthesis technology gives a machine the ability to speak. Both speech synthesis research and speech recognition research depend to some extent on a high-quality back-end speech corpus. This thesis presents a speech corpus that expands the available sources of speech data and improves the efficiency of building speech recognition and synthesis systems.

Chapter 1 describes the state of the art in speech synthesis and speech recognition, and then introduces the relationship between the speech corpus and these two technologies.

Chapter 2 introduces the requirements that speech synthesis and speech recognition place on corpus design, and then proposes the corpus design goals. The triphone is chosen as the basic speech unit, and the original text data are collected. The chapter also introduces the application of the greedy algorithm to corpus selection. Finally, the greedy algorithm (based on high-frequency words and triphones) is applied to the original corpus to obtain the designed corpus text.

In Chapter 3, the speech acquisition system is given a modular design according to actual requirements. Using the selected development tools and platform (.NET technology, a SQL Server 2005 database, and C/S mode), we give a detailed design for each module in accordance with the design plan and analyze the required data objects, the access program, and the structure. We also complete the speech corpus design and revise it repeatedly during development. Some recording tests are conducted at the same time.

Finally, Chapter 4 introduces the widely used English annotation system ToBI and the Mandarin speech annotation system C-ToBI. Based on our current situation, we divide the annotation of speech data into segmental annotation (text, pinyin with tone, initials and finals) and prosodic annotation (prosodic boundaries). The final step is to automatically generate label files without alignment and then manually align each label file with its wave file.
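
The abstract does not spell out the greedy selection procedure, so the following is only a minimal sketch of one common form of greedy corpus selection, under stated assumptions: each candidate sentence is already transcribed into a list of phone symbols (here, pinyin initials and finals), a triphone is any window of three consecutive phones with a hypothetical "sil" padding symbol at both ends, and the score of a sentence is simply the number of triphones it adds that are not yet covered. The names `triphones` and `greedy_select` are illustrative, and the high-frequency word criterion mentioned in the abstract is not modeled here.

```python
from typing import Dict, List, Set, Tuple

Triphone = Tuple[str, str, str]


def triphones(phones: List[str]) -> Set[Triphone]:
    """Return the set of triphones in one sentence.

    Assumption: the sentence is padded with a 'sil' symbol at both ends,
    and a triphone is a sliding window of three consecutive symbols.
    """
    padded = ["sil"] + phones + ["sil"]
    return {
        (padded[i], padded[i + 1], padded[i + 2])
        for i in range(len(padded) - 2)
    }


def greedy_select(candidates: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Greedily pick sentence ids that add the most uncovered triphones.

    Stops when the budget is reached or when no remaining sentence
    contributes a new triphone.
    """
    units = {sid: triphones(phones) for sid, phones in candidates.items()}
    covered: Set[Triphone] = set()
    selected: List[str] = []
    remaining = set(units)

    while remaining and len(selected) < max_sentences:
        # Sorting makes ties deterministic: the smallest id wins.
        best = max(sorted(remaining), key=lambda sid: len(units[sid] - covered))
        if not units[best] - covered:
            break  # nothing new can be gained
        selected.append(best)
        covered |= units[best]
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    # Toy candidate pool; the phone sequences are made up for illustration.
    pool = {
        "s1": ["n", "i", "h", "ao"],
        "s2": ["n", "i"],
        "s3": ["h", "ao"],
        "s4": ["n", "i", "h", "ao"],  # duplicate of s1, adds nothing
    }
    print(greedy_select(pool, max_sentences=10))
```

In this toy pool the duplicate sentence is never chosen because it adds no new triphone once its twin has been selected. A real selection would more likely weight units by frequency rather than count them equally.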
Keywords/Search Tags: corpus, greedy algorithm, corpus design, speech recording, speech annotation