Research on Mandarin Text-Speech Alignment Based on SailAlign

Posted on: 2016-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: H K Gao
Full Text: PDF
GTID: 2308330473455445
Subject: Electronic and communication engineering
Abstract/Summary:
Text-speech alignment, which builds on automatic speech recognition, is the process of aligning speech with its text in time. It is commonly applied in fields such as multimedia indexing and the training of large-vocabulary speech recognition and synthesis systems. It can also facilitate large-scale research on rich spoken-language resources that have recently become widely accessible, such as audiobooks and multimedia documents. For such speech and text, conventional Viterbi-based forced alignment has proven insufficient, mainly because of mismatches between audio and text and/or noisy audio. To avoid these limitations, a speech recognizer first decodes the audio into a recognition result that carries timing information. This result is then aligned with the original text to find the parts they have in common, and those common parts are used to locate the corresponding speech. In this way the text-speech alignment problem can be posed as a text-text alignment problem, which is normally much less computationally demanding to solve. SailAlign is an open-source software toolkit based on this alignment method. This paper uses a modified SailAlign algorithm to study Chinese text-speech alignment when there is more speech than text. We conducted experiments, analyzed the results, and ultimately achieved automatic text-speech alignment. The main contributions of this paper are as follows. First, because the SailAlign algorithm does not support Chinese, we changed the SailAlign configuration file and added a Chinese language model and acoustic model, so that SailAlign can be used to study Chinese text-speech alignment.
The acoustic model and language model were trained on a large amount of news broadcast speech and text. The SailAlign text-speech alignment process is iterative and adaptive. Experiments then analyzed and compared the accuracy of SailAlign text-speech alignment when there is more speech than text, and they demonstrated that the alignment accuracy can be relatively high under this condition. Second, after aligning with SailAlign, we segment the aligned speech and text data to obtain one-to-one pairs of speech and text. To save time and improve efficiency, we automated the whole SailAlign text-speech alignment process in a shell script that runs on Linux. The automated process is divided into three modules: preprocessing of the text and speech, SailAlign text-speech alignment, and text extraction with speech segmentation.
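The text-text alignment idea described in the abstract can be illustrated with a minimal Python sketch. It is not SailAlign's implementation: Python's standard `difflib.SequenceMatcher` stands in for the toolkit's own aligner, the function name `align_hypothesis_to_text` is hypothetical, and the per-character tokens and timestamps are made up for illustration. The sketch matches the recognizer output against the original text and reports, for each matching run, the text span and the time span of the corresponding speech.

```python
from difflib import SequenceMatcher


def align_hypothesis_to_text(reference, hypothesis):
    """Find runs of tokens shared by the text and the recognizer output.

    reference:  list of tokens from the original text (assumed: one
                Chinese character per token).
    hypothesis: list of (token, start_sec, end_sec) tuples from a speech
                recognizer (assumed format, not SailAlign's own).
    Returns a list of (ref_start, ref_end, start_sec, end_sec) anchors,
    where reference[ref_start:ref_end] was matched in the audio.
    """
    hyp_tokens = [token for token, _, _ in hypothesis]
    matcher = SequenceMatcher(a=reference, b=hyp_tokens, autojunk=False)

    anchors = []
    for block in matcher.get_matching_blocks():
        if block.size == 0:
            continue  # skip the zero-length sentinel block at the end
        first = hypothesis[block.b]
        last = hypothesis[block.b + block.size - 1]
        anchors.append((block.a, block.a + block.size, first[1], last[2]))
    return anchors


if __name__ == "__main__":
    # Original text, one character per token.
    text = list("今天天气很好我们去公园")
    # Made-up recognizer output: one extra leading token that is not in
    # the text, and one misrecognized character (恨 instead of 很).
    recognized = [
        ("嗯", 0.0, 0.3), ("今", 0.3, 0.5), ("天", 0.5, 0.7),
        ("天", 0.7, 0.9), ("气", 0.9, 1.1), ("恨", 1.1, 1.3),
        ("好", 1.3, 1.5), ("我", 1.5, 1.7), ("们", 1.7, 1.9),
        ("去", 1.9, 2.1), ("公", 2.1, 2.3), ("园", 2.3, 2.5),
    ]
    for ref_start, ref_end, start, end in align_hypothesis_to_text(text, recognized):
        print("".join(text[ref_start:ref_end]), start, end)
```

The matched runs act as anchors. In an iterative scheme such as the one the abstract describes, the unmatched regions between anchors would be recognized and aligned again, and the anchor time spans could then drive the segmentation of the audio into one-to-one speech and text pairs.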
Keywords/Search Tags: Text-Speech Alignment, SailAlign, Speech Recognition, Language Model