Research Of Mandarin Text-Speech Alignment Based On SailAlign

Posted on:2016-01-04

Degree:Master

Type:Thesis

Country:China

Candidate:H K Gao

Full Text:PDF

GTID:2308330473455445

Subject:Electronic and communication engineering

Abstract/Summary:

Text-speech alignment, based on automatic speech recognition (ASR), is the process of aligning speech and text in time. It is commonly applied in multimedia indexing and in training large-vocabulary speech recognition and synthesis systems. It can also facilitate large-scale research on rich spoken-language resources that have recently become widely accessible, such as audiobooks and multimedia documents. For such material, conventional Viterbi-based forced alignment has proven insufficient, mainly because of mismatches between audio and text and/or noisy audio. To avoid these limitations, a speech recognizer first transcribes the speech, producing a recognition result that includes timing information. This result is aligned with the original text to find the parts they have in common, and those common parts are then used to locate the corresponding speech. In this way the text-speech alignment problem can be posed as a text-text alignment problem, which is normally much less computationally demanding. SailAlign is an open-source software toolkit based on this alignment method. This thesis uses a modified SailAlign algorithm to study Chinese text-speech alignment in the case where the speech contains more content than the text; we conducted experiments, analyzed the results, and finally achieved automatic text-speech alignment. The contributions of this thesis are as follows. First, because SailAlign does not support Chinese, we modified the SailAlign configuration file and added a Chinese language model and acoustic model so that SailAlign can be used for Chinese text-speech alignment.
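The idea of recasting text-speech alignment as text-text alignment can be illustrated with a minimal Python sketch. It is not SailAlign's implementation: it assumes the recognizer output is already available as a list of timed tokens, treats each Chinese character as one token, and uses the standard-library `difflib.SequenceMatcher` as a stand-in for the text-text aligner. The names `TimedToken`, `align_to_reference`, and `min_run` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple, Tuple


class TimedToken(NamedTuple):
    """One recognized token with its start and end time in seconds."""
    text: str
    start: float
    end: float


def align_to_reference(
    reference: List[str],
    hypothesis: List[TimedToken],
    min_run: int = 2,
) -> List[Tuple[int, float, float]]:
    """Transfer timestamps from the ASR hypothesis to the reference text.

    Returns (reference_index, start, end) for every reference token that
    lies inside a run of at least `min_run` consecutive matching tokens.
    Short runs are skipped because they are more likely to be chance matches.
    """
    hyp_text = [tok.text for tok in hypothesis]
    matcher = SequenceMatcher(a=reference, b=hyp_text, autojunk=False)
    anchors = []
    for block in matcher.get_matching_blocks():
        if block.size < min_run:
            continue  # also skips the zero-length sentinel block at the end
        for k in range(block.size):
            tok = hypothesis[block.b + k]
            anchors.append((block.a + k, tok.start, tok.end))
    return anchors


# Illustrative data: the recognizer misreads one character of the reference.
reference = list("今天天气很好我们去公园")
recognized = "今天天气很号我们去公园"
hypothesis = [
    TimedToken(ch, i * 0.5, (i + 1) * 0.5) for i, ch in enumerate(recognized)
]
anchors = align_to_reference(reference, hypothesis)
```

Reference tokens that receive no timestamp (here, the misrecognized character) mark the regions where text and audio disagree; an iterative scheme could revisit only those regions in a later pass.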
The acoustic model and language model are trained on a large amount of news broadcast speech and text. The SailAlign text-speech alignment process is iterative and adaptive. Experiments analyze and compare the alignment accuracy of the SailAlign algorithm when the speech contains more content than the text, and show that the accuracy can be relatively high under this condition. Next, after performing text-speech alignment with SailAlign, we segment the aligned speech and text data to obtain one-to-one speech and text pairs. In addition, to save time and improve efficiency, we automate the entire SailAlign alignment process in a shell script that runs on Linux. The automated process is divided into three modules: preprocessing of the text and speech, SailAlign text-speech alignment, and text extraction with speech segmentation.
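The segmentation step, which turns aligned data into one-to-one speech and text pairs, might look roughly like the following Python sketch. It rests on several assumptions not stated in the abstract: the alignment yields a time span for each matched character of the text, sentences are given as a list of strings whose characters are numbered consecutively, and a sentence is kept only if enough of its characters were matched. The function name `sentence_spans` and the `min_coverage` threshold are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[float, float]


def sentence_spans(
    sentences: List[str],
    token_times: Dict[int, Span],
    min_coverage: float = 0.8,
) -> List[Optional[Span]]:
    """Compute one audio time span per sentence from per-character times.

    `token_times` maps a character's position in the concatenated text to
    its (start, end) time in seconds. A sentence gets a span only when at
    least `min_coverage` of its characters have times; otherwise the entry
    is None, and the sentence is left out of the paired data.
    """
    spans: List[Optional[Span]] = []
    offset = 0
    for sent in sentences:
        timed = [
            token_times[i]
            for i in range(offset, offset + len(sent))
            if i in token_times
        ]
        if sent and len(timed) / len(sent) >= min_coverage:
            # From the start of the first timed character
            # to the end of the last timed character.
            spans.append((timed[0][0], timed[-1][1]))
        else:
            spans.append(None)
        offset += len(sent)
    return spans


# Illustrative data: character 5 was not matched during alignment.
sentences = ["今天天气很好", "我们去公园"]
token_times = {i: (i * 0.5, (i + 1) * 0.5) for i in range(11) if i != 5}
spans = sentence_spans(sentences, token_times)
```

Each non-empty span could then be handed to an audio tool to cut the corresponding clip, giving one clip per sentence of text.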

Keywords/Search Tags:

Text-Speech Alignment, SailAlign, Speech Recognition, Language Model

Related items

1	Research On Automatic Speech-Text Alignment For Mongolian Long Audio
2	Research On Unannotated Long Chinese Speech Text-speech Alignment
3	Text-Speech Alignment Based On General Speech Recognition
4	Research Of Long Speech And Text Alignment
5	Researching Of The Mongolian Language Model Based On Speech Recognition
6	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
7	The Study And Application Of Text-to-Speech System
8	Research On Automatic Construction Of Speech Corpus And Speech Minimized Labeling
9	Application Research On Statistical Language Model Of Large Vocabulary Continuous Speech Recognition System
10	Development Of Dai Language Text-to-speech Conversion System