
Research On The Normalization Of Spoken Language In Speech-to-speech Translation

Posted on: 2016-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Wu
GTID: 2308330479490084
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the global economy, the demand for international information keeps growing. However, language has become a bottleneck for information exchange between people with different mother tongues. Speech-to-speech translation has emerged in response and attracts many researchers from different fields. A speech-to-speech translation system consists of three parts: automatic speech recognition (ASR), machine translation (MT), and text-to-speech synthesis (TTS). The ASR outputs are the bond that links the three parts into a whole system. However, ASR outputs often contain various disfluencies, which must be detected and removed before downstream processing.

In this paper, we propose an efficient disfluency detection approach based on right-to-left transition-based parsing, which identifies disfluencies while keeping the ASR outputs grammatical. Our method takes a global view to capture long-range dependencies for disfluency detection, integrating a rich set of syntactic and disfluency features while retaining linear complexity. We also study several other methods that tackle disfluencies from different views. In particular, we introduce a set of dependency structure features into the CRF model, which significantly improves its disfluency detection performance. The experimental results show that our proposed method outperforms state-of-the-art work and achieves an 85.1% F-score on the commonly used English Switchboard test set. We also apply our method to in-house annotated Chinese data and achieve a significantly higher F-score than the CRF-based baseline.
Keywords/Search Tags: speech-to-speech translation, disfluency detection, dependency parsing, conditional random field, max-margin Markov network
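
As an illustration of the task described in the abstract, the following minimal sketch shows what a disfluency detector's output is used for and how a token-level F-score can be computed. It is not the thesis's parser or CRF model: the function names, the 0/1 label encoding (1 = disfluent), and the example sentence are all assumptions made for illustration, and the labels are written by hand rather than predicted by a model.

```python
from typing import List, Tuple


def remove_disfluencies(tokens: List[str], labels: List[int]) -> List[str]:
    """Keep tokens labelled fluent (0) and drop tokens labelled disfluent (1).

    The 0/1 encoding is an assumption of this sketch.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [tok for tok, lab in zip(tokens, labels) if lab == 0]


def disfluency_f_score(gold: List[int], pred: List[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) computed over disfluent tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical ASR output containing a repair: "to boston uh i mean" is disfluent.
tokens = "i want a flight to boston uh i mean to denver".split()
gold = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
# Hand-written "prediction" that misses the first disfluent token ("to").
pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

cleaned = remove_disfluencies(tokens, gold)
precision, recall, f_score = disfluency_f_score(gold, pred)
print(" ".join(cleaned))
print(f"P={precision:.3f} R={recall:.3f} F={f_score:.3f}")
```

With the gold labels, the cleaned sentence is what would be passed on to machine translation; the F-score here is computed only over tokens marked disfluent, which is one common way to score this task and is assumed rather than taken from the thesis.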