Mandarin Syllable Recognition System Based on the ASAT Framework

Posted on: 2015-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Wu
Full Text: PDF
GTID: 2298330467962189
Subject: Computer technology
Abstract/Summary:
Speech recognition technology is very important for natural human-computer interaction. Although statistical, data-driven speech recognition research has made great progress, the robustness and adaptability of speech recognition systems are still not ideal. A purely data-driven model no longer meets the needs of speech recognition systems, and research that combines data-driven and knowledge-based models has become an important trend.

Automatic speech attribute transcription (ASAT) is a new bottom-up speech recognition framework that takes a detection-based approach to speech recognition, built on attribute detection and knowledge integration. Research on the deep-neural-network-based ASAT framework has made significant progress in English speech recognition. However, there is no related work on Chinese speech recognition.

In this thesis, the ASAT framework is applied to the Mandarin syllable recognition task. Based on the ASAT framework, we employ deep neural networks to construct a Mandarin syllable recognition system. The main research work includes the following aspects:

1. A set of Mandarin speech attributes is designed. Drawing on knowledge from speech and language science, we design Mandarin speech attributes that include phone attributes and initial/final attributes.
2. A bottom-up, detection-based paradigm for Mandarin syllable recognition is proposed. Within the ASAT framework, we design a three-level detector to recognize syllables, consisting, from bottom to top, of the attribute level, the phone/initial/final level, and the syllable level.
3. In comparison experiments, we identify the factors that influence the detectors, including different deep neural network parameters and algorithms. We also analyze the feasibility of an ASAT-based Mandarin syllable recognition system.

Experimental results show that deep neural networks have strong modeling ability. On this task, detectors at all levels achieve good recognition performance. The attribute level reaches an average accuracy of 92%, and the syllable level achieves an accuracy of 86%. However, the cascaded detection leads to error accumulation, which we leave to future work.
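
The bottom-up cascade described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the attribute inventory, the initial/final-to-attribute table, the three-syllable lexicon, and the function names `unit_log_score` and `recognize_syllable` are all hypothetical, and the level-1 attribute posteriors, which the thesis obtains from deep neural network detectors, are simply hard-coded here. The sketch only shows how attribute-level evidence might be merged into initial/final scores and then into a syllable decision, and how a single attribute-level mistake propagates upward.

```python
import math

# Hypothetical toy inventory; the thesis's actual attribute set is not
# given in the abstract.
ATTRIBUTES = ["nasal", "stop", "fricative", "vowel"]

# Hypothetical table: which attributes each initial/final is expected to carry.
UNIT_ATTRIBUTES = {
    "m": {"nasal"},
    "b": {"stop"},
    "s": {"fricative"},
    "a": {"vowel"},
}

# Hypothetical syllable lexicon: syllable -> (initial, final).
SYLLABLES = {"ma": ("m", "a"), "ba": ("b", "a"), "sa": ("s", "a")}


def unit_log_score(attr_posteriors, unit):
    """Level 2: merge attribute posteriors into a log-score for one unit.

    Each attribute contributes log(p) if the unit carries it and
    log(1 - p) otherwise (an assumed, naive independence model).
    """
    eps = 1e-6  # keep probabilities away from 0 and 1 before taking logs
    score = 0.0
    for attr in ATTRIBUTES:
        p = min(max(attr_posteriors[attr], eps), 1.0 - eps)
        score += math.log(p if attr in UNIT_ATTRIBUTES[unit] else 1.0 - p)
    return score


def recognize_syllable(initial_posteriors, final_posteriors):
    """Level 3: pick the syllable whose initial and final score best."""
    def syllable_score(name):
        initial, final = SYLLABLES[name]
        return (unit_log_score(initial_posteriors, initial)
                + unit_log_score(final_posteriors, final))
    return max(SYLLABLES, key=syllable_score)


# Level 1 stand-ins: posteriors a set of attribute detectors might emit.
clean_initial = {"nasal": 0.9, "stop": 0.1, "fricative": 0.05, "vowel": 0.05}
noisy_initial = {"nasal": 0.2, "stop": 0.8, "fricative": 0.05, "vowel": 0.05}
final_segment = {"nasal": 0.1, "stop": 0.05, "fricative": 0.02, "vowel": 0.95}

print(recognize_syllable(clean_initial, final_segment))  # ma
print(recognize_syllable(noisy_initial, final_segment))  # ba
```

In this toy setup, confusing "nasal" with "stop" at the attribute level is enough to change the syllable decision from "ma" to "ba". That is the kind of error accumulation a cascade can suffer, since the upper levels see only the scores passed up from below.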
Keywords/Search Tags:automatic speech attribute transcription, speech attribute, deep neural network, ASAT, syllable recognition