| Data-driven speech recognition systems represent the current state of the art and achieve strong performance through extensive training. However, such systems describe the speech signal only in terms of statistical knowledge and cannot effectively integrate speech production knowledge of different types and levels, which makes further performance improvement slow and expensive. Against this background, a system framework that incorporates speech production knowledge (speech recognition based on speech event detection) has been put forward. How to extract speech production knowledge and how to use it become the key problems. This dissertation focuses on methods for detecting distinctive features, which represent speech production knowledge, and on phone recognition based on distinctive features. The work in this dissertation is summarized as follows:
This dissertation proposes a phone boundary detection method based on the spectrum. First, the spectrum is divided into three regions according to its structure during the articulation of phones, and potential boundaries in each region are detected from the Euclidean distance between the energy vectors of adjacent frames. Then, false phone boundaries are removed by a verification step. Finally, the phone boundaries are obtained by fusing the boundaries of the three regions. Experimental results show that the boundary detection algorithm achieves higher accuracy and precision with lower complexity.
The relevant distinctive features are extracted by different methods, namely landmark detection, fricative detection, and phonological attribute detection. Landmarks are obtained by detecting the peaks of the first-order difference energy curve. Because landmark sequences do not contain information about fricatives, a fricative detection method is presented based on the spectral characteristics of fricative articulation. Phonological attributes are detected by recurrent and time-delay neural networks that incorporate long-term information.
In current speech recognition systems based on event detection, asynchrony between distinctive features impairs feature fusion and decoding. To address this problem, a phone recognition method based on distinctive features is proposed. First, frame-based phonological attributes are converted to segment-based features using the phone boundary information. Second, phoneme candidates are obtained by fuzzy searching and matching of the segment features against a feature-to-phoneme mapping table. Finally, insertion errors are removed by verification with landmarks. Compared with data-driven systems, this method has clear advantages in aspects such as recognition speed and extensibility. |
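
The boundary detection steps summarized above can be illustrated with a minimal Python sketch. It is not the implementation from the dissertation: the band edges, the distance threshold, the merge tolerance, the peak-picking rule, and all function names are assumptions made for illustration. The false-boundary removal step is left out, because the abstract does not describe how it works.

```python
import numpy as np


def band_distance(energy, lo, hi):
    """Euclidean distance between the energy vectors of adjacent frames,
    restricted to frequency bins [lo, hi). Returns n_frames - 1 values."""
    band = energy[:, lo:hi]
    return np.linalg.norm(np.diff(band, axis=0), axis=1)


def candidate_boundaries(dist, threshold):
    """Local maxima of the distance curve above `threshold` (assumed rule).
    dist[i] compares frame i with frame i + 1, so the boundary is i + 1."""
    cands = []
    for i in range(len(dist)):
        left = dist[i - 1] if i > 0 else -np.inf
        right = dist[i + 1] if i + 1 < len(dist) else -np.inf
        if dist[i] > threshold and dist[i] >= left and dist[i] > right:
            cands.append(i + 1)
    return cands


def fuse(candidate_lists, tolerance):
    """Pool candidates from all regions; neighbours at most `tolerance`
    frames apart collapse into one boundary (their rounded mean)."""
    merged = sorted(c for cands in candidate_lists for c in cands)
    fused, group = [], []
    for c in merged:
        if group and c - group[-1] > tolerance:
            fused.append(int(round(float(np.mean(group)))))
            group = []
        group.append(c)
    if group:
        fused.append(int(round(float(np.mean(group)))))
    return fused


def detect_boundaries(energy, band_edges, threshold, tolerance=2):
    """Detect candidates per spectral region, then fuse them."""
    per_band = [
        candidate_boundaries(band_distance(energy, lo, hi), threshold)
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]
    return fuse(per_band, tolerance)


# Synthetic example: 30 frames x 12 bins, three regions of 4 bins each.
energy = np.ones((30, 12))
energy[10:, 0:4] = 5.0   # low region changes at frame 10
energy[20:, 8:12] = 5.0  # high region changes at frame 20
boundaries = detect_boundaries(energy, band_edges=[0, 4, 8, 12], threshold=1.0)
print(boundaries)  # [10, 20]
```

Computing the distance separately in each region lets a change confined to one part of the spectrum produce a clear peak instead of being averaged out over the full band. The fusion step then reconciles candidates that different regions place a few frames apart.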