
Extraction And Application Of Uyghur Phoneme Phonetic Features Based On Image Processing

Posted on: 2017-05-11 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Song | Full Text: PDF
GTID: 2308330503984345 | Subject: Engineering, computer application technology
Abstract/Summary:
Speech recognition has achieved breakthroughs and found a wide range of applications, and new demands keep emerging as the field develops. However, acoustic parameters depend on the natural attributes of the speaker, and they do not describe phonetic features accurately. At the same time, language recognition, speaker recognition, speech visualization, and automatic labeling in speech recognition still need more work.

The Uyghur language has 32 phonemes. The phoneme is the basic unit of speech, and identifying phonemes is an important foundation for continuous speech recognition. This thesis analyzes the spectral characteristics of speech spectrograms, improves and applies image processing algorithms to extract features from voice signals, and uses fuzzy pattern recognition and intelligent computation to measure the similarity of monophones; phonemes are also classified according to their pronunciation. The specific work is as follows:

1) Spectrogram-based features attracted attention many years ago. Compared with previous related work, this thesis explores new ideas and methods for speech recognition: it considers neither the natural attributes of pronunciation nor the commonly used acoustic-phonetic feature parameters, and instead uses observable image features to carry out speech recognition. This is the methodological difference from previous work; the feature extraction process is distinct and offers new ideas for the development of speech recognition.

2) Cepstrum analysis is used to depict Uyghur speech spectrograms and to describe how the spectral features of different pronunciations are reflected in the image.

3) The optimal iterative threshold binarization method is applied to transform phoneme spectrograms, enhancing features and filtering out redundant information in the image. To improve the description given by the binary features extracted from the spectrogram, the boundary feature and the image binary feature are fused using the wavelet transform. To reduce the computational complexity of recognition and classification, the wavelet transform is also used to reduce the dimension of the feature matrix, and the low-dimensional feature matrix is preprocessed into a feature vector.

4) Mathematical morphology is used to analyze the shape features of the feature matrix of each Uyghur phoneme. An image dilation operation determines the coverage of the core feature points, which serves as the probability distribution function, and fuzzy pattern recognition based on fuzzy theory computes the degree of similarity between phoneme probability models. In the experiments, the correct recognition rate for monophones improved to 77.5%. For phoneme recognition in continuous speech, a phoneme segmentation process is introduced; for sentences containing 20-30 phonemes, the recognition speed is about 50 phonemes/min, the loss rate is about 5%, and the correct recognition rate reached 62%.

5) A BP neural network is constructed to categorize Uyghur phonemes by their pronunciation characteristics. The experimental results are presented as a confusion matrix, and the classification accuracy is about 70%.

6) A graphical user interface is built and packaged as an executable desktop application for easy operation, covering phoneme recognition and classification. This improves the operability of the system. The development environment is configured to make full use of computer hardware resources and improve system efficiency.
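
The binarization step in item 3 can be illustrated with a minimal sketch. The abstract does not give the thesis's exact algorithm, so the code below assumes the classic iterative (isodata-style) threshold selection: start at the global mean, split the pixels into two classes, and move the threshold to the midpoint of the two class means until it stops changing. The function names, the tolerance, and the toy grey-level data are hypothetical and chosen only for illustration.

```python
def iterative_threshold(values, tol=0.5, max_iter=100):
    """Iterative (isodata-style) threshold selection.

    Assumed variant, not necessarily the thesis's: start at the global
    mean, split values into two classes, and move the threshold to the
    midpoint of the class means until the change is below `tol`.
    """
    values = list(values)
    if not values:
        raise ValueError("values must be non-empty")
    t = sum(values) / len(values)
    for _ in range(max_iter):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # all values fall on one side; nothing to separate
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


def binarize(image, threshold):
    """Map each pixel to 1 if it exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Toy 4x4 "spectrogram" of grey levels (hypothetical data):
# dark background on the left, a bright energy band on the right.
spectrogram = [
    [10, 12, 200, 210],
    [11, 13, 205, 220],
    [9, 14, 198, 215],
    [10, 12, 202, 208],
]
threshold = iterative_threshold(p for row in spectrogram for p in row)
binary = binarize(spectrogram, threshold)
```

On this toy input the threshold settles between the dark and bright groups, so the binary image keeps only the bright band. A real spectrogram would need the grey levels computed from a short-time Fourier transform first, which is outside the scope of this sketch.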
Keywords/Search Tags: Uyghur, phonemes, spectrogram, image processing, fuzzy pattern recognition, BP neural network