
People Independent Chinese Speech Recognition Based On HMM And ANN

Posted on: 2013-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: W J Xiao
Full Text: PDF
GTID: 2248330395467789
Subject: Electronic and communication engineering
Abstract/Summary:
With the development of modern computer science, the man-machine interface is no longer limited to the keyboard and mouse. More and more new communication technologies are being applied in the new generation of computers, and progress in digital speech processing and speech recognition has made speech an effective new means of input. Communicating with machines by voice, and having them understand what is said, has long been a human aspiration. Speech recognition is an advanced technology that lets machines translate speech signals into the corresponding text or commands through a process of recognition and understanding.

Speech recognition draws on many scientific fields, including acoustics, linguistics, digital signal processing, computer science and artificial neural networks. The characteristics of speech recognition bring many difficulties to the technology. Computer speech recognition is largely an imitation of human speech recognition, and the current mainstream approach is based on the theory of statistical pattern recognition.

This thesis introduces the basic concepts, common methods and characteristics of isolated-word recognition systems, and analyzes the extraction of LPCC and MFCC features from the speech signal in the time domain and the frequency domain. After analyzing the influence of endpoint detection, it introduces a dynamic window size as a way of improving robustness. It also analyzes the basic theory of the HMM in terms of its three problems (the evaluation problem, the decoding problem and the training problem) and its application to speech recognition.

Speech spectral features are very important parameters in ASR because they reflect the properties of the HAS (human auditory system). LPCC and MFCC are the two most popular features used in speaker identification.
LPC-cepstrum and MFCC-cepstrum have been used successfully in many recognizers. This thesis investigates which components of the LPC-cepstrum and MFCC-cepstrum parameters carry speaker-discriminative information, based on a statistical analysis of a variety of text-independent speech databases from several speakers. The statistical properties of speakers' LPC cepstra are studied from various points of view, such as the Fisher ratio, nearest-neighbor error, superball error and the ratio of between-class to within-class scatter, and the change in these measures is examined under an orthogonal transform or a max-discriminative transform. The conclusions give a better understanding of why the LPC-cepstrum works in speaker recognition, and will be of great value for the effective selection of speaker-discriminative features, a significant reduction of feature dimensionality and improved recognizer performance.

A neural network is a self-learning, adaptive system; its fault tolerance and robustness make it well suited to association, synthesis and generalization. It can therefore process pattern information that is hard to describe in language.

Finally, this thesis implements a small isolated-word speech recognition system that performs feature extraction, training of the speech model parameters and recognition of recorded speech. MFCC is used as the feature parameter and the HMM as the speech model. A BP neural network (BPNN) is introduced for a second recognition stage: the HMM is applied as the front end to process the time sequence of speech and provides the primary recognition information. In the next step, the BPNN is applied as the back end; because of its superior pattern classification and generalization, it maps the primary recognition information non-linearly into the secondary recognition information.
The final recognition result is obtained from the two kinds of recognition information. Experiments show that this robust model noticeably improves the recognition rate in noisy environments. Finally, this thesis studies and implements Uighur speech recognition. Combining HMM and ANN increases the rate of speaker-independent Chinese speech recognition, which indicates that hybrid networks have an advantage in speech recognition.
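
As an illustration of the HMM evaluation problem mentioned in the abstract, the following is a minimal sketch of the forward algorithm used to score an observation sequence against one HMM per word. It is not the thesis's implementation. It assumes discrete observation symbols (for example, MFCC frames that have already been vector-quantized), whereas the thesis may use a different observation model; the names `forward_log_likelihood`, `recognize` and `WORD_MODELS`, and the toy two-word vocabulary with its probabilities, are hypothetical. The BPNN second stage is not shown; the per-word scores returned here are the kind of primary recognition information such a back end could take as input.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm).

    obs   : list of observation symbol indices
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    # Initialisation: alpha_0(i) = start[i] * emit[i][obs[0]], in the log domain.
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * trans[i][j]) * emit[j][obs[t]].
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


def recognize(obs, word_models):
    """Score obs against every word model and return (best word, all scores)."""
    scores = {
        word: forward_log_likelihood(obs, *model)
        for word, model in word_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy vocabulary: two-state left-to-right HMMs over two symbols.
# Each entry is (start, trans, emit); the numbers are made up for illustration.
WORD_MODELS = {
    "yes": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "no": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    best, scores = recognize([0, 0, 1, 1], WORD_MODELS)
    print(best, scores)
```

Working in the log domain avoids numerical underflow on long observation sequences, which is why the sketch sums probabilities with `log_sum_exp` rather than multiplying them directly.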
Keywords/Search Tags: Speech recognition, Hidden Markov models, Neural networks, Speech feature parameter, Hybrid networks