
Chinese Speech Recognition Based On HMM And ANN

Posted on: 2006-12-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L W Chen
Full Text: PDF
GTID: 1118360155468799
Subject: Signal and Information Processing
Abstract/Summary:
Speech recognition is a valuable and widely applicable technology, and its practicality has created urgent demand for it. It plays an important role in office work, voice queries to business databases, industrial voice control, automatic dialing in telephone and telecom systems, and medical care and hygiene, and it is expected to become the interface to next-generation operating systems and application programs. Speech recognition has already achieved a great deal: techniques such as HMM, VQ, and DTW have been established, and several fairly successful recognition systems have appeared. Problems remain in practical use, however. For example, recognition systems adapt poorly and depend strongly on their environment: a system built under one environment can only be used in that environment, and otherwise its performance drops sharply. There are thousands of languages in the world, each with many dialects, so together with changes in the environment, recognition performance inevitably degrades. This dissertation concentrates on the problems that arise in practical Chinese speech recognition systems. Aiming to improve the recognition rate and the resistance to noise, it investigates the practical theory and key technologies of Chinese speech recognition and verifies the validity of the proposed methods with a large number of experiments and data. The main research work and results are as follows:

1. The basic concepts and principles of speech recognition systems are introduced, and the structure of a general speech recognition system and the theories and techniques it uses are analyzed, including the selection of recognition units, feature parameter extraction, template matching, and model training.
In addition, the history of the field, the current state of research, the ways recognition systems are classified, and the problems currently faced are reviewed.

2. Methods for extracting the main feature parameters used in speech recognition are discussed systematically, with particular analysis of the LPCC and MFCC parameters, which reflect cepstral characteristics. To capture the dynamic behavior of these features, first-order and second-order difference parameters of LPCC and MFCC are also proposed. The LSF parameter and a fast algorithm for computing it are studied as well; the fast algorithm needs less memory space, performs fewer algebraic operations, and is simpler to implement in software.

3. For speech recognition in noisy environments, a method is proposed that combines CDHMM with a SOFM neural network to form a new CDHMM/SOFM hybrid model. Experimental results show that at low SNR the recognition rate of the hybrid model is distinctly higher than that of the traditional CDHMM and CDHMM_N models. A speech classification model based on a fuzzy neural network is then proposed and applied to speech data with different SNRs; this network combines the low-level learning ability of a neural network with the high-level reasoning ability of a fuzzy logic system. Experimental results show that this model achieves a higher recognition rate and stronger noise resistance than an ordinary neural network model. A comparison of the CDHMM/SOFM hybrid model and the fuzzy neural network model shows that, under the same conditions, the latter has the higher recognition rate.

4. Two speaker recognition methods based on neural networks are proposed. The first is based on SOFM-PNN: after SOFM clustering, a PNN performs the probabilistic classification. Experimental results show that this hybrid neural network is a high-performance, high-efficiency speaker recognition system.
The other uses a GAVQ model followed by a genetic neural network (GA-RBF) for speaker recognition. Experimental results show that GAVQ can extract and represent the individual characteristics of a speaker, and that the GA-RBF neural network removes the influence of the initial parameters and achieves a higher correct recognition rate. A comparison of the SOFM-PNN hybrid model and the GA-RBF model shows that, under the same conditions, the former has the higher recognition rate.

5. To address the defects of the traditional homogeneous HMM, and on the basis of a thorough study of it, a new inhomogeneous HMM (MBHMM) is proposed and established, and it is applied to Chinese speech recognition. Theoretical analysis and experimental results show that in isolated word recognition the inhomogeneous HMM (MBHMM) achieves a higher recognition rate than the homogeneous HMM.
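
Item 2 of the abstract mentions first-order and second-order difference parameters computed from LPCC and MFCC features. The abstract does not say which difference formula or window length the dissertation uses, so the following is only a minimal sketch of one common choice: a regression-style difference over a window of 2 frames on each side, with edge frames repeated. The function names `delta` and `add_dynamic_features` and the `window` parameter are hypothetical and chosen for illustration.

```python
def delta(frames, window=2):
    """Regression-style difference of a feature sequence.

    frames: list of per-frame coefficient lists (e.g. MFCC or LPCC vectors).
    window: neighbouring frames used on each side (an assumed value).
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Normalising constant: 2 * (1^2 + 2^2 + ... + window^2).
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(dim):
            acc = 0.0
            for n in range(1, window + 1):
                # Repeat the edge frames so the window stays inside the sequence.
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, window=2):
    """Append first- and second-order differences to each static frame."""
    d1 = delta(frames, window)   # first-order difference
    d2 = delta(d1, window)       # second-order difference (difference of d1)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]
```

With this sketch, a 12-dimensional static vector becomes a 36-dimensional vector per frame. For a feature that rises by exactly one unit per frame, `delta` returns 1.0 at interior frames, which is a quick sanity check on the normalising constant.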
Keywords/Search Tags: Speech recognition, feature extraction, neural network, genetic algorithm, HMM