Non-specific Isolated Word Speech Recognition Technology Research

Posted on: 2015-11-28 | Degree: Master | Type: Thesis
Country: China | Candidate: L Y Xue | Full Text: PDF
GTID: 2298330428967597 | Subject: Computer application technology
Abstract/Summary:
Speech recognition technology has had a profound impact on the way people live and has long attracted the attention of researchers worldwide. At present, dynamic time warping (DTW) and the hidden Markov model (HMM), which is based on probability and statistics, are widely used in speech recognition, while applying artificial neural networks (ANNs) to speech recognition is a comparatively recent approach. The speech signal is the product of a complex nonlinear process. Artificial neural networks, which are grounded in nonlinear theory and offer adaptivity, parallelism, robustness, fault tolerance, and the ability to learn, are therefore becoming a new research direction. This thesis applies the BP (error back-propagation) network, the most widely used type of neural network, to speech recognition. The major tasks and achievements are as follows:

First, the thesis analyses the basic principles of speech recognition in terms of the linear speech production model and the recognition system model. It describes the complete preprocessing chain: acquisition of the original signal, pre-emphasis, framing, windowing, and endpoint detection. It then discusses different methods of extracting speech features, with particular attention to the extraction of Mel Frequency Cepstral Coefficients (MFCC), and proposes a speech feature parameter based on the wavelet transform, the DWTC parameter.

Second, the thesis introduces dynamic time warping and the hidden Markov model, and analyses in detail the three-layer feed-forward error back-propagation network. It derives the standard algorithm, analyses its shortcomings, and, building on previous research, improves it by adjusting the neuron's transfer function. Specifically, a temperature coefficient and a location coefficient are added to the activation function, which lets the network parameters carry more information and speeds up convergence.
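The abstract does not state the exact form of the modified activation function. As a minimal sketch, assuming the common parameterisation f(x) = 1 / (1 + exp(-(x - b) / T)) with temperature coefficient T and location coefficient b, the function and the partial derivatives that back-propagation would need might look as follows. The names `shifted_sigmoid` and `shifted_sigmoid_grads` are hypothetical, and the formula is an assumption rather than the thesis's own definition.

```python
import math


def shifted_sigmoid(x, temperature=1.0, location=0.0):
    """Sigmoid with an assumed temperature (slope) and location (shift) term.

    Assumed form: f(x) = 1 / (1 + exp(-(x - location) / temperature)).
    """
    z = (x - location) / temperature
    # Branch on the sign of z so that exp() never receives a large
    # positive argument, which would raise OverflowError.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def shifted_sigmoid_grads(x, temperature=1.0, location=0.0):
    """Return (df/dx, df/dlocation, df/dtemperature) for the assumed form."""
    y = shifted_sigmoid(x, temperature, location)
    slope = y * (1.0 - y)  # derivative of the plain sigmoid w.r.t. z
    d_x = slope / temperature
    d_location = -slope / temperature
    d_temperature = -slope * (x - location) / temperature ** 2
    return d_x, d_location, d_temperature
```

Under this assumption, a larger temperature flattens the curve and the location coefficient moves its midpoint, so treating both as trainable per-neuron parameters gives gradient descent two extra degrees of freedom beyond the weights.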
The improved BP algorithm also adopts a momentum factor and batch-mode training; a simple function approximation experiment shows it to be effective.

Finally, the author designs a speech recognition simulation system based on a BP neural network in MATLAB and carries out training and recognition using the author's own voice. Because the back-end BP network requires input vectors of the same dimension, the system uses a time warping algorithm to compress and combine the feature values. The experiments lead to the following conclusions: the improved learning algorithm outperforms the traditional BP training algorithm in both recognition rate and convergence speed; the number of hidden-layer neurons in the BP network has a large impact on the system's recognition rate, and the best value must be determined experimentally; and the DWTC parameters based on the wavelet transform characterize the speech signal better than MFCC parameters.
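The abstract does not specify which time warping algorithm compresses the features. The sketch below uses uniform segment averaging as a simple stand-in, to illustrate how an utterance with any number of frames can be reduced to the fixed-length vector a BP network expects. The function name `compress_frames` and the choice of averaging are assumptions, not the thesis's method.

```python
def compress_frames(frames, n_segments):
    """Reduce a variable-length list of feature frames to a fixed-length vector.

    frames: list of per-frame feature vectors, all of the same length.
    n_segments: number of time segments to keep.
    Returns a flat list of n_segments * len(frames[0]) values.
    """
    n_frames = len(frames)
    if n_segments <= 0 or n_frames < n_segments:
        raise ValueError("need at least one frame per segment")
    dim = len(frames[0])
    fixed = []
    for s in range(n_segments):
        # Integer boundaries split the utterance into near-equal segments.
        start = s * n_frames // n_segments
        end = (s + 1) * n_frames // n_segments
        segment = frames[start:end]
        # Average each feature dimension over the frames in this segment.
        for d in range(dim):
            fixed.append(sum(frame[d] for frame in segment) / len(segment))
    return fixed


# Four 2-dimensional frames compressed to 2 segments:
# compress_frames([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]], 2)
# returns [2.0, 3.0, 6.0, 7.0]
```

Whatever the utterance length, the output always has `n_segments` times the per-frame feature count, so it can feed an input layer of fixed size.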
Keywords/Search Tags:Speech Recognition, Wavelet Transform, Feature extraction, ANN, BP