Research On The Key Problems Of Speech Recognition Technology

Posted on: 2015-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Lv
Full Text: PDF
GTID: 2208330434451412
Subject: Computer software and theory
Abstract/Summary:
The increasing globalization of world trade has widened the circle of people with whom individuals communicate, while language barriers remain a stumbling block to this trend. Fortunately, great advances in computing and internet-related technologies have helped bridge the gap between the rapid pace of globalization and people's communication needs. As a result, there is growing interest in how to make complicated forms of communication easier.

Speech recognition works on the speech signal and aims to enable interaction between people and machines. It encompasses pattern recognition and involves physiology, psychology, linguistics, computer science, signal processing, and many other fields. In recent decades, speech recognition has become an important bridge between humans and machines. Although the popularity, quality, and efficiency of speech recognition technology have improved greatly, 100% recognition accuracy still seems unattainable because of limiting factors such as physical factors, the environment, and the recognition algorithms themselves.

This paper analyzes the internal and external factors that affect speech recognition and proposes solutions to improve its robustness. The first part analyzes the speaker-related factors that affect recognition accuracy during speech training, such as regional features, gender, the physiological characteristics of speakers, and differences in emotional expression.

The second part examines the influence of the external environment on speech signal samples. Uncertain factors can disturb the collection of signal samples, which may directly lead to incorrect results in speech signal training and recognition.

The third part discusses the advantages and disadvantages of currently popular algorithms and methods. In the preprocessing step, these include pre-emphasis, windowing, short-time average energy, the short-time average magnitude function, and the short-time zero-crossing rate. The feature-parameter methods include Linear Prediction Coefficients (LPC), Linear Prediction Cepstrum Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), and so on. Pattern matching is also an important part of speech recognition technology; the methods covered include Dynamic Time Warping (DTW), Hidden Markov Models (HMM), and Vector Quantization (VQ).

To sum up, this paper discusses effective ways to handle speech signals in noisy environments, using MFCC as the feature parameter and VQ and HMM as the recognition models.
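
As a rough illustration of the preprocessing steps named in the abstract (pre-emphasis, windowing, short-time average energy, short-time average magnitude, and short-time zero-crossing counting), the following Python sketch computes these quantities frame by frame. It is not taken from the thesis: the function names, the pre-emphasis coefficient of 0.97, the Hamming window, the per-frame averaging, and the frame and hop sizes are all assumptions chosen for illustration.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame_len):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]


def short_time_energy(frame):
    """Mean of squared samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def short_time_magnitude(frame):
    """Mean of absolute sample values in one frame."""
    return sum(abs(x) for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def analyze(signal, frame_len=200, hop=80, alpha=0.97):
    """Return one (energy, magnitude, zero-crossing rate) tuple per frame."""
    emphasized = pre_emphasis(signal, alpha)
    window = hamming(frame_len)
    features = []
    for frame in frame_signal(emphasized, frame_len, hop):
        windowed = [x * w for x, w in zip(frame, window)]
        features.append((
            short_time_energy(windowed),
            short_time_magnitude(windowed),
            zero_crossing_rate(frame),  # the window is positive, so signs are unchanged
        ))
    return features


if __name__ == "__main__":
    # Synthetic test input: a 440 Hz tone sampled at 8 kHz (hypothetical values).
    sample_rate = 8000
    tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
    for energy, magnitude, zcr in analyze(tone)[:3]:
        print(f"energy={energy:.4f} magnitude={magnitude:.4f} zcr={zcr:.4f}")
```

In a sketch like this, frames with low energy and a high zero-crossing rate are commonly treated as candidates for silence or unvoiced speech, which is one reason these measures appear together in the preprocessing step.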
Keywords/Search Tags: Speech Recognition, Signal Acquisition, Feature Extraction, Pattern Recognition, Accuracy