
Research Of Speech Recognition Method Based On Audio-visual Information Fusion

Posted on: 2012-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Han
Full Text: PDF
GTID: 2308330368478210
Subject: Computer application technology
Abstract/Summary:
In recent years, speech recognition systems based on multi-modal information have become a research focus. Speech recognition based on single-modality audio information performs well in noise-free environments; however, in the presence of noise or frequency interference, its recognition performance is greatly reduced. Multi-modal information can enhance speech perception and comprehension, and visual information can be effective against environmental noise. To improve the accuracy and robustness of speech recognition in noisy environments, this thesis presents a speech recognition method based on audio-visual information fusion. Drawing on a survey of the academic literature and recent developments in speech recognition, and set against the research background of automatic speech recognition in noisy environments, it establishes an audio-visual information decision fusion model. First, the thesis describes the basic principles of information fusion, analyzes and compares the three levels of multi-source information fusion and the key information fusion methods, and studies the basic algorithms of the hidden Markov model (HMM) in speech recognition as well as the system architecture of speech recognition. Then, after analyzing and comparing the advantages and disadvantages of feature-level fusion and decision-level fusion in audio-visual information fusion, it proposes an audio-visual information decision fusion model based on the hidden Markov statistical model. The model consists of two HMMs that process the audio and visual observation sequences respectively; recognition can be performed in either the audio mode or the video mode alone, while the intrinsic dependence between the two streams is retained. In addition, a weighted fusion strategy is used to address the mismatch between HMM training and test conditions caused by noise. Finally, experimental results show that the audio-visual information decision fusion model performs better than models built on audio-only or visual-only automatic speech recognition, and a comparative analysis with existing anti-noise techniques shows that the model can overcome noise and improve recognition accuracy.
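
The weighted decision fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's formulas, so everything below is an assumption made for illustration: discrete-observation HMMs, a linear combination of the two streams' log-likelihoods, and the names `forward_log_likelihood`, `fuse_and_decide`, and `audio_weight`. The word scores in the usage lines are made-up numbers, not results from the thesis.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, log_pi, log_A, log_B):
    """Return log P(obs | HMM) using the forward algorithm in the log domain.

    Assumes a discrete-observation HMM (an illustrative simplification):
    log_pi[s]    log initial probability of state s
    log_A[p][s]  log transition probability from state p to state s
    log_B[s][o]  log probability of emitting symbol o in state s
    """
    n = len(log_pi)
    alpha = [log_pi[s] + log_B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + log_A[p][s] for p in range(n)]) + log_B[s][o]
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def fuse_and_decide(audio_ll, video_ll, audio_weight):
    """Combine per-word log-likelihoods from the two streams and pick a word.

    audio_weight in [0, 1] is a hypothetical reliability weight: values near 1
    trust the audio HMM, values near 0 trust the visual HMM.
    """
    fused = {
        word: audio_weight * audio_ll[word] + (1.0 - audio_weight) * video_ll[word]
        for word in audio_ll
    }
    best_word = max(fused, key=fused.get)
    return best_word, fused


# Made-up per-word log-likelihoods from an audio HMM and a visual HMM.
audio_ll = {"yes": -42.0, "no": -40.0}
video_ll = {"yes": -30.0, "no": -36.0}

clean_choice, _ = fuse_and_decide(audio_ll, video_ll, audio_weight=0.9)
noisy_choice, _ = fuse_and_decide(audio_ll, video_ll, audio_weight=0.3)
print(clean_choice, noisy_choice)
```

In this sketch, lowering `audio_weight` when the audio is noisy shifts the decision toward the visual stream. How the weight is actually chosen in the thesis is not stated in the abstract.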
Keywords/Search Tags:automatic speech recognition, audio-visual information decision fusion, hidden Markov model, decision fusion