Research Of Speech Recognition Method Based On Audio-visual Information Fusion

Posted on:2012-08-03

Degree:Master

Type:Thesis

Country:China

Candidate:J Han

Full Text:PDF

GTID:2308330368478210

Subject:Computer application technology

Abstract/Summary:

In recent years, speech recognition systems based on multi-modal information have become a research focus. Speech recognition that relies on audio information alone performs well in noise-free environments, but its performance degrades sharply in the presence of noise or frequency interference. Multi-modal information can enhance speech perception and comprehension, and visual information in particular can be effective against environmental noise. To improve the accuracy and robustness of speech recognition in noisy environments, this thesis presents a speech recognition method based on audio-visual information fusion. Building on a review of the relevant academic literature and recent developments in speech recognition, and set against the research background of automatic speech recognition in noisy environments, it establishes an audio-visual information decision fusion model. First, the thesis describes the basic principles of information fusion, analyzes and compares the three levels of multi-source information fusion and the key fusion methods, and studies the basic hidden Markov model (HMM) algorithms used in speech recognition together with the architecture of speech recognition systems. Then, after analyzing and comparing the advantages and disadvantages of feature-level and decision-level fusion of audio-visual information, it proposes an audio-visual information decision fusion model based on the hidden Markov statistical model. The model consists of two HMMs that process the audio and visual observation sequences respectively; recognition can be carried out in either the audio mode or the video mode, while the intrinsic dependence between the two is retained. In addition, a weighted fusion strategy is used to address the mismatch between HMM training and test conditions caused by noise. Finally, experimental results show that the audio-visual information decision fusion model outperforms models built on audio-only or visual-only automatic speech recognition, and a comparative analysis against existing anti-noise techniques indicates that the model can overcome noise and improve recognition accuracy.
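
The abstract does not give the fusion formula, so the following is only a minimal sketch of one common form of weighted decision-level fusion, not the thesis's actual method. It assumes each candidate word has already been scored by an audio HMM and a visual HMM, and that the two log-likelihoods are combined linearly with a single stream weight. The function name `fuse_decisions`, the parameter `audio_weight`, and all scores are hypothetical and chosen for illustration.

```python
def fuse_decisions(audio_loglik, visual_loglik, audio_weight):
    """Combine per-word log-likelihoods from two streams and pick the best word.

    audio_loglik / visual_loglik: dicts mapping each candidate word to the
    log-likelihood assigned by that stream's HMM (assumed to be precomputed).
    audio_weight: hypothetical stream weight in [0, 1]; the visual stream
    receives the complementary weight 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_loglik.keys() != visual_loglik.keys():
        raise ValueError("both streams must score the same vocabulary")

    # Linear combination of the two streams' log-likelihoods for each word.
    fused = {
        word: audio_weight * audio_loglik[word]
        + (1.0 - audio_weight) * visual_loglik[word]
        for word in audio_loglik
    }
    # The recognized word is the one with the highest fused score.
    best_word = max(fused, key=fused.get)
    return best_word, fused


# Made-up scores: the audio stream slightly prefers "no",
# the visual stream clearly prefers "yes".
audio_scores = {"yes": -120.0, "no": -118.0}
visual_scores = {"yes": -40.0, "no": -55.0}

# Trusting audio heavily (as one might in clean conditions).
clean_choice, _ = fuse_decisions(audio_scores, visual_scores, audio_weight=0.9)
# Trusting audio less (as one might in noisy conditions).
noisy_choice, _ = fuse_decisions(audio_scores, visual_scores, audio_weight=0.3)
```

In a sketch like this, lowering `audio_weight` as the acoustic conditions worsen shifts the decision toward the visual stream, which is the general idea behind using a weighting strategy to compensate for a training/test mismatch caused by noise. How the weight is chosen in the thesis is not stated in the abstract.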

Keywords/Search Tags:

automatic speech recognition, audio-visual information decision fusion, hidden Markov model, decision fusion

Related items

1	Basic Action Recognition Based On Deep Learning And Hidden Markov Model
2	A multimodal sensor fusion architecture for audio-visual speech recognition
3	Detecting And Processing Visual Information In Speech Synthesis System Driven By Visual-speech
4	Research On Multi-modal Emotional Recognition Based On Audio And Visual
5	Research And Application Of Audio-video Information Fusion Method
6	Research On Speech Emotion Recognition Based On Feature And Decision Fusion
7	Study And Improve On The Mongolian Speech Recognition System
8	Research On Noise Treatment Of Speech Recognition With Lip-movement Information
9	Research On Speech Emotion Recognition Based On Multimodal Information Fusion
10	Study On Cross-modal Speech Recognition Methods With Fusion Lipreading