
Automatic speechreading for improved speech recognition and speaker verification

Posted on: 2003-03-07
Degree: Ph.D
Type: Thesis
University: Georgia Institute of Technology
Candidate: Zhang, Xiaozheng
Full Text: PDF
GTID: 2468390011989682
Subject: Engineering
Abstract/Summary:
This thesis addresses two related problems in an automatic speechreading system: visual speech feature extraction and audio-visual integration. Two applications that exploit speechreading in a joint audio-visual speech signal processing task are developed: audio-visual speech recognition and biometric speaker verification.

A color-based visual feature extraction algorithm is proposed. The algorithm first locates the mouth region reliably, using color and motion information from a color video sequence of the speaker's frontal view. It then segments the lip region within a Markov random field framework and derives a relevant set of visual speech parameters. By extracting an expanded set of visual speech features, this visual front end achieves higher accuracy on a speech recognition task than previous approaches. Experimental results on a speaker verification task also demonstrate that visual speech information is highly effective at reducing error rates relative to acoustic information alone.

A new audio-visual fusion model based on a coupled hidden Markov model (CHMM) is also proposed. The model captures the temporal correlations between the audio and visual streams by allowing asynchrony between the two sources while preserving their temporal coupling. The CHMM is shown to outperform other existing integration models, and the performance benefit of the visual modality is observed both on clean speech and under noisy conditions.
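The color-based mouth localization can be illustrated with a small sketch. This is not the thesis's algorithm (which also uses motion cues and an MRF lip segmentation); it only shows the core color idea: lip pixels tend to have a higher red-dominance "pseudo-hue" r/(r+g) than surrounding skin, so thresholding that ratio and taking the bounding box of the surviving pixels yields a candidate mouth region. The 0.6 threshold and the toy pixel values are assumptions for illustration.

```python
# Illustrative sketch only (not the thesis's exact method): locate a
# mouth-like region by color.  Lips score higher than skin on the
# red-dominance ratio r/(r+g); thresholding this ratio and bounding
# the detected pixels gives a candidate mouth region.

def pseudo_hue(pixel):
    """Red-dominance ratio r/(r+g); lip pixels score higher than skin."""
    r, g, _b = pixel
    return r / (r + g) if (r + g) > 0 else 0.0

def locate_mouth(image, threshold=0.6):
    """Bounding box (top, left, bottom, right) of pixels whose
    pseudo-hue exceeds the threshold, or None if none do."""
    rows = [y for y, row in enumerate(image)
            for pix in row if pseudo_hue(pix) > threshold]
    cols = [x for row in image
            for x, pix in enumerate(row) if pseudo_hue(pix) > threshold]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))

# Toy 6x6 frame: skin pixels with a 2x3 "lip" patch in the middle
# (made-up RGB values for illustration).
skin, lip = (180, 130, 110), (200, 80, 70)
frame = [[skin] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (1, 2, 3):
        frame[y][x] = lip

print(locate_mouth(frame))  # bounding box of the lip patch
```

In a real system this coarse box would only seed the subsequent segmentation step; the color cue alone is sensitive to lighting and skin-tone variation, which is why the thesis combines it with motion information and an MRF framework.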
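The coupling idea behind the CHMM can also be sketched in miniature. In a coupled HMM, each stream's next state is conditioned on the previous states of *both* streams, so the audio and visual chains may drift locally out of sync while remaining temporally coupled. The sketch below runs a forward pass over the joint state space of two 2-state chains; all probabilities are made-up illustration values, not parameters from the thesis.

```python
# Minimal coupled-HMM sketch (illustrative numbers only): two 2-state
# chains, audio (a) and visual (v).  Each chain's transition conditions
# on BOTH chains' previous states -- the defining CHMM coupling.

states = [(a, v) for a in (0, 1) for v in (0, 1)]

# Uniform initial distribution over joint states.
init = {s: 0.25 for s in states}

def p_a(a, prev_a, prev_v):
    """P(a_t | a_{t-1}, v_{t-1}): audio chain, coupled to visual."""
    stay = 0.8 if prev_a == prev_v else 0.6
    return stay if a == prev_a else 1 - stay

def p_v(v, prev_a, prev_v):
    """P(v_t | a_{t-1}, v_{t-1}): visual chain, coupled to audio."""
    stay = 0.7 if prev_a == prev_v else 0.5
    return stay if v == prev_v else 1 - stay

# Discrete per-stream emission probabilities P(obs | state).
emit_a = {0: {'lo': 0.9, 'hi': 0.1}, 1: {'lo': 0.2, 'hi': 0.8}}
emit_v = {0: {'closed': 0.8, 'open': 0.2}, 1: {'closed': 0.3, 'open': 0.7}}

def forward(obs):
    """Likelihood of a sequence of (audio_obs, visual_obs) pairs."""
    oa, ov = obs[0]
    alpha = {(a, v): init[(a, v)] * emit_a[a][oa] * emit_v[v][ov]
             for (a, v) in states}
    for oa, ov in obs[1:]:
        alpha = {(a, v): emit_a[a][oa] * emit_v[v][ov] *
                 sum(alpha[(pa, pv)] * p_a(a, pa, pv) * p_v(v, pa, pv)
                     for (pa, pv) in states)
                 for (a, v) in states}
    return sum(alpha.values())

seq = [('lo', 'closed'), ('lo', 'closed'), ('hi', 'open')]
print(forward(seq))  # joint likelihood of the short audio-visual sequence
```

Contrast this with a product HMM over a single merged state, which forces the streams into lockstep: the cross-conditioned transitions here are what let one modality lead or lag the other by a few frames without losing their statistical dependence.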
Keywords/Search Tags: Speech, Visual, Speaker verification, Feature extraction