Research And Application Of Audio-video Information Fusion Method

Posted on: 2015-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: C C Song
GTID: 2298330452494287
Subject: Computer application technology
Abstract/Summary:
With the continuous development of the information society, friendlier, more natural, and more intelligent human-computer interaction has gradually become a sought-after goal. As an important research focus of human-computer interaction, speech recognition technology has gradually penetrated every aspect of people's lives. However, traditional audio-only, single-channel technology can no longer meet complex everyday needs, and a new approach that combines visual information with auditory information is attracting increasing attention.

A high-performance speech recognition system depends mainly on audio and video feature extraction and on the fusion model. Based on an analysis of human auditory characteristics and real-time requirements, the thesis extracts audiovisual features: the audio features are MFCC parameters, which are representative in the field of speech recognition, and the video features are lip contour features, which characterize speech information effectively. For the audiovisual fusion model, the existing two-process coupled hidden Markov model is improved and extended into a three-process coupled hidden Markov model consisting of initialization, re-estimation, and recognition. In the recognition process, an adaptive weight selection method determines the optimal weights of the two channels under different signal-to-noise ratios (SNRs), and the audiovisual features and the model are then applied to speech recognition.

Using the above method, the thesis achieves a higher recognition rate in tests on its own dedicated database and on the database of the Advanced Multimedia Processing (AMP) Lab of Cornell University. Experimental results show that dual-channel speech recognition significantly improves recognition performance compared with single-channel recognition. In complex environments in particular, the complementary information in the audiovisual features effectively compensates for noise interference on the single channel. The adaptive-weight coupled hidden Markov model has good applicability and some theoretical and practical value.
Keywords/Search Tags: visual feature, auditory feature, hidden Markov model, coupled hidden Markov model, Adaptive
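
The adaptive weight selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's coupled hidden Markov model: it assumes that per-class log-likelihoods have already been computed separately for the audio and video streams, combines them with weights w and 1 - w, and grid-searches the audio weight w for each SNR condition on labelled held-out samples. All function names, the grid resolution, and the toy scores below are hypothetical and chosen only for illustration.

```python
def fuse_scores(audio_ll, video_ll, audio_weight):
    """Combine per-class log-likelihoods with stream weights w and 1 - w."""
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio_ll, video_ll)]


def classify(audio_ll, video_ll, audio_weight):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(audio_ll, video_ll, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)


def select_weight(samples, steps=10):
    """Grid-search the audio weight in [0, 1] that maximizes accuracy.

    samples: list of (audio_ll, video_ll, true_label) tuples.
    On ties, the smallest weight reaching the best accuracy is kept.
    """
    best_weight, best_correct = 0.0, -1
    for i in range(steps + 1):
        w = i / steps
        correct = sum(classify(a, v, w) == y for a, v, y in samples)
        if correct > best_correct:
            best_weight, best_correct = w, correct
    return best_weight


def select_weights_per_snr(dev_sets):
    """Pick one audio weight per SNR condition (dict: snr_db -> samples)."""
    return {snr: select_weight(samples) for snr, samples in dev_sets.items()}


# Toy, made-up scores for a two-class task (not data from the thesis).
dev_sets = {
    30: [  # clean audio: the audio stream is reliable, video errs once
        ([-1.0, -5.0], [-2.0, -3.0], 0),
        ([-3.0, -1.0], [-1.0, -5.0], 1),
    ],
    0: [   # noisy audio: the audio stream is misleading, video is reliable
        ([-6.0, -2.0], [-1.0, -4.0], 0),
        ([-2.0, -6.0], [-4.0, -1.0], 1),
    ],
}

weights = select_weights_per_snr(dev_sets)
print(weights)
```

On this toy data the search assigns a larger audio weight to the clean condition than to the noisy one, which is the behaviour the abstract attributes to adaptive weighting. In a coupled hidden Markov model the stream weights would typically be applied inside the model's observation likelihoods rather than to final per-class scores, so this sketch only conveys the selection idea.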