Research On Chinese Lip Reading Algorithm Based On Viseme Phoneme Mapping

Posted on: 2022-07-08, Degree: Master, Type: Thesis
Country: China, Candidate: Y R Chen, Full Text: PDF
GTID: 2518306341453724, Subject: Computer Science and Technology
Abstract/Summary:
Lip reading is the recognition of what a speaker is saying from the speaker's lip movements. Research on lip reading matters to many fields, such as security monitoring and judicial proceedings. With the rapid development of computer technology and the limitations of human lip reading, machine lip reading has received growing attention and has made some progress. However, most current research on machine lip reading recognizes the speech content directly from lip movement information. This one-step approach has to model words, and the number of words is very large. Taking Chinese as an example, there are tens of thousands of Chinese characters, so the corpus of a dataset would have to cover tens of thousands of characters, with enough samples for the model to learn human lip characteristics and recognize the corresponding text. Such a large-scale lip reading dataset is currently lacking. In addition, models built this way are relatively hard to interpret: when a model performs poorly, it is difficult for researchers to locate the problem.

This paper therefore studies lip reading starting from the basic visual unit of speech, the "viseme". Because there are far fewer visemes than words, identifying visemes first reduces the dependence on large-scale lip reading datasets. The viseme is related to the "phoneme", the basic acoustic unit of speech, and the phoneme can serve as an intermediate unit when mapping visemes to text, so the concept of the phoneme is also introduced in this research. This paper proposes a new approach to Mandarin Chinese lip reading based on viseme-phoneme mapping, and designs a modular, locally adjustable lip reading system with strong interpretability.

The research content and innovations of this paper are as follows:
(1) This paper is the first to propose Mandarin Chinese lip reading based on viseme-phoneme mapping, and it designs the corresponding modular lip reading system. Each module of the system is studied in depth: several algorithms are used to design different models for each part of the system, and each is verified experimentally.
(2) After consulting authoritative linguistics monographs, this paper sorts out the definitions and explanations of easily confused terms related to "viseme" and "phoneme" in prior research.
(3) This paper comprehensively analyzes the different viseme-phoneme mappings used in previous studies and derives a new viseme-phoneme mapping for Mandarin Chinese.
(4) To address the lack of datasets for viseme research, this paper designs and produces a dataset suitable for Mandarin Chinese viseme research and for viseme-based Mandarin Chinese lip reading research.
(5) After analyzing and comparing the viseme features used in previous studies, this paper proposes a new set of viseme features and applies them in this research.
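The two-stage idea described above (visemes first, then phonemes, then text) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the viseme class names, the grouping of Mandarin initials and finals into classes, the toy lexicon, and the function names are hypothetical and are not the mapping, dataset, or models of the thesis. The only linguistic fact relied on is that bilabial consonants such as b, p and m look alike on the lips, so one viseme can correspond to several phonemes.

```python
from itertools import product

# Hypothetical viseme classes mapped to pinyin phonemes.
# This is NOT the viseme-phoneme mapping proposed in the thesis.
VISEME_TO_PHONEMES = {
    "V_bilabial": ["b", "p", "m"],   # lips closed: visually similar
    "V_labiodental": ["f"],
    "V_open": ["a"],
    "V_round": ["o", "u"],
}

# Toy lexicon (assumption): phoneme sequences that map to a pinyin syllable.
LEXICON = {
    ("b", "a"): "ba",
    ("m", "a"): "ma",
    ("f", "u"): "fu",
}


def candidate_phoneme_sequences(visemes):
    """Expand a viseme sequence into every phoneme sequence it could represent."""
    options = [VISEME_TO_PHONEMES[v] for v in visemes]
    return [tuple(seq) for seq in product(*options)]


def decode(visemes):
    """Keep only the candidate sequences that the toy lexicon recognizes."""
    return [
        LEXICON[seq]
        for seq in candidate_phoneme_sequences(visemes)
        if seq in LEXICON
    ]


if __name__ == "__main__":
    print(decode(["V_bilabial", "V_open"]))
    print(decode(["V_labiodental", "V_round"]))
```

Because the viseme stage is separate from the phoneme-to-text stage, each table or model could in principle be inspected or replaced on its own, which is the kind of local adjustability the abstract refers to. A real system would use learned models and language context to choose among the remaining candidates.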
Keywords/Search Tags: Lip reading, Viseme, Phoneme, Mandarin