
Lip-reading Based On Hidden Markov Model

Posted on: 2020-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: S M Chen
GTID: 2428330578452097
Subject: Electronics and Communications Engineering
Abstract/Summary:
Rapid advances in technology have accelerated the arrival of new smart devices. Users dissatisfied with traditional modes of interaction have created demand for new human-computer interaction technologies, and lip-reading is an active research topic among them. Lip-reading is valuable in many scenarios, such as improving speech recognition accuracy in high-noise environments, helping people overcome language communication barriers, and ensuring public safety. Traditional lip-reading means inferring the content of speech by observing the movement of the speaker's lips during pronunciation. Computer lip-reading means classifying and recognizing image sequences by building a lip-reading model and analysing lip motion parameters. The basic steps are facial image acquisition, lip localization, feature extraction and lip-reading recognition. As an emerging technology, lip-reading can borrow research methods from other fields, but it still suffers from low accuracy and other limitations. Most lip-reading work therefore remains at the research stage and is difficult to apply in practice. This thesis builds its own Chinese lip-reading database and applies a Hidden Markov Model (HMM) to it. The main work and innovations of this thesis are as follows:

(1) This thesis studies the recognition of single-word pronunciation and builds its own lip-reading database. The database contains 20 Chinese characters, each pronounced 6 times by 10 people, for a total of 1200 videos. Image sequences are extracted from the videos to provide data for the subsequent face extraction and lip feature extraction. The specific parameters and requirements of the database are given in the text.

(2) To address the long training time caused by the large number of Haar-like features and weak classifiers, this thesis proposes a YCbCr threshold to improve the AdaBoost-based face recognition algorithm used in face detection. The algorithm thresholds the Cb and Cr components of the YCbCr color space for yellow skin tones to roughly segment the face region, and then applies the AdaBoost algorithm for detection. The experimental results show that using the YCbCr threshold to improve AdaBoost-based face recognition obtains the face region faster without losing accuracy.

(3) The facial feature points are obtained with a Constrained Local Model (CLM), which gives the mouth position and its feature points. The mouth width, inner lip height and outer lip height are computed from six main feature points. The changes in these features are tracked over the image sequence, and a Hidden Markov Model with discrete outputs (DHMM) is established for lip-reading. Through many experiments and analysis of mouth shapes during pronunciation, the optimal initial DHMM parameters for this database were obtained. The average recognition rate of lip-reading using DHMM on the self-built small database reached 59%, indicating that DHMM can be used for lip-reading recognition of Chinese.

(4) This thesis explores lip-reading from the profile face. Compensated lip features were obtained by stretching the image, and an accuracy of 47.5% was obtained using the same DHMM lip-reading system. This shows that the lip features can still be recognized after stretching, indicating that profile lip-reading is feasible.
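
The DHMM recognition step in contribution (3) can be illustrated with a minimal sketch. It assumes one trained discrete HMM per character and picks the character whose model gives the observed symbol sequence the highest likelihood, computed with the scaled forward algorithm. Everything concrete here is hypothetical and not taken from the thesis: the two toy models, their parameters, the three-symbol codebook (0 = closed, 1 = half-open, 2 = open) and the function names. The thesis does not state how lip features are quantized or how many states its models use.

```python
import math

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n = len(pi)
    # Initialisation with the first observation.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through A, then weight by the emission.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical two-state left-to-right models (assumed values, not from the thesis).
PI = [1.0, 0.0]
A = [[0.6, 0.4],
     [0.0, 1.0]]
MODELS = {
    # Mouth starts open and then closes.
    "open_then_close": (PI, A, [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]]),
    # Mouth starts closed and then opens.
    "close_then_open": (PI, A, [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]),
}

if __name__ == "__main__":
    sequence = [2, 2, 1, 0, 0]  # quantized mouth-opening symbols per frame
    print(classify(sequence, MODELS))
```

In a full system, each frame's mouth width, inner lip height and outer lip height would first be mapped to a discrete symbol, for example by vector quantization, and the model parameters would be estimated from training videos rather than written by hand.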
Keywords/Search Tags: Lip-reading, face recognition, feature extraction, HMM, profile face