Study On Lip-reading Recognition

Posted on: 2013-02-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z L Zhang
GTID: 1228330395959630
Subject: Computer software and theory
Abstract/Summary:
As an important component of future human-computer interfaces (HCI), automatic speech recognition (ASR) is designed to achieve identity recognition and natural language comprehension by means of the human voice. Speech recognition technology has made significant progress and has seen some successful applications. IBM's ViaVoice system, for instance, performs well when the vocabulary is small and the noise level is low, but its performance degrades greatly in real application environments. Future human-computer interaction settings, such as in a car, at an airport, or in live interviews, will place higher demands on robustness, so new approaches need to be explored. Widely reported to be highly effective, the combination of visual features of lip motion with acoustic features can raise the recognition rate of an automatic speech recognition system and make it more robust and more adaptable to real environments. This dissertation aims at improving the effectiveness of lip feature extraction and the recognition rate. The main work and contributions are as follows:

(1) We present a lip feature extraction algorithm based on MPEG-4 parameters. The choice of lip features is crucial to lip-reading recognition. We chose 24 feature parameters in MPEG-4 that are closely associated with lip-reading and described the lip characteristics with these parameters. To separate the lip area from the other facial areas, we described the colors of the lip area with 6 GMM (Gaussian mixture model). To describe the lip shapes and track the lip contours more accurately, we created a new search energy function based on the 6 GMM and the lip contour information, and used it in the deformation of the template. We obtained the GMM parameters of the lip area and the other facial areas by the maximum likelihood algorithm.
We can effectively distinguish between the lip area and the other facial areas, and also find the contour distribution of the ROI (region of interest). To remove the effect of the overall movement of the face on the tracking of the lip area, we used four feature points on the face to correct the facial pose by estimating the facial movement. Finally, we obtained the FAP (facial animation parameter) values based on the facial feature points.

(2) We present a lip classification method based on Fourier descriptors. After obtaining the location and size of the lips with the AdaBoost algorithm, we first locate the lip edge by edge detection. Second, we extract the important characteristic values of the lip shape with Fourier descriptors. Finally, we feed the normalized Fourier descriptors into an artificial neural network for classification. The experiments show that the lip classification accuracy can reach 85% using Fourier descriptors.

(3) We present a lip-reading recognition method and establish a lip-reading recognition system based on HMM (hidden Markov model). In recent years, HMM has gradually been applied to lip-reading recognition because of its advantages. Due to the limitations of the traditional HMM, however, the lip-reading recognition rate is not good enough. Our study found that the main reason is that the state transition and observation Markov assumptions of the traditional HMM restrict lip-reading recognition applications. In this dissertation, we present a method to improve the transitions and the values of the traditional HMM. We derived the learning algorithm of the new model from that of the traditional HMM. We also established a lip-reading recognition system based on the new algorithm.
This system used the AdaBoost algorithm to extract the lip and face parameters, the PCA and LDA algorithms to reduce the dimension of the lip features, the VQ algorithm to handle the lip feature eigenvectors, and the IHMM algorithm to recognize the lip-reading. The final experimental results show that the recognition rate of the improved HMM is better than that of the traditional HMM.
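The normalized Fourier descriptors mentioned in item (2) can be illustrated with a minimal Python sketch. The abstract does not state which normalization the dissertation uses, so the common textbook choice is assumed here: drop the zero-frequency term, divide by the magnitude of the first harmonic, and keep magnitudes only. The function name `fourier_descriptors`, the parameter `n_coeffs`, and its default value are hypothetical, and the contour is assumed to be sampled counter-clockwise.

```python
import cmath
import math


def fourier_descriptors(contour, n_coeffs=8):
    """Return normalized Fourier descriptors of a closed contour.

    contour:  list of (x, y) points sampled in counter-clockwise order
              (an assumption of this sketch).
    n_coeffs: number of low-frequency magnitudes to keep (assumed value).
    """
    n = len(contour)
    if n < n_coeffs + 2:
        raise ValueError("contour has too few points")
    # Treat each boundary point as the complex number x + iy.
    z = [complex(x, y) for x, y in contour]
    # Naive discrete Fourier transform, O(n^2); fine for short contours.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n_coeffs + 2)
    ]
    # Skipping coeffs[0] removes the dependence on position.
    # Dividing by |coeffs[1]| removes the dependence on size.
    # Keeping magnitudes only removes rotation and start-point dependence.
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate contour")
    return [abs(c) / scale for c in coeffs[2:]]
```

Under these assumptions, a lip contour and a shifted, resized, or rotated copy of it give the same descriptor vector, which is the property that makes the vector usable as classifier input. The neural network classifier itself is not shown.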
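For orientation, the recognition stage of a traditional discrete HMM, the baseline that item (3) compares against, can be sketched as follows. This is not the improved model (IHMM), whose details the abstract does not give. It is the standard forward algorithm applied to a sequence of VQ codeword indices, with one HMM per candidate word. The names `forward_likelihood`, `classify`, and `models` are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of a codeword sequence under a discrete HMM.

    obs:   non-empty list of VQ codeword indices
    start: start[i]    = P(first state is i)
    trans: trans[i][j] = P(next state is j | current state is i)
    emit:  emit[i][k]  = P(codeword k | state i)
    """
    n_states = len(start)
    # Initialization: start in state i and emit the first codeword.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    # Induction: move to state j, then emit the next codeword.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][symbol]
            for j in range(n_states)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)


def classify(obs, models):
    """Return the label whose HMM gives obs the highest likelihood.

    models: dict mapping a label to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))
```

The sketch multiplies raw probabilities, so long sequences would need scaling or log-probabilities to avoid underflow.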
Keywords/Search Tags: lip-reading, visual feature extraction, deformable template, Fourier descriptor, HMM