
Research On The Technology Of Lipreading Recognition In Video Sequences

Posted on: 2006-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Tao
GTID: 2168360155967246
Subject: Computer application technology
Abstract/Summary:
Computer lipreading has recently attracted growing attention from researchers as an auxiliary method for Automatic Speech Recognition (ASR). This thesis surveys and analyzes existing lipreading techniques and describes current methods of lipreading feature extraction. Based on a systematic analysis of the relevant algorithms, we present novel algorithms for the stages of lipreading in video sequences: lip detection, feature extraction, and lipreading recognition. In addition, a prototype lipreading system is designed and implemented. The main work is as follows:

(1) A new approach based on a double-difference image is presented to locate the lip region in video sequences. First, we preprocess the face images in the video sequence by scale normalization, grayscale equalization, and image rotation, and then track changes in lip shape in real time. Second, we compute the two difference images of three successive frames and combine them with an AND operation to obtain a double-difference image. Third, we project the double-difference image in the vertical and horizontal directions and segment the whole lip from the face according to the break-points of the projections. Experimental results show that the approach detects the lip region efficiently in real time.

(2) A mixed method based on SVD-QR and a deformable template is proposed to extract lipreading features. SVD-QR extracts grayscale features, and the deformable template extracts geometric features. The method is insensitive to factors such as illumination, noise, and the distance between camera and subject, which strongly affect the lipreading feature parameters in other methods. Compared with other available methods, the extracted features have lower dimension, carry more information, and are more applicable under natural conditions.

(3) A BP neural network algorithm for lipreading is proposed. According to their geometric characteristics, we classify lip shapes into two classes and then design networks to recognize the lip shapes within each class. By combining an additional momentum term with an adaptive learning rate, we obtain a fast BP neural network algorithm that is better suited to video conditions. This two-stage classification reduces the search space of the BPNN, which makes recognition fast enough to read lips in real time. The results indicate that the method is robust in both subject-dependent and subject-independent conditions.

(4) A prototype lipreading system based on component-based development is designed and implemented. Following an object-oriented approach, we divide the system into four modules (image acquisition, lip detection, lipreading feature extraction, and lipreading recognition) and develop a corresponding component for each. These components reduce the coupling between objects, which makes the system more reusable and portable.
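As an illustration of the double-difference localization in item (1), the following is a minimal sketch and not the thesis implementation. It assumes grayscale frames given as equally sized lists of pixel rows. The binarization threshold, the function names, and the rule of taking the first and last non-empty bins of each projection as the "break-points" are all assumptions made for this example; the abstract does not specify them.

```python
def abs_diff_mask(frame_a, frame_b, threshold):
    """Binary mask: 1 where the two frames differ by more than threshold."""
    return [
        [1 if abs(pa - pb) > threshold else 0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def double_difference(f0, f1, f2, threshold=15):
    """AND the two difference masks of three successive frames.

    A pixel survives only if it changed both between f0 and f1 and
    between f1 and f2. The threshold value is an assumed parameter.
    """
    d01 = abs_diff_mask(f0, f1, threshold)
    d12 = abs_diff_mask(f1, f2, threshold)
    return [[x & y for x, y in zip(r1, r2)] for r1, r2 in zip(d01, d12)]


def projection_bounds(profile, min_count=1):
    """First and last index whose projected count reaches min_count.

    This stands in for the break-point rule, which is assumed here.
    """
    hits = [i for i, count in enumerate(profile) if count >= min_count]
    return (hits[0], hits[-1]) if hits else None


def locate_moving_region(f0, f1, f2, threshold=15):
    """Return (top, bottom, left, right) of the moving region, or None."""
    dd = double_difference(f0, f1, f2, threshold)
    row_profile = [sum(row) for row in dd]        # projection onto the vertical axis
    col_profile = [sum(col) for col in zip(*dd)]  # projection onto the horizontal axis
    rows = projection_bounds(row_profile)
    cols = projection_bounds(col_profile)
    if rows is None or cols is None:
        return None
    return rows[0], rows[1], cols[0], cols[1]
```

The AND step is what distinguishes this from a single frame difference: a pixel that changes once and then stays constant appears in only one of the two difference masks and is suppressed, so only regions that keep moving across all three frames, such as the lips during speech, remain in the projections.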
Keywords/Search Tags: pattern recognition, image processing, dual-difference image, mixed feature, BP neural network, design pattern