Research On Technology Of Lip Reading Fused Physiological Information

Posted on:2019-10-23

Degree:Master

Type:Thesis

Country:China

Candidate:F Yang

GTID:2428330626452090

Subject:Computer Science and Technology

Abstract/Summary:

As a bridge between people and computers or other devices, human-computer interaction technology has shifted from the mouse and keyboard to non-contact interaction based on multimodal information, driven by advances in intelligent technology and by user demand. As an important non-contact interaction method, lip-reading technology not only overcomes the limitations of particular application scenarios and assists speech recognition in noisy environments, but also has broader development prospects with the emergence of three-dimensional sensors. How completely lip-motion information is extracted and how effectively it is represented directly affect the recognition of semantic content and the judgment of semantic emotion. The common difficulty in lip-motion feature extraction is that no single extraction method serves as a general approach that represents lip-motion information both comprehensively and effectively. This thesis therefore studies multimodal lip reading that integrates facial-muscle physiological information. The research covers Kinect-based multimodal data acquisition, preprocessing, facial muscle model building, muscle model mapping, feature extraction, and DenseNet-based training and recognition. First, multimodal information, including audio, color images, and depth data, was collected with Kinect V2.0 while the speaker's lips were moving. The data were then preprocessed. For the image data, face detection, lip localization, and data augmentation were performed in sequence. For the depth data, the speaker's unconscious head movements during recording, such as turning, looking up, and bowing the head, were corrected. Next, the thesis studied facial-muscle physiological information and established a vector muscle model with a small number of parameters. Using the 1347 acquired facial feature points, the muscle model was mapped onto the three-dimensional facial model. From this muscle model, two types of features were extracted: geometric features and physiological features. The geometric features comprise shape features and angle features; the physiological features comprise muscle length features and muscle displacement features. Finally, DenseNet was used for a lip-reading experiment. The results show that adding depth information improves the recognition rate of the lip-reading system, and that the proposed physiological features strengthen the constraints among the three-dimensional discrete points and characterize the lip movement process more fully. In addition, the thesis studied tones and consonants and found that it is feasible to distinguish them using visual information alone.
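
The two feature families named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function names, the choice of four lip landmarks (left corner, right corner, top, bottom), and the modelling of a muscle as a straight segment from a fixed origin to a moving insertion point. It is not the thesis's implementation.

```python
import math
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


def angle_at(vertex: Point3D, p: Point3D, q: Point3D) -> float:
    """Angle in degrees at `vertex` between the rays vertex->p and vertex->q."""
    v1 = [a - b for a, b in zip(p, vertex)]
    v2 = [a - b for a, b in zip(q, vertex)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def geometric_features(left: Point3D, right: Point3D,
                       top: Point3D, bottom: Point3D) -> Dict[str, float]:
    """Hypothetical shape and angle features from four lip landmarks."""
    return {
        "width": math.dist(left, right),             # shape feature
        "height": math.dist(top, bottom),            # shape feature
        "corner_angle": angle_at(left, top, bottom)  # angle feature
    }


def muscle_features(origin: Point3D, insertion_neutral: Point3D,
                    insertion_now: Point3D) -> Dict[str, float]:
    """Hypothetical physiological features for one vector muscle.

    Assumes the muscle is a straight segment from a fixed origin to an
    insertion point that moves with the skin surface.
    """
    length_neutral = math.dist(origin, insertion_neutral)
    length_now = math.dist(origin, insertion_now)
    return {
        "length": length_now,
        "length_change": length_now - length_neutral,
        "displacement": math.dist(insertion_neutral, insertion_now),
    }


if __name__ == "__main__":
    print(geometric_features((-2, 0, 0), (2, 0, 0), (0, 1, 0), (0, -1, 0)))
    print(muscle_features((0, 4, 0), (3, 0, 0), (0, 0, 0)))
```

In a full pipeline, such per-frame values would be computed for every frame of an utterance and stacked into a sequence before being passed to the classifier; how the thesis arranges them for DenseNet is not stated in the abstract.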

Keywords/Search Tags:

Lip Reading, Facial Muscles, Physiological Feature, Kinect, DenseNet, Feature Extraction

Related items

1	Research On Emotion Recognition Methods Based On Facial Expressions And Physiological Signals
2	Research On Iris Feature Extraction And Recognition Algorithm Based On Improved DenseNet Network
3	Research On 3D Facial Feature Extraction Used In Diagnosis Of Facial Morphology
4	Research On Technology Of Real-time Lip Reading Based On Kinect 3D Camera
5	Research On Computerized Facial Esthetics
6	Study On The Extraction Method Of 3D Facial Feature
7	Research On Related Algorithms Of 3D Reconstruction Based On Kinect
8	Studies On Physiological Perception Feature Extraction Methods In Underwater Target-radiated Noise
9	Research On Facial Feature Extraction
10	Research On Algorithms Of Feature Extraction For 3D Facial Expression Recognition