Efficient Facial Landmarks Tracking In Video

Posted on: 2019-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Gan
Full Text: PDF
GTID: 2518305891473644
Subject: Computer Science and Technology
Abstract/Summary:
Demand for intelligent processing of face data, the most important category of visual data, keeps growing. Face alignment is one of the two fundamental techniques for face extraction and pre-processing. Advances in it promote the analysis and exploitation of face data, which benefits intelligent services, public security, and the entertainment business. However, because face alignment occupies such a crucial position in the processing and analysis of face data, application scenarios demand high accuracy from it while leaving it only limited computational resources.

To meet the demand for high processing speed, we use a single convolutional neural network that directly outputs the coordinates of the facial landmarks. Based on the DenseNet and VGG architectures, we propose two convolutional neural networks, one for servers and the other for mobile and embedded devices. Both networks have a small number of parameters, and their computational complexity is as low as that of the fastest state-of-the-art method based on convolutional neural networks. Specifically, the network for mobile and embedded devices requires only 23% of the computational cost of the state-of-the-art method.

To meet the demand for high accuracy, and to cope with low image quality in videos and a lack of training data, we propose a method that jointly trains the face alignment model on multiple datasets with different facial landmark definitions, which expands the training set. We also apply a teacher-student training technique: because the network for servers is easy to train to high accuracy, it is used to automatically generate training samples for the other network, which further enlarges the training set. Data augmentation is also used to improve data diversity. Experiments show that joint training on multiple datasets and teacher-student training enlarge the training set and make the proposed models considerably more robust, with better generalization and higher accuracy. The proposed models achieve higher accuracy on the benchmark than state-of-the-art methods.

Finally, a post-processing technique based on ridge regression is proposed to improve the continuity of landmark estimates when tracking facial landmarks in videos. In the experiments, the post-processing significantly improves continuity, to a degree that is visible to the naked eye, and thus substantially improves the user experience of facial landmark tracking in videos.
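The abstract does not specify how ridge regression is applied to the landmark tracks, so the following Python snippet is only a minimal sketch of one plausible reading, not the thesis's actual method. It assumes each landmark coordinate is smoothed independently by fitting a straight line over a short sliding window of recent frames, with an L2 penalty on the slope, and then evaluating the fit at the current frame. The names `ridge_line_fit` and `smooth_track` and the parameters `window` and `lam` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position of one landmark in one frame


def ridge_line_fit(ts: Sequence[float], ys: Sequence[float], lam: float) -> Tuple[float, float]:
    """Fit y ~ a + b * t, penalizing only the slope b with weight lam.

    With an unpenalized intercept, ridge regression on one feature has the
    closed form b = Sxy / (Sxx + lam), a = mean(y) - b * mean(t).
    """
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    denom = sxx + lam
    b = sxy / denom if denom > 0 else 0.0  # single sample and lam == 0: flat line
    a = y_mean - b * t_mean
    return a, b


def smooth_track(track: Sequence[Point], window: int = 5, lam: float = 1.0) -> List[Point]:
    """Smooth one landmark's per-frame positions (assumed causal sliding window)."""
    smoothed: List[Point] = []
    for i in range(len(track)):
        start = max(0, i - window + 1)  # use only the current and past frames
        ts = [float(k) for k in range(start, i + 1)]
        xs = [p[0] for p in track[start:i + 1]]
        ys = [p[1] for p in track[start:i + 1]]
        ax, bx = ridge_line_fit(ts, xs, lam)
        ay, by = ridge_line_fit(ts, ys, lam)
        # Evaluate both fitted lines at the current frame index.
        smoothed.append((ax + bx * i, ay + by * i))
    return smoothed


if __name__ == "__main__":
    # A landmark that jitters by one pixel left and right around x = 100.
    noisy = [(100.0 + (-1) ** i, 50.0) for i in range(10)]
    for raw, out in zip(noisy, smooth_track(noisy)):
        print(raw, "->", out)
```

In this sketch a larger `lam` shrinks the fitted slope toward zero, which suppresses frame-to-frame jitter more strongly but makes the smoothed landmark lag behind fast real motion; a longer `window` has a similar trade-off.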
Keywords/Search Tags: face alignment, facial landmark tracking, convolutional neural network, DenseNet, ridge regression