Research And Implementation Of 3D Gaze Estimation Method For Resource Constrained Scenarios

Posted on: 2022-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wu
GTID: 2518306569494574
Subject: Computer Science and Technology
Abstract/Summary:
Line of sight usually indicates the focus of a person's attention, and line-of-sight information can help machines understand a person's behavior and intentions. Gaze estimation currently covers three directions: gaze target estimation, gaze point estimation, and 3D gaze estimation. It has broad applications in human-computer interaction, attention analysis, and video surveillance. Gaze estimation is mainly deployed on devices such as cell phones and cameras, which have very limited memory, computing power, and energy budgets. It is therefore necessary to study how to implement gaze estimation efficiently and inexpensively in resource-constrained scenarios. This thesis investigates the efficient implementation of 3D gaze estimation in such scenarios.

First, this thesis presents an end-to-end 3D gaze estimation algorithm based on multi-task learning that outputs the corresponding gaze while performing face detection. It addresses the problem that traditional 3D gaze estimation methods must be preceded by a separate detection algorithm, which makes the pipeline long and slow. The method simplifies the whole process and speeds up gaze estimation, which helps achieve efficient gaze estimation in resource-constrained scenarios. Multi-task learning is used to learn from both the detection data domain and the gaze data domain, so that the model's performance on both face detection and gaze estimation is similar to that of single-task learning.

Second, 3D gaze estimation data is labeled differently from data for traditional visual tasks: it is harder to label, and the labeling criteria are not uniform. Annotated 3D gaze data is therefore difficult and expensive to obtain, and self-supervised learning is needed to reduce the demand for labeled data. This thesis incorporates self-supervised learning into the end-to-end gaze estimation algorithm, using large amounts of unlabeled data to let the model learn a gaze representation, while optical flow field regularization lets the model learn more robust features. The resulting model can be trained with only a small amount of labeled data and performs nearly as well as one trained with the full data set.

Third, in resource-constrained scenarios the memory, computing power, and energy consumption of the device are limited, whereas deep learning models are very large in size and computation. This thesis uses lightweight network design and model quantization to reduce both the size and the computation of the model, which is then deployed on edge devices for real-time gaze estimation without significant degradation in accuracy.
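
To make the model quantization step concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization. The abstract does not say which quantization scheme the thesis uses, so the scheme, the function names `quantize_int8` and `dequantize`, and the sample weights are all assumptions made for illustration.

```python
# Sketch of symmetric per-tensor int8 quantization (an assumed scheme;
# the abstract does not state which quantization method the thesis uses).

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


weights = [0.50, -1.27, 0.031, 0.0]  # hypothetical layer weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Round-to-nearest keeps the per-weight error within half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte instead of the four bytes of a float32 weight, which is where the reduction in model size comes from. How much accuracy is lost depends on the model and data, and production toolchains typically add calibration and per-channel scales that this sketch omits.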
Keywords/Search Tags: gaze estimation, face detection, self-supervised representation learning, model compression