Autism spectrum disorder is one of the most common neurodevelopmental disorders of childhood, and its incidence is increasing year by year. At present, screening for autism relies mainly on doctors or related professionals, who make a medical diagnosis through behavioral observation and design intervention programs according to individual differences. This approach demands a great deal of time from experienced physicians. However, the number of autistic patients is very large and the number of experienced therapists is seriously insufficient, so new technological methods are urgently needed. According to the American Psychiatric Association, clinicians diagnose autism mainly on the basis of deficits in social interaction, language and communication, and joint attention, together with abnormal behavior. These deficits can be observed through cues such as gaze, facial expressions, and movements. Research based on computer vision therefore offers a new solution for the early screening of autism. The main research contents of this study are as follows:

1. This paper first studies facial expression recognition based on static images and on dynamic videos. For static images, it combines the attention mechanism with the high-resolution network (HRNet) and designs the SE-HRNet model. The model attends to the importance of each channel while maintaining high-resolution information, and uses depthwise separable convolution to reduce the number of network parameters. In addition, Filter Response Normalization (FRN) is used to optimize the normalization method, and the model finally reaches an accuracy of 97.12% on the CK+ dataset. For dynamic video, it combines a convolutional neural network with a long short-term memory network. This design integrates temporal and spatial information and uses dilated convolution to enlarge the receptive field of the network, finally reaching an accuracy of 98.4% on the CK+ dataset.

2. This paper studies the gaze estimation problem. First, a head pose estimation model based on multi-view information fusion is designed; by fusing information from four views, it reaches an accuracy of 92.18% on the DPOSE dataset. Next, a facial keypoint detection model based on heatmap regression is designed, with Adaptive Wing Loss used as the loss function, and it performs well on the 300W dataset. Finally, a gaze estimation network based on eye localization information is designed. In the later stage of the model, head pose information and pupil center data are combined with the gaze features to improve network performance, and the network finally surpasses GazeNet and other algorithms on the MPIIGaze dataset.

3. Based on the needs of early screening for autism, an early auxiliary screening system for autism is designed, and the algorithms described above are applied in it. The system offers the subject three interactive modes: pictures, videos, and games. During the interaction, it performs real-time facial expression recognition and gaze estimation on the subject.
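The Adaptive Wing Loss used for heatmap regression in item 2 can be illustrated with a minimal sketch. It follows the commonly published form of the loss, with default hyperparameters (alpha = 2.1, omega = 14, epsilon = 1, theta = 0.5) taken from the original Adaptive Wing Loss paper, not from this study. The function names `adaptive_wing_loss` and `heatmap_loss` are hypothetical, and the pure-Python per-pixel loop is an assumption made for readability, not the implementation used here.

```python
import math


def adaptive_wing_loss(pred, target, alpha=2.1, omega=14.0, epsilon=1.0, theta=0.5):
    """Adaptive Wing Loss for a single heatmap pixel.

    pred and target are heatmap values, with target expected in [0, 1].
    Hyperparameter defaults are assumptions, not values from this study.
    """
    diff = abs(target - pred)
    power = alpha - target  # exponent adapts to the ground-truth pixel value
    if diff < theta:
        # Non-linear branch: amplifies small errors, most strongly on foreground pixels.
        return omega * math.log(1.0 + (diff / epsilon) ** power)
    # Linear branch: a and c make the loss continuous and smooth at diff == theta.
    ratio = theta / epsilon
    a = omega * (1.0 / (1.0 + ratio ** power)) * power * ratio ** (power - 1.0) / epsilon
    c = theta * a - omega * math.log(1.0 + ratio ** power)
    return a * diff - c


def heatmap_loss(pred_map, target_map, **kwargs):
    """Mean Adaptive Wing Loss over two equally sized 2D heatmaps (lists of rows)."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred_map, target_map):
        for p, t in zip(pred_row, target_row):
            total += adaptive_wing_loss(p, t, **kwargs)
            count += 1
    return total / count
```

Because the exponent depends on the ground-truth value, an error of the same size costs more on a foreground pixel (target near 1) than on a background pixel (target near 0), which is the property that makes the loss suitable for sparse keypoint heatmaps.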