
Research On Human Scanpath Based On Convolutional Neural Networks

Posted on: 2020-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: W T Bao
Full Text: PDF
GTID: 2428330599952064
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
A human scanpath is the sequence of eye fixations an observer makes while free-viewing a natural scene, and it reveals the dynamic process of eye movement. Compared with visual saliency, a scanpath better mimics human visual search behavior and so deepens researchers' understanding of dynamic visual attention. In-depth investigation of human scanpath prediction is therefore scientifically valuable for advancing research on both visual behavior and intelligent robotic perception.

Among existing work on scanpath prediction, some methods are built directly on the physiological principles of the human visual system. These methods handle the temporal dependence between the fixations of a scanpath well and are therefore more interpretable, but they achieve inferior performance because their visual features are handcrafted. Other methods use deep learning models to learn patterns directly from images and eye-tracking data. These methods predict scanpaths well, but few of them explicitly exploit physiological principles, which leaves room for improvement.

This thesis proposes a novel computational model that predicts human scanpaths under free-viewing conditions by integrating physiological principles into a deep learning framework. Specifically, a foveal saliency model is introduced to simultaneously mimic selective attention and the foveal visual memory of the local image region. In addition, a fixation duration prediction model is proposed to simulate the relationship between temporal behavior and the region of interest. Finally, the foveal visual memory and the fixation duration are integrated into the inhibition of return mechanism, which is physiologically grounded. The modules are closely integrated to achieve end-to-end scanpath prediction with only a single image as input. The proposed method handles the challenges of temporal dependency and spatial association with image content, both of which are critical for scanpath prediction. The proposed algorithm has been evaluated on a public eye-tracking dataset and achieves state-of-the-art performance on multiple evaluation metrics. An ablation study further demonstrates the effectiveness of several critical components of the method. Finally, possible further improvements are discussed.
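
To make the inhibition of return idea concrete, the following is a minimal illustrative sketch and not the model proposed in the thesis. It assumes a precomputed 2D saliency map, picks fixations greedily by maximum saliency, and suppresses a Gaussian neighborhood around each fixation so that later fixations move elsewhere. The function name `predict_scanpath`, the parameters `sigma` and `durations`, and the rule that a longer fixation causes stronger suppression are all hypothetical choices made for illustration; the thesis instead learns foveal saliency and fixation duration with convolutional networks.

```python
import math


def predict_scanpath(saliency, n_fixations=3, sigma=1.0, durations=None):
    """Greedy scanpath over a saliency map with Gaussian inhibition of return.

    saliency  : list of lists of non-negative floats (rows x columns).
    durations : optional per-step weights in [0, 1]; a hypothetical stand-in
                for predicted fixation durations (1.0 = full suppression).
    """
    # Work on a copy so the caller's map is left unchanged.
    current = [row[:] for row in saliency]
    height, width = len(current), len(current[0])
    scanpath = []
    for step in range(n_fixations):
        # Winner-take-all: fixate the most salient remaining location.
        r0, c0 = max(
            ((r, c) for r in range(height) for c in range(width)),
            key=lambda rc: current[rc[0]][rc[1]],
        )
        scanpath.append((r0, c0))
        # Assumption: longer fixations inhibit the attended region more strongly.
        strength = 1.0 if durations is None else durations[step]
        # Inhibition of return: damp saliency near the fixated location.
        for r in range(height):
            for c in range(width):
                dist2 = (r - r0) ** 2 + (c - c0) ** 2
                inhibition = strength * math.exp(-dist2 / (2.0 * sigma ** 2))
                current[r][c] *= 1.0 - inhibition
    return scanpath
```

Calling `predict_scanpath(saliency_map, n_fixations=2)` on a small map returns a list of `(row, column)` fixations in viewing order. The greedy selection and fixed Gaussian kernel keep the sketch short; a learned model would replace both.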
Keywords/Search Tags: Scanpath prediction, Deep learning, Visual attention, Inhibition of return, Foveal vision