
Research On Saliency Prediction On Omni-directional Images Based On Deep Learning

Posted on: 2021-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: H S Zhu
Full Text: PDF
GTID: 2428330611965325
Subject: Electronic and communication engineering
Abstract/Summary:
The visual attention mechanism enables humans to quickly allocate perceptual resources to the most critical areas, helping them analyze complex scenes rapidly. As a computational simulation of this mechanism, saliency prediction is of great significance for building human-computer interaction and assistive systems. With the rise of virtual reality technology in particular, saliency prediction on omni-directional images has received increasing attention, because it reduces the complexity of high-level visual tasks and assists the study of human visual mechanisms. Addressing the shortage of omni-directional saliency datasets and the poor prediction quality caused by the severe distortion of omni-directional images, the main contributions of this thesis are as follows:

(1) Given the lack of large-scale omni-directional saliency datasets, a saliency prediction model must first be designed and trained on conventional 2D saliency datasets before being applied to omni-directional images. The multi-layer features of a deep model capture information at multiple levels of the image, and integrating these levels can effectively improve prediction accuracy. However, there are large semantic gaps between features at different levels, and the receptive fields of high-level features are usually insufficient. To address these problems, this thesis proposes a new model with attentive receptive fields and contextual awareness. The model first uses a deformable attention module to focus its limited receptive field on key areas; it then applies a context-aware feature pyramid module to reduce the semantic gap between levels and inject contextual information into the multi-level features; finally, the multi-level features are integrated to produce the saliency prediction (a sketch of this pipeline follows the abstract). In experiments on multiple benchmark datasets, the proposed model outperforms other mainstream models on several metrics.

(2) Compared with a traditional 2D image, an omni-directional image differs in several ways: severe distortion in the polar regions, a wider field of view, and content that is continuous across the left and right borders. These differences prevent saliency models trained on traditional datasets from performing well on omni-directional images. To solve this problem, this thesis proposes a new framework for saliency prediction on omni-directional images, built on multiple sphere rotation and reversed multiple sphere rotation. These projections allow severely distorted regions to be relocated near the equator, where distortion is small, so they can be predicted more accurately, and they also resolve the discontinuity between the left and right borders (a sketch of the rotation step follows the pipeline sketch below). Combined with the model proposed above, the method is shown to effectively improve saliency prediction on omni-directional images and to outperform state-of-the-art models under different evaluation metrics on public saliency benchmarks.
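The following is a minimal PyTorch sketch of the pipeline described in (1): attention on the deepest backbone features, a context-aware top-down pyramid to narrow the semantic gap, and fusion of the levels into one saliency map. All module names, channel widths, and the plain spatial attention standing in for the thesis' deformable attention module are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch (assumed names and sizes; SpatialAttention is a simplified
# stand-in for the deformable attention module described in the thesis).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Re-weights the deepest features so the limited receptive field
    concentrates on key areas (placeholder for deformable attention)."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        return x * torch.sigmoid(self.score(x))


class ContextAwarePyramid(nn.Module):
    """Top-down pathway: each level is projected to a common width and
    enriched with upsampled context from the level above it."""
    def __init__(self, in_channels, width=64):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, width, 1) for c in in_channels])

    def forward(self, feats):                        # feats ordered low -> high
        outs = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(outs) - 2, -1, -1):       # propagate context downward
            outs[i] = outs[i] + F.interpolate(
                outs[i + 1], size=outs[i].shape[-2:],
                mode="bilinear", align_corners=False)
        return outs


class SaliencyModel(nn.Module):
    """Attention on the deepest level, context-aware pyramid, then fusion."""
    def __init__(self, in_channels=(256, 512, 1024), width=64):
        super().__init__()
        self.attn = SpatialAttention(in_channels[-1])
        self.pyramid = ContextAwarePyramid(in_channels, width)
        self.fuse = nn.Conv2d(width * len(in_channels), 1, 3, padding=1)

    def forward(self, feats):                        # backbone features, low -> high
        feats = list(feats)
        feats[-1] = self.attn(feats[-1])             # focus the receptive field
        levels = self.pyramid(feats)                 # reduce the semantic gap
        target = levels[0].shape[-2:]
        ups = [F.interpolate(l, size=target, mode="bilinear", align_corners=False)
               for l in levels]
        return torch.sigmoid(self.fuse(torch.cat(ups, dim=1)))  # fused saliency map
```

Here `feats` would come from any multi-level backbone (e.g. three ResNet stages); only the fusion logic of the abstract is illustrated.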
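And this is a minimal NumPy sketch of the rotation idea in (2): resample the equirectangular image under a rotation of the viewing sphere so that polar content lands near the equator, run an ordinary 2D saliency model, then resample the prediction under the reversed rotation and average over rotations. The helper names, the nearest-neighbour resampling, the choice of pitch angles, and the generic `predict_2d` callback are illustrative assumptions, not the thesis code.

```python
# Sketch under assumptions noted above; predict_2d is any 2D saliency model.
import numpy as np


def rotate_equirect(img, R):
    """Resample an equirectangular image (H x W [x C]) under rotation R:
    convert each output pixel to a unit vector on the sphere, rotate it,
    and look up the corresponding source pixel (nearest neighbour)."""
    h, w = img.shape[:2]
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi           # [-pi, pi)
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi           # +pi/2 .. -pi/2
    lon, lat = np.meshgrid(lon, lat)
    xyz = np.stack([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat)], axis=-1)
    src = xyz @ R                                    # R.T applied to each direction
    src_lat = np.arcsin(np.clip(src[..., 2], -1.0, 1.0))
    src_lon = np.arctan2(src[..., 1], src[..., 0])
    rows = np.clip(((np.pi / 2 - src_lat) / np.pi * h).astype(int), 0, h - 1)
    cols = ((src_lon + np.pi) / (2 * np.pi) * w).astype(int) % w  # wraps the borders
    return img[rows, cols]


def pitch_rotation(angle):
    """Rotation about the y-axis; pi/2 brings the polar content to the equator."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def omni_saliency(img, predict_2d, angles=(0.0, np.pi / 2)):
    """Multiple sphere rotation + reversed rotation, averaged over rotations."""
    maps = []
    for a in angles:
        R = pitch_rotation(a)
        rotated = rotate_equirect(img, R)            # distorted regions near equator
        pred = predict_2d(rotated)                   # ordinary 2D saliency model
        maps.append(rotate_equirect(pred, R.T))      # reversed rotation back
    return np.mean(maps, axis=0)
```

Because longitude indices wrap modulo the image width, the same remapping also handles the continuity of the left and right borders mentioned in (2).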
Keywords/Search Tags: Vision, Saliency Prediction, Deep Model, Omni-directional Image