
Study Of Visual Fixation Detection Algorithm Based On Image Emotion

Posted on: 2020-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Ma
GTID: 2428330596482937
Subject: Electronic and communication engineering
Abstract/Summary:
Humans are remarkably good at selectively attending to parts of a scene, guided by low-level attributes (such as intensity and color) and by semantic-level information. Computational models that incorporate such attributes predict visual saliency well and have been used in applications such as automatic image annotation and video surveillance. Saliency prediction algorithms traditionally fall into two groups: some identify the gaze points that human observers fixate on at first glance, while others find the most prominent and important targets in the image. This thesis studies the first type of saliency model, which attempts to predict the probability distribution of visual fixations over an image. To advance the visual fixation saliency detection task, it proposes two visual fixation detection models based on the emotional semantic information of images.

The first deep learning model for visual fixation detection is RIL-DCN. It feeds the last two levels of high-level features extracted by the VGG16 feature extraction network into a novel RIL (Residual Inception-Like) sub-module proposed in this thesis, which lets the network extract image features with different receptive fields in the same layer while also optimizing network training. Because it uses dilated convolution, the sub-module can capture more global emotional semantic information, so the whole network understands the semantics of the image better and detects its salient regions more accurately. However, the model performs no selection among the extracted features, which means that all features are treated as equally important.

To address this problem, the thesis proposes a second deep learning model for visual fixation saliency detection, ME-CASA. Its main contribution is a novel CASA (Channel and Spatial Attention) sub-network that effectively encodes the emotional semantic information of the image. By assigning different importance weights to different features at the channel level and the spatial level, the network can accurately rank the salient regions of an image by priority and find the most salient target area. The feature extraction part of the network also changes: instead of fusing features from different levels at the same resolution, it fuses high-level semantic features at different resolutions extracted by a dual-stream VGG19 network. This overall design helps extract the emotional semantic information of the image and locate the most prominent target area, yielding a computational simulation of visual attention that is closer to that of the human eye.

Both proposed models are tested on two publicly available saliency datasets with emotional content and compared with eight other strong models on seven saliency metrics. The experimental results show that, compared with other advanced algorithms, the first model achieves better performance and produces predictions closer to the ground truth. The second model not only outperforms the other models on the performance metrics but also uses the emotional semantic information of the image to accurately locate the priority of the salient target areas, which is mainly due to the proposed CASA sub-network.
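The abstract does not describe the internals of the CASA sub-network, so the following is only a minimal NumPy sketch of the general idea of weighting features at the channel level and then at the spatial level. It assumes a squeeze-and-excitation style channel gate and a simple pooled spatial gate; the function names, the reduction ratio `r`, the random weights, and the use of a plain sum in place of a learned convolution are all illustrative assumptions, not the thesis's design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w1, w2):
    """Per-channel weights in (0, 1) for a (C, H, W) feature map.

    Assumed design: global average pooling followed by a two-layer
    bottleneck (w1: (C // r, C), w2: (C, C // r)) and a sigmoid gate.
    """
    squeezed = features.mean(axis=(1, 2))       # (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)     # ReLU, (C // r,)
    return sigmoid(w2 @ hidden)                 # (C,)

def spatial_attention(features):
    """Per-location weights in (0, 1), shape (H, W).

    Assumed design: average- and max-pool across channels, then gate.
    A learned convolution would normally combine the two maps; a plain
    sum stands in for it here.
    """
    avg_map = features.mean(axis=0)             # (H, W)
    max_map = features.max(axis=0)              # (H, W)
    return sigmoid(avg_map + max_map)

def apply_attention(features, w1, w2):
    """Reweight channels first, then spatial locations."""
    c = channel_attention(features, w1, w2)
    reweighted = features * c[:, None, None]    # broadcast over H, W
    s = spatial_attention(reweighted)
    return reweighted * s[None, :, :]           # broadcast over C

# Hypothetical toy input: 8 channels on a 6x6 grid, reduction ratio 2.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = apply_attention(features, w1, w2)
```

Because both gates lie strictly between 0 and 1, every output value is a scaled-down copy of the corresponding input value. This is how such a module can express that some channels and locations matter more than others, in contrast to treating all extracted features as equally important.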
Keywords/Search Tags: Visual fixation, Saliency, Semantic information, Emotional context, Feature selection