Salient Object Detection With Visual Attention Based Convolutional Neural Networks In Dynamic Scene

Posted on: 2019-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 2428330566977969
Subject: Control Science and Engineering
Abstract/Summary:
Salient object detection in dynamic scenes is a promising research direction in computer vision. It aims to mimic the human visual attention mechanism, which quickly screens out the most interesting information from massive scene data. Dynamic saliency detection faces three major challenges. The first is the identification and extraction of salient features: traditional methods, represented by the Itti-Koch algorithm, depend overwhelmingly on manually designed features, and their computing framework is rather complicated and inefficient. The rise of Convolutional Neural Network (CNN) based algorithms has shed light on feature extraction and representation, since supervised learning and optimization yield more expressive and abstract features. The second challenge is the running speed of the model in dynamic scenes: to obtain a variety of salient features, traditional methods serially execute pixel-level computation many times, which makes it difficult to balance speed and accuracy. The third challenge is the lack of supervision from top-down attention: traditional methods mainly use bottom-up, low-level features such as color, intensity and orientation to detect salient objects, so they can hardly approach human ability in a task-driven situation.

This thesis addresses these challenges as follows. We first introduce U-Net, a CNN model that has proved successful in image semantic segmentation. By improving U-Net's architecture and training methods, we design a lightweight end-to-end model for our saliency detection task. We also adopt a modified fully connected conditional random field (DenseCRF) algorithm to refine the saliency map produced by U-Net. We then introduce a saliency detection method based on top-down visual attention. Using a CNN model trained for image classification, we obtain a class activation map of the input image for a given object class at a given layer. The class activation map is then fused with a feature visualization map of the input image to form an attention map. Interpolating the attention map to the input size and linearly fusing it with the prior saliency map gives the focus map, which indicates the degree of saliency of each pixel under class attention supervision. When calculating the attention map, we use contrast inhibition to make detection of objects of the given class more robust.

We evaluate our salient object detection model, based on the improved U-Net and DenseCRF, on the SED2, Judd, ECSSD and PASCAL-S benchmarks. The results show that our model surpasses traditional methods by a large margin in all aspects, and is competitive with the newest CNN-based models, achieving high accuracy at much lower complexity. Experiments on several scenes from the DAVIS dynamic dataset and on real indoor and outdoor scenes also show that our model strikes a good balance between accuracy and speed. In addition, we verify that adding top-down attention supervision to the salient object detection task improves detection accuracy under class-supervised circumstances.
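The focus-map step described in the abstract (interpolate the attention map to the input size, then linearly fuse it with the prior saliency map) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `resize_bilinear` and `focus_map`, the choice of bilinear interpolation, and the fusion weight `alpha` are all assumptions made for illustration, since the abstract does not specify them.

```python
def resize_bilinear(grid, out_h, out_w):
    """Resize a 2-D list of floats to (out_h, out_w) by bilinear interpolation.

    Assumption: corner-aligned sampling; the thesis does not state
    which interpolation scheme it uses.
    """
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Map the output row index to a fractional source row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column index to a fractional source column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def focus_map(attention, prior_saliency, alpha=0.5):
    """Linearly fuse an upsampled attention map with a prior saliency map.

    `alpha` is a hypothetical fusion weight in [0, 1]; both maps are
    assumed to hold values already normalized to [0, 1].
    """
    h, w = len(prior_saliency), len(prior_saliency[0])
    upsampled = resize_bilinear(attention, h, w)
    return [
        [alpha * upsampled[i][j] + (1 - alpha) * prior_saliency[i][j]
         for j in range(w)]
        for i in range(h)
    ]


# Usage: a coarse 2x2 attention map fused with a 3x3 prior saliency map.
attention = [[0.0, 1.0], [0.0, 1.0]]
prior = [[1.0, 1.0, 1.0] for _ in range(3)]
fused = focus_map(attention, prior, alpha=0.5)
```

In this sketch a larger `alpha` gives the class-driven attention map more influence over the final per-pixel score, while a smaller value keeps the result closer to the bottom-up prior.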
Keywords/Search Tags:dynamic saliency detection, CNN, U-Net, top-down, visual attention