
Visual Saliency Detection Research And Implementations

Posted on: 2021-01-17    Degree: Master    Type: Thesis
Country: China    Candidate: Z Y Chen    Full Text: PDF
GTID: 2428330647951585    Subject: Computer application technology
Abstract/Summary:
Visual saliency detection aims to detect the regions of a given image that most attract human attention; it is widely used in downstream applications such as image understanding and image retrieval. With the development of deep learning, designing an effective model and a powerful loss function has become the critical problem in saliency detection. In this thesis, we focus on RGB saliency detection and RGB-D saliency detection, and propose two novel models. Quantitative and qualitative experiments conducted on several datasets demonstrate the effectiveness of both models.

To address existing problems in RGB saliency detection, such as overly simple feature integration strategies, we propose a Global Context-aware Progressive Aggregation Network (GCPANet). First, considering the characteristics of features at different levels, we design a simple yet effective method, Feature Interweave Aggregation, to integrate low-level features, high-level features, and global context information. In the decoder stage, we introduce the global context information in a parallel way to capture the relationships among different salient regions and to alleviate feature dilution. Experiments conducted on 6 datasets show that the proposed network outperforms 12 state-of-the-art methods. Moreover, we observe that saliency detection relying on single-modal data is affected by similar-appearance disturbances; hence, introducing depth information as supplementary information becomes the next research point of this thesis.

For RGB-D saliency detection, we propose a Depth Potentiality-Aware Gated Attention Network (DPANet) to handle two main problems: how to prevent contamination from unreliable depth maps, and how to effectively aggregate RGB information and depth information. For the first time, we explicitly model depth potentiality as a saliency-oriented task to evaluate the reliability of the depth map and thereby weaken the contamination. To efficiently integrate RGB and depth information, we design a Gated Multi-modality Attention (GMA) module, which exploits a gate unit and an attention mechanism to capture long-range dependencies from a cross-modal perspective. The GMA module strengthens the response of salient regions and adaptively controls the fusion of cross-modal information. Finally, we exploit multi-scale feature aggregation and multi-modality feature aggregation to generate discriminative features and the predicted saliency map. Without any preprocessing technique (e.g., HHA) or post-processing technique (e.g., CRF), the proposed network outperforms 15 state-of-the-art methods on 8 RGB-D datasets.
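The gated cross-modal fusion idea can be illustrated with a minimal NumPy sketch. This is not the thesis's actual GMA module: the function name, feature shapes, scaled dot-product attention form, and the scalar reliability gate are all illustrative assumptions made here to show how a gate unit can down-weight an unreliable depth modality while attention captures cross-modal long-range dependencies.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cross_modal_attention(rgb_feat, depth_feat, gate):
    """Illustrative gated cross-modal attention (assumed form, not DPANet's exact GMA).

    rgb_feat, depth_feat: (N, C) arrays of N spatial positions with C channels.
    gate: scalar in [0, 1] estimating the depth map's reliability (potentiality);
          an unreliable depth map then contributes less to the fused features.
    """
    n, c = rgb_feat.shape
    # Cross-modal attention: each RGB position attends over all depth positions,
    # capturing long-range dependencies across the two modalities.
    scores = rgb_feat @ depth_feat.T / np.sqrt(c)   # (N, N) affinity matrix
    attn = softmax(scores, axis=-1)                 # rows sum to 1
    depth_context = attn @ depth_feat               # (N, C) aggregated depth cues

    # Gate unit: scale the depth contribution by its estimated potentiality
    # before fusing it with the RGB stream via a residual connection.
    return rgb_feat + gate * depth_context
```

With `gate = 0` the fusion degenerates to the pure RGB features, which is the desired behavior when the depth map is judged unreliable; with `gate` near 1 the depth context is fused at full strength.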
Keywords/Search Tags: saliency detection, RGB information, depth information, feature integration, attention mechanism