Research On RGB-D Image Saliency Object Detection Algorithm Based On Deep Learning

Posted on: 2024-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: F Wu
GTID: 2568307151459624
Subject: Control Science and Engineering
Abstract/Summary:
Salient object detection simulates the human visual attention mechanism: a vision algorithm separates the foreground from the background and extracts the objects in an image that are most salient to the human eye. In recent years, salient object detection has played an important role in many areas of machine vision; for example, it serves as a pre-processing component for tasks such as target tracking, image description, and image segmentation. Recent studies have shown deep learning methods to be effective for image saliency detection. Researchers have also found that adding depth maps gives networks more spatial information and leads to better results, an advantage that has made the RGB-D dual-input-stream approach the dominant research method. Although the RGB-D structure has brought great progress, many problems remain in RGB-D saliency detection. A low-quality depth map tends to mislead the feature fitting of the model, so the edge consistency and image consistency of the resulting saliency maps still need to be improved. This paper studies how to improve the interaction between modalities and how to address problems observed in previous studies, such as blurred edges and internal holes. The contributions of this paper are as follows.

(1) To solve the above problems, this paper proposes an RGB-D salient object detection network based on mutual-attention awareness and edge reinforcement (MAAERNet). MAAERNet consists of three subnetworks: an RGB channel subnetwork, a depth channel subnetwork, and a shared subnetwork. The RGB and depth channel subnetworks encode and decode RGB images and depth maps respectively, each generating a modality-specific saliency map. At the front of the decoder of the RGB and depth channel subnetworks, a self-mutual attention module computes the degree of mutual attention between the RGB and depth feature maps and aggregates the features of the two modalities, allowing long-range global information to propagate widely. A Bicon module is added at the end of the decoder of the shared subnetwork to refine the output of the shared decoder and generate a shared saliency map with clear edges and high spatial consistency. The method is validated on four datasets widely used for salient object detection, and its predictions are compared with those of 30 previous salient object detection methods under four evaluation metrics. Its predictions improve to varying degrees across the metrics on NJU2K, STERE, and other datasets.

(2) This paper also proposes an RGB-D salient object detection network based on robust modal interaction and information refinement. Inspired by hierarchical progression and attention mechanisms, it proposes a layer-progressive attention module that reduces the impact of low-quality depth maps by making the network more robust. A refinement middleware structure is added between the encoder and the decoder to improve interaction between modalities and to avoid redundant information. In this structure, the features of the RGB, depth, and RGB-D encoders are further refined by successively applying self-modal attention refinement units and cross-modal weighted refinement units. The paper also proposes a convergent aggregation unit, through which the decoder filters its feature vectors to further refine hard-to-distinguish features. Finally, the network produces a saliency map on each of its three branches, and each map is supervised separately by the ground truth. Extensive comparison experiments against 11 recent models on six popular RGB-D SOD benchmark datasets show that the proposed network outperforms the others both qualitatively and quantitatively.
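
The abstract does not give the equations of the self-mutual attention module, so the following is only a rough sketch of the general idea of cross-modal mutual attention, not the thesis's implementation. It assumes each modality's feature map has been flattened to N positions with C channels, that attention is a scaled dot-product affinity normalised with softmax, and that each modality's features are propagated using the other modality's affinity and then added back as a residual. The function names (`affinity`, `mutual_attention`) and the absence of learned projection weights are illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def affinity(feats):
    """N x N attention map: how strongly each position attends to every other.

    Assumption: scaled dot-product similarity, softmax-normalised per row.
    """
    scale = math.sqrt(len(feats[0]))
    scores = matmul(feats, transpose(feats))
    return [softmax([s / scale for s in row]) for row in scores]


def mutual_attention(rgb, depth):
    """Propagate each modality's features with the OTHER modality's affinity.

    rgb, depth: N x C feature matrices (flattened spatial positions).
    Returns the refined RGB and depth features plus both affinity maps.
    """
    a_rgb = affinity(rgb)
    a_depth = affinity(depth)
    rgb_ctx = matmul(a_depth, rgb)    # depth decides where RGB gathers context
    depth_ctx = matmul(a_rgb, depth)  # RGB decides where depth gathers context
    # Residual connection keeps the original features alongside the context.
    rgb_out = [[x + c for x, c in zip(r, rc)] for r, rc in zip(rgb, rgb_ctx)]
    depth_out = [[x + c for x, c in zip(d, dc)] for d, dc in zip(depth, depth_ctx)]
    return rgb_out, depth_out, a_rgb, a_depth


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    depth = [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
    rgb_out, depth_out, a_rgb, a_depth = mutual_attention(rgb, depth)
    print(rgb_out)
    print(depth_out)
```

Because every position attends to every other position, a single step lets information travel across the whole feature map, which is the long-range propagation the abstract refers to. A real module would typically add learned projections and operate on batched tensors in a deep learning framework.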
Keywords/Search Tags: Saliency object detection, Deep learning, Multi-scale fusion, Attention mechanism