Salient object detection (SOD) aims to simulate the human visual attention mechanism in order to detect and accurately segment the most attractive object regions in an image. As an image preprocessing step, SOD is widely used across computer vision, for example in semantic segmentation, image retrieval, object tracking, and person re-identification. In recent years it has attracted extensive attention, and many SOD algorithms have been proposed. On the one hand, the powerful feature extraction capability of deep learning has greatly improved saliency detection in terms of both accurate localization and accurate edge segmentation. On the other hand, depth images, which carry rich spatial information, are used to complement RGB images and further improve detection performance. In view of this, this paper studies salient object detection on both RGB images and RGB-D images.

(1) For RGB images, we propose a global feature guided residual attention network (GCRANet) for RGB salient object detection. First, because the low-level encoder features are rich in detail while the high-level features are rich in semantics, we design a global feature information complementary module that lets the high-level and low-level features complement each other. Second, at the output of each encoder layer, a multi-scale parallel convolution module captures the multi-scale information of the feature map. Then, in the decoder, feature cascade fusion modules fuse the output features of the different modules, and residual spatial attention and residual channel attention modules are used, respectively, to capture important foreground object information at the outputs of different decoder levels. Finally, we adopt a new multi-level loss function to optimize the training of the model. The proposed GCRANet is
compared with 15 advanced methods on 6 public RGB datasets. The qualitative results show that the proposed method segments salient objects more clearly. The quantitative results show that it achieves significant improvements on every evaluation metric. For example, on the ECSSD dataset, GCRANet reaches an F-measure of 0.947, close to the ideal value of 1.

(2) Depth images provide additional depth cues for RGB images, which offers a reasonable way to improve salient object detection performance. To this end, we propose a multi-feature cascaded fusion network (MCFNet) for RGB-D salient object detection. MCFNet mainly consists of a depth cascaded branch, an RGB cascaded branch, a cross-modal fusion mechanism, and a multi-stage loss function. First, in the depth cascaded branch we design a depth preprocessing algorithm to improve the quality of the depth images, together with a cascaded cross-modal guide module that guides the feature extraction process of the RGB images. Second, the RGB cascaded branch contains five residual adaptive selection modules, which capture multi-scale features during RGB feature extraction. The cross-modal fusion mechanism then fuses the top-level features of the RGB cascaded branch and the depth cascaded branch. Finally, we adopt a multi-stage loss function to supervise the training of the model. MCFNet is compared with 12 advanced methods on 6 public RGB-D datasets. The qualitative results show that the edge contours produced by the proposed method are clearer and closer to the ground truth. The quantitative results on every evaluation metric show that MCFNet outperforms the other compared methods. For example, on the SIP dataset, MCFNet reaches an E-measure of 0.932, exceeding all compared methods.

In this paper, we propose two different salient object detection models for RGB and RGB-D image
data, respectively. Experimental results show that the proposed algorithms achieve strong detection performance and provide effective model solutions for salient object detection in computer vision.
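
To make the F-measure figures quoted above concrete, the following is a minimal sketch of how such a score can be computed for a single saliency map. It is not the evaluation code used in this work. The function name `f_measure`, the fixed binarization threshold of 0.5, and the weighting `beta_sq = 0.3` (a convention common in the SOD literature) are assumptions for illustration; the abstract does not state which F-measure variant is reported.

```python
from typing import Sequence


def f_measure(
    pred: Sequence[Sequence[float]],
    gt: Sequence[Sequence[int]],
    threshold: float = 0.5,  # assumption: one fixed binarization threshold
    beta_sq: float = 0.3,    # assumption: weighting commonly used in SOD papers
) -> float:
    """F-measure of a predicted saliency map against a binary ground-truth mask.

    pred holds per-pixel saliency values in [0, 1]; gt holds 0/1 labels.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted_salient = p >= threshold
            if predicted_salient and g:
                tp += 1  # salient pixel correctly detected
            elif predicted_salient:
                fp += 1  # background pixel marked as salient
            elif g:
                fn += 1  # salient pixel missed
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    prediction = [[0.9, 0.1], [0.8, 0.2]]
    ground_truth = [[1, 0], [1, 0]]
    print(f_measure(prediction, ground_truth))
```

A score of 1 corresponds to a binarized prediction that matches the ground-truth mask exactly, which is why values such as 0.947 are described as close to the ideal. Benchmark protocols usually aggregate over many thresholds and images, which this sketch does not do.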