| Salient Object Detection (SOD) automatically captures the most attractive objects or regions in a scene by simulating human visual perception mechanisms, and then segments these salient objects or regions in the form of binary maps. In the current era, processing huge amounts of image, text, speech, and video data faster and more efficiently has become an important research topic. Salient object detection can help people quickly capture valuable information from massive data and filter out invalid information. In addition, SOD is widely used in many fields, such as intelligent surveillance, web retrieval, object tracking, and image editing. Depth acquisition devices have also made great progress, and many convenient depth sensors, such as the Kinect camera and the Huawei Mate30, are widely used in computer vision, especially in 3D vision. Since a depth map can provide depth information including spatial information, edge details, and 3D distribution, it has been introduced into saliency detection, giving rise to RGB-D Salient Object Detection (RGB-D SOD). Although current saliency detection models achieve good results on simple scenes, shortcomings remain when they face complex scenes and images of uncertain quality. This thesis mainly investigates RGB-D SOD. It proposes three RGB-D SOD methods based on an analysis of existing models, and verifies their effectiveness through quantitative and qualitative comparisons and extensive ablation experiments. The main contributions and innovations of this thesis are as follows: (1) An encoder-steered multi-modality feature guidance network for RGB-D salient object detection is proposed. The method constructs a multi-modality bidirectional cyclic interaction module with spatial attention and channel attention for efficient multimodal feature fusion. The interaction module is embedded into the feature decoder of the 
dual stream to obtain deeper-level multi-modality information. In addition, a deep-level feature-guided multi-scale decoder is designed to generate the predicted saliency maps. Experimental results show that the proposed model achieves advanced performance compared with current mainstream RGB-D salient object detection algorithms. (2) A global contextual exploration network for RGB-D salient object detection is proposed. The algorithm constructs a global high-level semantic parsing mechanism at both the micro and macro levels to achieve multimodal feature fusion and multi-scale feature learning. From the microscopic, single-scale, fine-grained perspective, a multi-modality contextual feature module is designed to capture information from a larger receptive field via stacked convolution operations. From the macroscopic perspective of multi-level feature aggregation, a dense top-down feature integration operation is constructed to enable the model to focus on higher-level features. Experiments show that the model outperforms other state-of-the-art (SOTA) RGB-D salient object detection methods. (3) A multi-modality and hierarchy-aware decision network for RGB-D salient object detection is proposed. The method applies edge detection to low-level features to assess the quality of depth images, then generates a quality score to measure that quality. High-level features with rich semantic information are used to obtain a region-aware attention mask, which characterizes salient regions and localizes them roughly. Through this discriminative multi-level feature resolution mechanism, the model effectively suppresses the negative impact of low-quality depth maps and achieves state-of-the-art performance. Figure [32] Table [18] Reference [134]... |