Image salient object detection originates from the human visual system, which helps people locate the parts of interest and selectively ignore the irrelevant parts of a complex scene. RGB-D salient object detection has recently gained great popularity in computer vision. The RGB information and the corresponding depth information in RGB-D images complement each other, and the appropriate use of spatial information can represent objects in the scene more accurately. However, owing to the inherent differences between RGB and depth information, simple aggregation of multi-modal features is not effective for detecting salient objects, and the inconsistency between multi-modal features further increases the difficulty of the task. In addition, how to fully extract and accurately locate the effective information in RGB and depth images remains an open problem. In this thesis, two RGB-D salient object detection models are designed to address these problems.

First, we propose a complementary attention and adaptive integration network, a novel RGB-D salient object detection model that optimizes feature localization and handles the inconsistency of multi-modal features. The algorithm mainly comprises two modules: a context-aware complementary attention module and an adaptive feature integration module. The context-aware complementary attention module consists of three components: a feature interaction component that captures local context information through dense connections, a complementary attention component that refines the features and effectively suppresses noise interference, and a global-context component that uses global features to guide and complete the details. The output of the context-aware complementary attention module is then fed to the adaptive feature integration module, which adaptively fuses the RGB and depth features at different levels according to the contribution of each branch to the saliency
detection, with the best fusion state achieved through self-optimization. Extensive experiments on five challenging benchmark datasets demonstrate that the proposed model detects salient objects effectively, and extensive ablation studies confirm the effectiveness of the two proposed modules.

In addition, we propose a depth-induced multi-level feature interaction network, a novel RGB-D salient object detection model that optimizes feature extraction and handles the inconsistency of multi-modal features. The algorithm mainly consists of two modules: a cross-modal feature fusion module and a cross-level feature interaction module. The cross-modal feature fusion module adopts an asymmetric network and enhances the RGB features under the guidance of the depth features to resolve the inconsistency of cross-modal data. Its output is then fed to the cross-level feature interaction module, which fuses features hierarchically in a dense, interwoven manner to extract complementary information. Experiments on five public datasets demonstrate that the model improves detection performance effectively, and extensive ablation studies confirm the effectiveness of the two proposed modules.
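The idea of fusing the two branches "according to the contribution of each branch" can be illustrated with a minimal NumPy sketch. This is a toy formulation under our own assumptions, not the thesis's actual adaptive feature integration module: the function name `adaptive_fuse`, the scalar gains `w_rgb`/`w_depth`, and the softmax over globally pooled responses are hypothetical stand-ins for the learned gating that the network would optimize end to end.

```python
import numpy as np

def adaptive_fuse(rgb_feat, depth_feat, w_rgb, w_depth):
    """Fuse RGB and depth feature maps with adaptive branch weights.

    Each branch is scored from its global average response (a crude
    proxy for its contribution to saliency), and the scores are
    normalised with a softmax so the fusion weights sum to one.
    """
    # Global average pooling: one scalar summary per branch.
    s_rgb = w_rgb * rgb_feat.mean()
    s_depth = w_depth * depth_feat.mean()
    # Softmax turns the two scores into normalised fusion weights.
    scores = np.array([s_rgb, s_depth])
    scores = scores - scores.max()  # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of the two feature maps.
    return weights[0] * rgb_feat + weights[1] * depth_feat

# Toy inputs: a uniformly strong RGB map and a silent depth map,
# so the RGB branch receives the larger fusion weight.
rgb = np.ones((4, 4))
depth = np.zeros((4, 4))
fused = adaptive_fuse(rgb, depth, w_rgb=1.0, w_depth=1.0)
```

In a trained network the pooled scores would come from learned layers rather than fixed scalars, but the softmax normalisation is a common way to keep the per-level fusion weights comparable across branches.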
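The cross-modal step of "enhancing the RGB features under the guidance of the depth features" can likewise be sketched in a few lines. Again this is an assumed simplification, not the thesis's cross-modal feature fusion module: `depth_guided_enhance` and the sigmoid attention mask are hypothetical, and the residual re-weighting is one common pattern for letting depth guide RGB without discarding the original RGB information.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_guided_enhance(rgb_feat, depth_feat):
    """Enhance RGB features under depth guidance.

    The depth map is squashed into a spatial attention mask in (0, 1),
    and the RGB features are re-weighted by it with a residual
    connection so that no original RGB response is lost.
    """
    attn = sigmoid(depth_feat)       # spatial attention derived from depth
    return rgb_feat * (1.0 + attn)   # residual re-weighting of RGB

# Toy inputs: with a zero depth map the mask is uniformly 0.5,
# so every RGB response is scaled by 1.5.
rgb = np.full((4, 4), 2.0)
depth = np.zeros((4, 4))
enhanced = depth_guided_enhance(rgb, depth)
```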