Research On RGB-D Salient Object Detection Based On Cross-modal Fusion

Posted on: 2024-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Shi
GTID: 2568307172481874
Subject: Control Science and Engineering
Abstract/Summary:
Salient object detection aims to simulate the human visual attention mechanism in order to detect and locate the most attention-grabbing region of an image. As a pre-processing step in computer vision, it is widely used in image retrieval, object recognition, semantic segmentation, and other tasks. Although salient object detection based on RGB images has made significant progress in recent years, detection quality against complex backgrounds still leaves room for improvement. With the rapid development and adoption of depth sensors, depth maps are now widely used as supplementary information to RGB images, and RGB-D salient object detection has gradually become a research focus. It is therefore of great significance to study how to improve the quality of the depth map used by a saliency model, how to process high-level and low-level features differently, and how to effectively extract and integrate the two types of cross-modal features, RGB and depth. This thesis takes RGB-D salient object detection as its subject, focuses on cross-modal feature extraction and effective fusion at all levels, and proposes two salient object detection models.

First, to address the problem that previous methods adopt the same feature fusion strategy at every level, an RGB-D salient object detection model based on multi-level feature fusion is proposed. Specifically, the model adopts a two-stream network architecture to extract RGB and depth features at different levels. To reduce the negative impact of low-quality depth maps on detection, a depth enhancement module processes the extracted depth features at each level. Because global and local features contribute differently at different levels, separate fusion strategies are designed for the high and low levels, so that the features of the two modalities are integrated level by level from the top down. In comprehensive experiments against seven advanced models on five public datasets, the model shows good performance.

Second, to address the problem of processing cross-modal global context information and fusing features in salient object detection, an RGB-D salient object detection model based on global awareness and adaptive modal fusion is proposed. To provide multi-scale features in the feature fusion stage, a global awareness module is proposed that enhances the global context through pooling operations at different scales. An adaptive fusion module is then designed to select and fuse cross-modal features. Considering that the fused features of different levels may contain complementary information, a decision-level fusion module is proposed, which further optimizes the fused features by grouping and aggregating the hierarchical features to improve detection accuracy. Comprehensive experiments against ten advanced models on six datasets demonstrate the superiority and robustness of the proposed model.
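
As a rough illustration of what adaptively fusing RGB and depth features can look like, the sketch below blends two same-sized feature maps with a single learned-style gate. It is not the thesis's module: the abstract does not say how the fusion weights are computed, so the gate formula (a sigmoid of the difference between the two maps' global averages) and the names `global_average` and `adaptive_fuse` are assumptions made only for this example.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def global_average(feature):
    """Mean over all spatial positions of a 2D feature map (list of rows)."""
    values = [v for row in feature for v in row]
    return sum(values) / len(values)


def adaptive_fuse(rgb, depth):
    """Blend two same-sized 2D feature maps with one scalar gate.

    ASSUMPTION: the gate is a sigmoid of the difference between the maps'
    global averages. This is a hypothetical stand-in for a learned weight,
    not the formula used in the thesis.
    """
    gate = sigmoid(global_average(rgb) - global_average(depth))
    fused = [
        [gate * r + (1.0 - gate) * d for r, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]
    return fused, gate


# Toy 2x2 feature maps: the RGB response is stronger than the depth response,
# so the gate leans toward the RGB features.
rgb_map = [[2.0, 2.0], [2.0, 2.0]]
depth_map = [[0.0, 0.0], [0.0, 0.0]]
fused_map, gate_value = adaptive_fuse(rgb_map, depth_map)
```

In a trained network the gate would typically be produced by learned layers and could vary per channel or per pixel; a single scalar is used here only to keep the weighting idea visible.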
Keywords/Search Tags: RGB-D salient object detection, Cross-modal feature, Convolutional neural network, Feature fusion