
Research On Salient Object Detection Algorithm Of Multi-source Images

Posted on: 2024-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: R W Wu
Full Text: PDF
GTID: 2568307055974689
Subject: Information and Communication Engineering
Abstract/Summary:
Salient object detection in multi-source images (i.e., RGB-D and RGB-T salient object detection) has been a research hotspot in computer vision in recent years. It aims to identify the most salient objects in a given scene by using multi-modal data (RGB images paired with depth maps, or RGB images paired with thermal images). Multi-source data are different representations of the same scene, so the modalities exhibit both similarities and differences. How to realize effective interaction between modalities and improve the utilization of their features is therefore crucial to detection performance. In addition, features of adjacent layers are partially similar yet complementary, and how to exploit this complementarity to mine the key cues of salient objects is another issue that urgently needs to be explored. Focusing on these problems, this thesis designs dedicated models for the RGB-D and RGB-T salient object detection tasks.

To address the insufficient multi-modal information interaction in RGB-D salient object detection, this thesis proposes a cross-modal hierarchical interaction network (HINet), which mainly consists of two modules: a cross-modal information exchange (CIE) module and a multi-level information progressively guided fusion (PGF) module. Specifically, the CIE module exchanges cross-modal features to learn shared representations and provides beneficial feedback that facilitates discriminative feature learning in each modality. The PGF module aggregates hierarchical features progressively with a reverse guidance mechanism, in which the high-level fused features guide the low-level feature fusion and thus improve detection performance. Extensive experiments show that the proposed model significantly outperforms nine existing state-of-the-art models on five challenging benchmark datasets, and extensive ablation studies verify the effectiveness of the two proposed modules.

For RGB-T salient object detection, this thesis proposes a new parallel symmetric network (PSNet), which focuses on aggregating the key salient cues from the two modalities to enhance the salient feature representation and thereby produce accurate detection results. Specifically, a cascaded aggregation module (CAM) is first developed, which accumulates and excavates valuable saliency semantics from the two modalities by cascading residual-based enhancement units, strengthening the feature representation. A parallel-symmetric fusion (PSF) module is then designed to integrate crucial saliency cues from adjacent layers for saliency prediction in a parallel and symmetric manner. In addition, to make full use of multi-level features, a guidance strategy is introduced that enhances the details of the saliency map with low-level features. Extensive experiments show that the proposed model significantly outperforms fifteen existing state-of-the-art models on three challenging benchmark datasets, and its superior performance on RGB-D salient object detection further demonstrates the generalization ability and robustness of the method.
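The abstract gives no implementation details, so the following PyTorch sketch shows one plausible minimal form of the two recurring ideas above: cross-modal feature exchange and reverse (high-to-low) guided fusion. All class names, the sigmoid-gating choice, and the layer configurations here are illustrative assumptions, not the thesis's actual HINet or PSNet code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalExchange(nn.Module):
    """Hypothetical sketch of a CIE-style exchange: each modality's feature
    is modulated by a gate derived from the other modality, with a residual
    path that preserves modality-specific cues (not the thesis's CIE code)."""
    def __init__(self, channels):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.gate_aux = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, f_rgb, f_aux):
        # Exchange: each branch is refined by a gate computed from the other.
        out_rgb = f_rgb + f_rgb * self.gate_aux(f_aux)
        out_aux = f_aux + f_aux * self.gate_rgb(f_rgb)
        return out_rgb, out_aux

class GuidedFusion(nn.Module):
    """Hypothetical sketch of reverse-guided fusion: an upsampled high-level
    fused feature gates the low-level feature before aggregation."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, f_low, f_high):
        # Upsample the high-level feature to the low-level spatial size.
        f_high_up = F.interpolate(f_high, size=f_low.shape[-2:],
                                  mode='bilinear', align_corners=False)
        # High-level semantics guide (gate) the detail-rich low-level feature.
        f_low_guided = f_low * torch.sigmoid(f_high_up)
        return self.fuse(torch.cat([f_low_guided, f_high_up], dim=1))
```

Both sketches use the common gating-plus-residual pattern, which matches the abstract's emphasis on exchanging shared information while preserving modality-specific and low-level detail cues.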
Keywords/Search Tags: RGB-D salient object detection, RGB-T salient object detection, cross-modal interaction, feature fusion