
Feature Fusion For RGB-T Image Salient Object Detection

Posted on: 2021-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: T L Xiao
Full Text: PDF
GTID: 2518306050968919
Subject: Control theory and control engineering
Abstract/Summary:
Image salient object detection aims to quickly and accurately locate the content of interest in large collections of images, that is, to detect the most visually attractive objects in different scenes. Owing to its theoretical research value and practical application significance, salient object detection has attracted wide attention in image processing, computer vision, pattern recognition, artificial intelligence, and related fields. Although many RGB-based saliency detection algorithms have recently shown the capability of segmenting salient objects from an image, they still perform unsatisfactorily in complex scenarios such as insufficient illumination, cluttered backgrounds, and noise interference. To overcome this problem, this thesis studies RGB-T saliency detection, which exploits the robustness of thermal infrared cameras against changes in illumination and weather, using thermal infrared images to provide complementary information for RGB images. On this basis, the thesis proposes a new RGB-T image salient object detection algorithm. The main contributions of this work are as follows:

First, RGB images are sensitive to environmental factors such as illumination and weather, yet current RGB-T image salient object detection methods cannot adaptively fuse complementary multi-modality information, and the rich contextual information learned by deep convolutional network models has not been fully exploited. Considering these issues, this thesis studies a series of feature fusion problems underlying RGB-T saliency and proposes a novel feature-fusion-based RGB-T image salient object detection approach. The three studied feature fusion problems are multi-scale feature fusion, multi-modality feature fusion, and multi-level feature fusion.

For multi-scale feature fusion, a hybrid pooling-atrous (HPA) module is proposed to capture multi-scale contextual information in each single-modality feature learning branch. By enlarging the receptive fields of features while capturing stronger local details, the multi-scale contextual features obtained by this module gain stronger representational ability and better spatial consistency.

For multi-modality feature fusion, a complementary weighting (CW) module is designed to adaptively fuse complementary information from multi-modality features. Compared with existing multi-modality feature fusion strategies, the proposed module learns content-dependent weight maps that measure the credibility of the feature maps across different modalities from a global view, so that complementary information related to salient object areas is fused as far as possible.

After multi-modality features are fused at each level, the fused features must be integrated to generate saliency maps with accurate semantics and fine boundaries. A semantic guidance (SG) module is therefore designed to screen out superfluous information in the low-level features and obtain semantic-aware low-level features: the global semantic features from the deepest network layer gate the forward flow of the low-level features, so that useful information is transmitted and superfluous information is suppressed. The final saliency map is then obtained from the multi-level feature fusion results.

Finally, the proposed algorithm is implemented in Python on the TensorFlow deep learning framework, and an NVIDIA GTX 1080 Ti GPU is used for training and testing. Comprehensive comparative experiments are conducted on multiple public RGB-T image datasets, and the proposed algorithm is compared with state-of-the-art image salient object detection algorithms in terms of both objective evaluation metrics and subjective inspection of the predicted saliency maps. The experimental results demonstrate that the proposed approach outperforms other state-of-the-art methods by a large margin, especially in complex scenarios such as insufficient illumination, cluttered backgrounds, and noise interference. The proposed algorithm fuses complementary multi-modality information more effectively, produces salient object areas with better consistency, and suppresses background interference more effectively.
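The idea behind the HPA module can be illustrated with a minimal NumPy sketch: several parallel atrous (dilated) convolution branches with different rates enlarge the receptive field, and a global-average-pooling branch adds image-level context. The function names, the single-channel kernel, and the dilation rates here are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """'Same'-padding 2D convolution with dilation `rate` on one channel."""
    kh, kw = kernel.shape
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1  # effective kernel extent
    H, W = x.shape
    ph, pw = eh // 2, ew // 2                          # zero-pad to keep size
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + eh:rate, j:j + ew:rate]   # dilated sampling grid
            out[i, j] = np.sum(patch * kernel)
    return out

def hpa_like(x, kernel, rates=(1, 2, 4)):
    """Hypothetical HPA-style block: parallel atrous branches plus a
    global-average-pooling branch, stacked along a new channel axis."""
    branches = [dilated_conv2d(x, kernel, r) for r in rates]
    gap = np.full_like(x, x.mean(), dtype=float)       # broadcast global context
    return np.stack(branches + [gap], axis=0)          # (len(rates)+1, H, W)
```

In a real network each branch would have learned multi-channel kernels and the stacked branches would pass through a further convolution; the sketch only shows how different dilation rates sample increasingly wide contexts from the same feature map.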
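The CW module's adaptive fusion can likewise be sketched: each modality's feature map is reduced to a global credibility score, the scores are normalized into weights that sum to one, and the modalities are fused as a convex combination. In the thesis these weights are learned and content-dependent; here the global mean activation stands in for the learned score, which is purely an assumption for illustration.

```python
import numpy as np

def softmax(z, axis=0):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def complementary_weighting(f_rgb, f_t):
    """Hypothetical CW-style fusion: derive a credibility weight per modality
    from a globally pooled response, then fuse the two feature maps as a
    convex combination so their contributions always sum to one."""
    scores = np.array([f_rgb.mean(), f_t.mean()])  # global-view credibility
    w_rgb, w_t = softmax(scores)
    return w_rgb * f_rgb + w_t * f_t
```

Because the weights are normalized, a modality whose features respond weakly (e.g. RGB at night) is automatically down-weighted in favour of the other.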
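Finally, the gating behaviour of the SG module can be shown in a few lines: the deepest semantic features produce a gate in (0, 1) that scales the low-level features, passing useful detail and suppressing superfluous responses. The sigmoid gate and the assumption that the deep features have already been upsampled to the low-level resolution are illustrative choices, not details confirmed by the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def semantic_guidance(low_feat, deep_feat):
    """Hypothetical SG-style gating: global semantic features gate the forward
    flow of low-level features elementwise. Strongly positive semantic
    responses pass detail through; strongly negative ones suppress it."""
    gate = sigmoid(deep_feat)      # gate values in (0, 1)
    return gate * low_feat
```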
Keywords/Search Tags:RGB-T image salient object detection, feature fusion, multi-scale, multi-modality, multi-level