Salient object detection (SOD) aims to identify the most visually prominent regions in images by simulating the human visual attention mechanism. Benefiting from the powerful feature extraction capability of neural networks, deep learning-based salient object detection models can extract both high-level semantic features and low-level spatial details. However, how best to refine and fuse the extracted multi-level features (i.e., multi-level feature refinement and multi-level feature fusion) remain two key issues for deep salient object detection. In this thesis, taking into account some challenges of existing deep models, two different salient object detection models are proposed to address these issues. The detailed contents are as follows:

(1) To take full advantage of the extracted multi-level features, we further explore the interrelations between different features and present a new salient object detection model (MGuid-Net) based on multiple guidance mechanisms. Since boundary information is beneficial for locating and sharpening salient objects, edge features are used in our network together with saliency features for SOD. In terms of overall architecture, MGuid-Net mainly consists of three modules: a self-guidance module, a cross-guidance module, and an accumulative guidance module. The self-guidance and cross-guidance modules are used for multi-level feature refinement, while the accumulative guidance module fuses the refined multi-level features. Specifically, to refine the extracted features, a self-guidance module is applied separately to the multi-level saliency features and to the multi-level edge features, gradually delivering high-level semantic information to low-level spatial features via layerwise guidance. Additionally, a cross-guidance module is used between saliency features and edge features to exploit their complementarity. To better integrate the refined multi-level features, an accumulative guidance module with a hierarchical structure is introduced
into MGuid-Net. It makes full use of high-level semantics and low-level spatial details by stacking and reusing the features of each level in a hierarchical manner, so that the fused features contain more complete saliency information. In addition, a new pixelwise contrast loss function is adopted, which acts as implicit guidance to help our network capture more precise saliency features. Extensive experiments on five benchmark datasets demonstrate that our model identifies salient regions of an image more effectively and is more robust in challenging scenarios.

(2) By investigating different attention mechanisms and analyzing some challenges of existing deep salient object detection models, we propose an effective and efficient attention network, named EEAN, that detects salient objects with a classical encoder-decoder architecture. Unlike MGuid-Net, EEAN integrates feature refinement and multiscale feature fusion into the backbone network to form a stronger encoder, which simplifies the overall design of the model and reduces its computational complexity. Concretely, the EEAN encoder is composed of two core components: a strong adaptive convolutional attention and a gated convolutional feed-forward network. The former enhances the spatial, channel, and branch adaptability of the network by introducing spatial local convolution, a multi-branch structure, and channel convolution into convolutional attention, so that the EEAN encoder can adaptively select the most suitable features at different network stages. The latter further improves the feature representation ability of EEAN by combining spatial and channel projections. The EEAN decoder adopts a channel-like attention to adaptively fuse the multi-level features extracted from different stages. Extensive experimental results show that EEAN detects salient objects more effectively than most state-of-the-art attention models.

In this thesis, two effective deep models, MGuid-Net and EEAN, are proposed
to address the two key issues of deep salient object detection (i.e., feature refinement and feature fusion). MGuid-Net, based on feature guidance, takes full advantage of the interrelations between different features, strengthening the network's boundary refinement and object localization. EEAN, based on the attention mechanism, boosts the adaptability of the model in challenging scenarios by introducing a convolutional attention that contains multiple information interactions. Meanwhile, a novel pixelwise contrast loss function is designed to further enhance the robustness of the model. Compared with recent deep salient object detection methods, the proposed methods prove effective to a certain extent and offer important ideas for the design of future salient object detection models.
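
As a rough illustration of the adaptive multi-level fusion idea mentioned for the EEAN decoder, the following minimal sketch weights each feature level per channel and sums the result. It is not the implementation used in this thesis: the abstract does not specify the channel-like attention, so the sketch assumes global average pooling as the per-channel descriptor and a softmax across levels as the weighting, and it assumes all levels have already been resized to the same shape. The names `global_avg`, `softmax`, and `fuse_levels` are hypothetical.

```python
import math


def global_avg(channel):
    """Mean of a 2D feature map given as a list of rows."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    s = sum(exps)
    return [e / s for e in exps]


def fuse_levels(levels):
    """Fuse multi-level features with per-channel weights across levels.

    levels[l][c] is an H x W map (list of rows). All levels are assumed
    to share the same number of channels and the same spatial size.
    This weighting scheme is an assumption made for illustration only.
    """
    num_channels = len(levels[0])
    fused = []
    for c in range(num_channels):
        # One scalar descriptor per level for this channel (assumed: global mean).
        scores = [global_avg(level[c]) for level in levels]
        # Weights across levels sum to 1 for this channel.
        weights = softmax(scores)
        height = len(levels[0][c])
        width = len(levels[0][c][0])
        fused_map = [
            [
                sum(wt * level[c][i][j] for wt, level in zip(weights, levels))
                for j in range(width)
            ]
            for i in range(height)
        ]
        fused.append(fused_map)
    return fused
```

With two levels, calling `fuse_levels([low, high])` returns one fused map per channel; under the assumptions above, levels whose channel response is stronger on average contribute more to the fused result.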