Image Saliency Detection And Application Based On U-shaped Network And Attention Mechanism

Posted on:2024-01-28

Degree:Master

Type:Thesis

Country:China

Candidate:T Wu

Full Text:PDF

GTID:2568307082979829

Subject:Electronic information

Abstract/Summary:

Salient object detection aims to extract the most visually distinctive objects or regions in a scene by using a computer to simulate the human visual attention mechanism. It is an important topic in computer vision and can be applied to a variety of downstream tasks and real-world scenarios. Building on a summary and analysis of the current state of research on saliency detection algorithms and deep learning models, this thesis explores and implements two image saliency detection algorithms that combine a U-shaped network with attention mechanisms, and designs two applications of saliency detection models.

1. Double U-shaped Progressive Refinement Network Based on Hierarchical Feature Processing and Attention Mechanism (DUPRNet)

Based on the U-shaped network, DUPRNet is proposed for saliency detection on color input images. In the encoder stage, the network uses Res2Net-50 as the backbone for feature extraction, extracting fine-grained, multi-level features in a bottom-up way. The Pyramid Pooling Module (PPM) mines the high-level semantic information in the top-level features of the backbone and provides semantic guidance for locating salient objects. The lateral output of each encoder layer in the backbone is enhanced hierarchically: the shallow features integrate context information through the Recurrent Criss-Cross Attention Module (RCCA), and the deep features are enhanced by multi-scale interaction of high-level semantic features based on Inverse Residual Convolution (IRC). In the decoder stage, a top-down, progressively refining decoding process integrates the enhanced features from all encoder levels with the semantic guidance information from the PPM, and the Progressive Refinement Module (PRM), which is based on foreground-background attention, refines the saliency prediction layer by layer. Architecturally, the backbone, the PPM, and the progressive refinement process based on foreground-background attention form the "outer U-shaped structure" of the network, while the hierarchical feature enhancement process and the top-down feature fusion process form the "inner U-shaped structure". Together, the two U-shaped structures give the network its double U-shaped structure. The effectiveness of DUPRNet for saliency detection is verified through comparison experiments with seven strong saliency detection algorithms on four publicly available datasets: ECSSD, PASCAL-S, DUT-OMRON and DUTS-test.

2. Lightweight U-shaped Saliency Detection Network Based on Multi-scale Attention (LUMANet)

Taking the U-shaped network as the basic architecture, LUMANet is proposed. In the encoder stage, Res2Net-50 is used as the feature extraction backbone to extract hierarchical fine-grained features. To reduce the number of model parameters, the lateral outputs of the backbone are first channel-reduced and refined locally by convolution. The features are then processed according to the characteristics of their level. For feature maps from shallow layers, the Multi-scale Context Spatial Attention Module (MCSA) is proposed, the purpose of which is to make full use of the local context information in shallow features and to enhance the local features. For feature maps from deeper layers, the Multi-scale Semantic Channel Attention Module (MSCA) is proposed to fuse multi-scale local context information and to globally enhance channel attention features. In the decoder stage, feature fusion is carried out with dense connections and top-down, layer-by-layer refinement to achieve parameter reuse and full-scale feature fusion. In the first two decoder layers nearest the network output, the Progressive Refinement Module (PRM) is used to enhance the salient features, yielding a prediction map with clear boundaries and a complete semantic structure. The model has only 19.06 M parameters in total. The effectiveness of LUMANet for saliency detection is verified through comparison experiments with seven strong saliency detection algorithms on four publicly available datasets: ECSSD, PASCAL-S, DUT-OMRON and DUTS-test.

3. Applications of saliency detection models

Two applications based on saliency detection are designed: foreground extraction and background transformation based on saliency detection, and image retrieval based on salient object detection.

In foreground extraction and background transformation, the core step is to use saliency detection to generate a saliency map. Binarizing the saliency map yields a foreground mask and a background mask, which are used to generate a foreground map from the input image and a background map from the target image, respectively. The final output image is then generated by combining the foreground map and the background map. The proposed models are tested in this application scenario to verify the role of the saliency detection model and the effect of its performance on foreground extraction and background transformation.

In image retrieval, the saliency detection model is embedded in the preprocessing stage of content-based image retrieval. Features are extracted, and similarity is computed, only on the generated salient regions, which narrows the range of feature extraction and improves retrieval performance. Experiments compare and analyze salient-object-detection-based image retrieval against content-based image retrieval to verify the role and application value of saliency detection in image retrieval.
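The foreground extraction and background transformation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the fixed threshold of 0.5, the function names `binarize` and `replace_background`, and the representation of images as nested lists of (R, G, B) tuples are all assumptions made for illustration, and the abstract does not state how the saliency map is binarized.

```python
def binarize(saliency, threshold=0.5):
    """Turn a saliency map (floats in [0, 1]) into a 0/1 foreground mask.

    The fixed threshold is an assumption; the source does not specify one.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in saliency]


def replace_background(source, target, saliency, threshold=0.5):
    """Keep salient pixels from `source`; take all other pixels from `target`.

    `source` and `target` are same-sized grids of (R, G, B) tuples, and
    `saliency` is a same-sized grid of floats in [0, 1].
    """
    fg_mask = binarize(saliency, threshold)
    output = []
    for src_row, tgt_row, mask_row in zip(source, target, fg_mask):
        # The background mask is the complement of the foreground mask, so one
        # per-pixel choice combines the foreground map and the background map.
        output.append(
            [src if m else tgt for src, tgt, m in zip(src_row, tgt_row, mask_row)]
        )
    return output


# Toy 2x2 example: the left column is salient, the right column is not.
source = [[(255, 0, 0), (255, 0, 0)], [(255, 0, 0), (255, 0, 0)]]
target = [[(0, 0, 255), (0, 0, 255)], [(0, 0, 255), (0, 0, 255)]]
saliency = [[0.9, 0.1], [0.8, 0.2]]
result = replace_background(source, target, saliency)
```

In this toy case the left column of `result` comes from `source` and the right column from `target`. A less accurate saliency map would leak background pixels into the foreground or cut parts of the object away, which is why the quality of the detector affects the composited output.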

Keywords/Search Tags:

Saliency detection, U-shaped network, Attention mechanism, Progressive refinement network, Multi-scale attention

Related items

1	Image Saliency Detection Based On Visual Attention Mechanism
2	Deep Learning Based Visual Saliency Detection
3	Multi-scale Features Fusion Network For Salient Object Detection
4	Research On Image Super-Resolution Network Based On Attention Mechanism
5	Multi-granularity Feature Representation Network Based Camouflaged Object Detection And Its Application
6	Target Detection And Classification Of High Resolution SAR Images With Multi-scale Deep Network And Visual Attention Mechanism
7	Video Saliency Detection Method Based On Visual Attention Mechanism
8	Research On Panoramic Segmentation Network Based On Feature Enhancement
9	Video Saliency Detection Based On Improved Attention Network And Data Augmentation
10	Research On Saliency Object Detection Algorithm Based On Feature Fusion And Attention Mechanism