
Salient Object Detection And Application Based On Pyramid Vision Transformer Gated Network

Posted on: 2024-09-29 | Degree: Master | Type: Thesis
Country: China | Candidate: X L Zhou | Full Text: PDF
GTID: 2568307082479834 | Subject: Electronic information
Abstract/Summary:
With the rapid development of the new media industry, image and video acquisition devices have spread all over the world, and visual data has become larger and more complex. Analyzing such a large amount of data quickly and finding the key information in it requires the assistance of computers. Limited computing resources are allocated preferentially to the important regions of an image: the important content is extracted and segmented from the digital image, large amounts of unimportant background information are discarded, and the burden of collecting useful information is reduced. This task is called salient object detection. Salient object detection improves the efficiency of information processing and reduces the amount of computation. In recent years, the demand for computer vision applications has kept rising, so further exploration of salient object detection algorithms has both theoretical significance and engineering value for the development of computer vision. The main work of this thesis is as follows:

(1) Salient Object Detection Based on Pyramid Vision Transformer Gated Network (PVT-Gate). Recently, vision transformers have started to show impressive results that significantly outperform large convolution-based models. In this work, a gated network based on the Pyramid Vision Transformer is proposed for salient object detection. The thesis explores the Pyramid Vision Transformer (PVT), with its self-attention mechanism, as a backbone network for learning global and local representations. To restore more detail in the saliency map, multilevel gate units are used to build cooperation among features at different levels and to improve the discriminability of the whole network. With the help of these gate units, valuable context information from the encoder can be transmitted effectively to the decoder. The pyramid pooling module (PPM) collects high-level semantic information. Moreover, the semantic information of each level is integrated and decoded by the feature aggregation decoder (FAD). Experimental results on five challenging benchmark datasets show that the method outperforms current state-of-the-art methods on all four evaluation metrics without any pre-processing or post-processing.

(2) Application of an Intelligent Matting System Based on Salient Object Detection. To further demonstrate the practical effectiveness of salient object detection, this thesis combines the salient object detection algorithm based on the PVT gated network with a matting algorithm to build an intelligent matting system. The salient object detection algorithm automatically recognizes the most visually attractive objects in the image and segments them accurately to obtain a foreground-background separation mask. In the matting stage, the mask and the original image are then combined to generate the final matting result, in which the foreground remains visible and the background is overlaid. During operation the system does not rely on a trimap at all, so it achieves truly automatic end-to-end matting. It removes the manual annotation that traditional matting systems require and improves matting performance.
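The matting stage described in (2) can be illustrated with a minimal sketch. It assumes standard per-pixel alpha compositing, in which the saliency mask serves as the alpha channel; the abstract does not state the exact matting computation used in the thesis, so the function name `composite` and the toy arrays below are hypothetical and are not the author's implementation.

```python
import numpy as np


def composite(image, mask, background):
    """Blend `image` over `background` using a soft mask as the alpha channel.

    Assumption (not from the thesis): out = alpha * image + (1 - alpha) * background.
    image, background: H x W x 3 uint8 arrays; mask: H x W array with values in [0, 1].
    """
    # Clamp the mask and add a channel axis so it broadcasts over RGB.
    alpha = np.clip(mask.astype(np.float32), 0.0, 1.0)[..., None]
    blended = alpha * image.astype(np.float32) + (1.0 - alpha) * background.astype(np.float32)
    return blended.round().astype(np.uint8)


# Toy 2 x 2 example: a uniform gray "photo" placed over a black background.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[1.0, 0.0],
                 [0.5, 0.0]], dtype=np.float32)  # stand-in for a predicted saliency map

result = composite(image, mask, background)
```

Pixels where the mask is 1 keep the original image, pixels where it is 0 show the new background, and intermediate values blend the two. This is why a soft saliency map can take the place of a hand-drawn trimap in a sketch like this one.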
Keywords/Search Tags: Salient Object Detection, Pyramid Vision Transformer, Gated Unit, Intelligent Matting