Font Size: a A A

Research On Fully Supervised Saliency Detection And Weakly Supervised Saliency Detection

Posted on:2021-04-04Degree:MasterType:Thesis
Country:ChinaCandidate:H S ZhangFull Text:PDF
GTID:2428330611951606Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
Saliency detection aims to simulate the human visual system to highlight the most significant object in the image.Taken as a pre-processing step,saliency detection effectively helps digital image processing and plays an important role in many practical applications.In recent years,driven by the remarkable success of deep convolution neural network,saliency detection based on deep convolution neural network has made a lot of valuable progress.However,salient object detection is still a problem worthy of further study.On the one hand,most of the existing algorithms improve the performance by aggregating the multi-scale convolution features.How to extract the effective features reasonably and obtain the clear boundary of the saliency target remains key problems.On the other hand,saliency detection algorithms based on deep convolution neural network usually needs abundant data with pixellevel annotation for training.Since it is a waste of time and labor to annotate images with pixellevel ground-truth,researchers attempt to exploit higher-level supervision to train the deep convolution neural network.It is challenging to cut the salient objects accurately in weak supervision settings.To solve the first problem,we propose a prediction-optimization framework to detect salient objects.It uses pixel-level annotations to supervised the predicted saliency maps.The algorithm consists of a saliency prediction network and a saliency optimization network.The saliency prediction network estimates the saliencies by combining the low-level structural features and the high-level context information.The high-level feature helps locate the salient areas,while the low-level feature captures the detailed information for correct saliency prediction.The saliency optimization network is based on the encoder-decoder structure.It optimizes the saliency maps by learning the residual between the saliency prediction and the labeled ground-truth.The proposed algorithm gradually generates saliency maps with highquality boundaries.To solve the second problem,we propose a two-stage learning framework to extract salient objects from a variety of weakly supervised annotations.In the first stage,we design a classification network and caption generation network for category classification and caption generation respectively,while highlighting the potential significant regions.In the second stage,we create two complementary training datasets,including the natural image dataset with noise and the synthesized Web image dataset.The images are pixel-level annotated using the classification network and caption generation network.The natural image dataset makes the saliency prediction network adapt to the natural image input.The synthesized Web image dataset provides accurate pixel-level ground-truth for the saliency prediction network.The two algorithms proposed in this paper,both the fully supervised saliency detection algorithms and the weakly supervised saliency detection algorithms,are tested and evaluated on multiple public saliency datasets.They are quantitatively and qualitatively compared with a variety of state-of-the-art algorithms.Experiments show the superiority of the proposed algorithms.
Keywords/Search Tags:Salient Object Detection, Full Supervision, Weak Supervision, Multi-label
PDF Full Text Request
Related items