Ranking Video Salient Object Detection

Posted on: 2021-10-09  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Yan  Full Text: PDF
GTID: 2518306548483694  Subject: Software engineering
Abstract/Summary:
Video salient object detection aims to find the objects that attract visual attention in each frame of a video. In recent years it has attracted growing research interest because of its wide range of applications. However, the definition of a salient object in video has remained controversial. Most previous works use video object segmentation or motion tracking datasets as benchmarks, which ambiguously treat a single foreground object or a moving object as the salient object. This does not match how the human visual system judges saliency. Even when a dataset specifically targeting video saliency later appeared, its ground truth was obtained after labelers split the video into static frames, ignoring the important temporal information in videos. Moreover, all video saliency works so far distinguish objects only as salient or non-salient, which does not adequately reflect the saliency information in videos. To address these problems, we make three contributions. (1) We propose a completely new definition of salient objects in videos, ranking salient objects, which uses the ratio of the number of eye fixation points that land on each object in each frame to measure the relative saliency rank among objects. Under this definition, multiple objects in the same frame have different saliency, and the saliency of the same object is no longer static across frames but changes with context. Based on this definition, we construct a ranking video salient object dataset (RVSOD). In this dataset, different gray values reflect the different saliency ranks of objects, which provides a standard for our new definition. (2) Based on this dataset, we propose two deep-learning video saliency algorithms. The first is a traditional video salient object detection algorithm based on spatio-temporal feature extraction: a multi-scale feature extraction module and a bidirectional ConvLSTM module are designed to extract saliency information from videos. The second constructs a two-branch multi-task network in which eye fixation points are added to assist the detection of ranking video salient objects. (3) The method for traditional video salient object detection outperforms state-of-the-art approaches on multiple standard datasets. Furthermore, we hope our approach to ranking video salient object detection will serve as a baseline and open a conceptually new line of research in video saliency.
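The ranking definition in the abstract (fixation counts per object turned into a relative rank and encoded as gray values) can be illustrated with a minimal sketch. This is not the thesis's actual labeling pipeline. The function name `rank_saliency`, the integer instance map with 0 as background, the choice to normalize by the number of fixations that land on any object, and the linear gray-value spacing are all assumptions made for illustration.

```python
import numpy as np


def rank_saliency(instance_map, fixations):
    """Rank the objects in one frame by their share of eye fixations.

    instance_map: 2D integer array, 0 = background, k > 0 = object id
                  (hypothetical input format).
    fixations:    iterable of (row, col) fixation coordinates.
    Returns (ratios, gray_map): per-object fixation ratio and a uint8
    map in which a brighter gray value means a higher saliency rank.
    """
    labels = [int(l) for l in np.unique(instance_map) if l != 0]
    counts = {l: 0 for l in labels}

    # Count the fixation points that land on each object.
    for r, c in fixations:
        label = int(instance_map[r, c])
        if label != 0:
            counts[label] += 1

    # Assumption: the ratio is taken over fixations that hit any object.
    total = sum(counts.values())
    ratios = {l: (counts[l] / total if total else 0.0) for l in labels}

    # Higher ratio gives a higher rank; ties fall back to label order.
    ordered = sorted(labels, key=lambda l: ratios[l], reverse=True)

    # Assumption: gray values are spaced linearly over the ranks.
    gray_map = np.zeros(instance_map.shape, dtype=np.uint8)
    n = len(ordered)
    for rank, label in enumerate(ordered):
        if counts[label] == 0:
            continue  # objects with no fixations stay at gray value 0
        gray_map[instance_map == label] = 255 * (n - rank) // n

    return ratios, gray_map
```

Applying the function to each frame independently lets the same object receive a different gray value in different frames, which is the context-dependent behavior the definition describes.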
Keywords/Search Tags: Computer Vision, Video Saliency, Deep Learning, Multi-task Network