Deep Learning-based Video Segmentation Via Multiple Granularity Analysis

Posted on:2019-04-12

Degree:Master

Type:Thesis

Country:China

Candidate:R Yang

Full Text:PDF

GTID:2428330590492341

Subject:Major in Electronic and Communication Engineering

Abstract/Summary:

Video segmentation aims to separate target objects of interest from a noisy background, and has received considerable attention across a wide range of computer vision applications, such as 3D reconstruction and video summarization. Numerous algorithms have been proposed during the past decade, with a focus on developing graphical models, e.g., Markov Random Fields (MRF) and Conditional Random Fields (CRF), to estimate target motion for each pixel (optical flow) or superpixel. Despite their favorable performance on several datasets, video segmentation still faces two main challenges. First, when graphical models are used to compute temporal consistency at the pixel or superpixel level, mismatched pairs often arise between consecutive frames. For example, the supervoxel algorithm models temporal consistency using the superpixels of each frame. The inaccuracy caused by mismatched superpixels inevitably accumulates frame by frame and eventually causes video segmentation algorithms to fail. We also note that building a superpixel model across several frames is computationally inefficient. Second, object-level motion estimated by visual tracking algorithms often contains noisy background, because tracking results take the form of bounding boxes that do not fit tightly around the target objects. As a result, video segmentation has benefited little from the recent progress of visual tracking algorithms.

To address these challenges, we present a novel framework that applies the multiple instance learning (MIL) algorithm to both the spatial and temporal domains for video segmentation. In contrast to most machine learning algorithms, which assign a label to every training instance, MIL assigns labels to bags of instances. In the binary case, a bag is labeled positive if at least one instance in it is positive, and negative if all of its instances are negative. MIL is able to classify instances with missing or noisy labels using the labeled bags as training data. This motivates us to apply the MIL algorithm to compute consistency in the temporal domain. For example, temporally adjacent and similar superpixels belong to the same label (i.e., foreground or background), since motion between consecutive frames cannot be too large. In the spatial domain, object-level motion estimated by visual tracking algorithms in the form of bounding boxes provides rich information for the video segmentation task, despite the partially noisy background inside the boxes. Building on state-of-the-art tracking algorithms, we enlarge the tracked bounding boxes appropriately to meet the requirements of applying MIL. We find that MIL handles the noisy background well and provides an accurate envelope of the true foreground object masks, which significantly facilitates video segmentation.

The proposed method can be regarded as a multi-granularity framework for the video segmentation problem, one that segments target objects from the background in a coarse-to-fine fashion. At the coarsest level (object), an off-the-shelf object tracker is applied to the whole video sequence, yielding a candidate volume of object bounding boxes. At the middle level (superpixel), we perform multiple instance learning within the candidate volume to obtain a coarse segmentation result. At the finest level (pixel), the segmentation mask is further refined via a graph-cut-like algorithm. We comprehensively evaluate our algorithm on two popular video segmentation datasets, SegTrack 2.0 [2] and the DAVIS dataset [1] released at CVPR 2016. The results demonstrate the superiority of our video segmentation method over state-of-the-art algorithms.
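
As a minimal illustration of the binary MIL bag-labeling rule stated in the abstract, the Python sketch below labels a bag positive when any of its instances is positive, and enlarges a tracked bounding box before it would be treated as a bag of superpixels. The function names, the `scale` parameter, and the (x0, y0, x1, y1) box layout are assumptions made for this example; they are not taken from the thesis, and the sketch does not reproduce the author's learning procedure.

```python
from typing import Sequence, Tuple

# Hypothetical box layout for this sketch: (x0, y0, x1, y1) in pixels.
Box = Tuple[float, float, float, float]


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Binary MIL rule: a bag is positive if at least one instance is positive."""
    return any(instance_labels)


def enlarge_box(box: Box, scale: float, width: float, height: float) -> Box:
    """Scale a box about its center, then clip it to the image bounds.

    `scale` is an illustrative parameter, not a value from the thesis.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * scale / 2
    half_h = (y1 - y0) * scale / 2
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(width, cx + half_w),
        min(height, cy + half_h),
    )


# A bag of superpixels with one foreground instance is positive.
print(bag_label([False, True, False]))
# A bag drawn entirely from the background is negative.
print(bag_label([False, False]))
# Enlarge a tracked box by an assumed factor of 1.5 inside a 100x100 frame.
print(enlarge_box((10, 10, 30, 30), 1.5, 100, 100))
```

Under this rule, enlarging the tracked box makes it more likely that the bag contains at least one true foreground instance, which is the condition a positive bag label relies on.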

Keywords/Search Tags:

Video Segmentation, MIL, Deep Learning, Multiple Granularity Analysis

Related items

1	System For Medical Image Analysis Using Deep Learning And Its Practice In Segmentation Of Gastroscope Video
2	Research On Several Problems Of Video Semantic Analysis
3	Research On Real-time Video Segmentation Technology Based On Deep Learning
4	Research On Scalable Video Coding
5	Research On Video Instance Segmentation Based On Deep Learning
6	Design And Implementation Of Sports Videos Analysis System Based On Deep Learning
7	Research On Image Multi-Granularity Referring Analysis
8	Research On Deep Learning Based Video Classification Technologies
9	Research And Implementation Of Person Re-identification System Based On Deep Learning
10	Video Object Segmentation Based On Deep Learning