Research On Video Object Segmentation Based On Supervoxel Pooling

Posted on: 2020-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2428330623455811
Subject: Signal and Information Processing

Abstract/Summary:
With the continuous development of modern information technology, multimedia has become a main source of information for people. Video carries rich and diverse visual information and is one of the principal carriers of multimedia content. The volume of video has recently grown explosively, which poses great challenges for video processing and analysis. To meet the needs of large-scale video processing and analysis, video data must be processed effectively and the information people care about must be extracted. What viewers are usually most interested in is the object, so extracting objects from video helps with processing and analysis; this is the motivation for video object segmentation.

The task of video object segmentation is to separate the object from the background at the pixel level in every frame of a video sequence and to recover an accurate object boundary. As a fundamental problem in computer vision, video object segmentation provides a basis for behavior analysis, video retrieval, pose estimation, and video summarization, and it can be applied in autonomous driving, intelligent surveillance, augmented reality, and many other fields. Because of its academic and practical value, many research institutions and scholars have studied it in depth.

Video object segmentation remains challenging because of the diversity of objects, complex backgrounds, occlusions, and poor shooting conditions. Mainstream segmentation methods focus on the semantic information of objects in the video but ignore the spatio-temporal structure of the video sequence. In complex scenes, especially when non-foreground regions resemble the object, this leads to inaccurate segmentation along boundaries with weak discriminability. Neglecting the spatio-temporal structure of the video therefore reduces the accuracy and robustness of video object segmentation in natural scenes.

To address these problems, this thesis focuses on obtaining better segmentation along boundaries with weak discriminability. The contributions are as follows:

(1) A video supervoxel extraction method based on boundary enhancement is proposed. The method uses the boundaryness of supervoxels to raise the likelihood of selecting supervoxels that lie on the object boundary, so that the extracted supervoxels carry better boundary information and supervoxel extraction becomes more accurate. Experimental results on SegTrack show that the proposed supervoxel extraction algorithm improves the boundary segmentation quality of supervoxels to a certain extent.

(2) A video object segmentation method based on supervoxel pooling is proposed. To preserve the spatio-temporal structure of the object in the video, supervoxels are integrated into a CNN-based segmentation model. The supervoxel features are fused with the CNN features of the video to improve segmentation along object boundaries and to increase overall segmentation accuracy. Compared with other segmentation methods, the proposed method improves contour accuracy (F) and region similarity (J) by 1.1% and 1.4%, respectively, on the DAVIS 2016 dataset.
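The abstract does not spell out how supervoxel pooling or the feature fusion is computed. As an illustration only, the sketch below assumes one common reading: CNN features are averaged over all voxels that share a supervoxel label, the averaged feature is written back to every voxel of that supervoxel, and the result is concatenated with the original CNN features along the channel axis. The function names `supervoxel_avg_pool` and `fuse_features`, the `(T, H, W, C)` array layout, and the choice of average pooling and concatenation are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def supervoxel_avg_pool(features, labels):
    """Average features over each supervoxel (assumed pooling rule).

    features: float array of shape (T, H, W, C), per-voxel CNN features.
    labels:   int array of shape (T, H, W), supervoxel ids starting at 0.
    Returns (pooled, smoothed):
      pooled   has shape (K, C), one mean feature per supervoxel id;
      smoothed has shape (T, H, W, C), each voxel replaced by the mean
               feature of the supervoxel it belongs to.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels).astype(np.int64)
    if features.ndim != 4 or features.shape[:3] != labels.shape:
        raise ValueError("features must be (T, H, W, C) and labels (T, H, W)")
    if labels.min() < 0:
        raise ValueError("supervoxel ids must be non-negative")

    flat_labels = labels.reshape(-1)
    flat_feats = features.reshape(-1, features.shape[-1])
    num_sv = int(flat_labels.max()) + 1

    # Number of voxels in each supervoxel.
    counts = np.bincount(flat_labels, minlength=num_sv)

    # Sum of features per supervoxel; np.add.at accumulates correctly
    # when the same id appears many times.
    sums = np.zeros((num_sv, flat_feats.shape[1]))
    np.add.at(sums, flat_labels, flat_feats)

    # Ids that never occur keep a zero feature instead of dividing by zero.
    pooled = sums / np.maximum(counts, 1)[:, None]

    # Broadcast each supervoxel mean back to its voxels.
    smoothed = pooled[flat_labels].reshape(features.shape)
    return pooled, smoothed


def fuse_features(features, labels):
    """Concatenate CNN features with supervoxel-pooled features (assumed fusion)."""
    features = np.asarray(features, dtype=np.float64)
    _, smoothed = supervoxel_avg_pool(features, labels)
    return np.concatenate([features, smoothed], axis=-1)
```

Because a supervoxel spans several frames, every voxel inside it receives the same pooled feature across time, which is one way the spatio-temporal structure mentioned above could be carried into a per-pixel segmentation head. Calling `fuse_features` on a `(T, H, W, C)` feature volume returns a `(T, H, W, 2C)` volume under these assumptions.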
Keywords/Search Tags: Video Object Segmentation (VOS), Boundary enhancement, Supervoxel pooling, Feature fusion, Convolutional Neural Networks (CNNs)