Video Object Segmentation Based On Deep Learning

Posted on: 2022-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: W J Li
GTID: 2518306524485564
Subject: Master of Engineering
Abstract/Summary:
Video object segmentation aims to separate foreground objects from background pixels in a video and to assign binary labels to the foreground and background pixels. In recent years, video object segmentation has been widely used in various fields, and more and more strong algorithms have been proposed. In the semi-supervised setting, video object segmentation methods fall into two categories: matching-based and propagation-based. Matching-based methods use a similarity measure to map the labels of the first frame onto the pixels of the current frame. Propagation-based methods instead use a network to learn the deformation from the previous frame's mask to the current frame. Although both families use the label information of the first frame as a reference, both are affected to varying degrees by object deformation, object occlusion, lighting changes and similar factors, so the similarity between the current frame and the first frame or the previous frame cannot be guaranteed. As a result, they usually perform poorly on long videos. Building on theoretical study of the advantages and disadvantages of mask matching and mask propagation algorithms, this thesis proposes a new approach and adapts it to video object segmentation in airport ground surveillance scenes.

The main work of this thesis is as follows:

1. A new semi-supervised video object segmentation algorithm is proposed. By combining the advantages of mask matching and mask propagation, the network does not rely solely on the manual annotation of the first frame or on the prediction for the previous frame. Without any online fine-tuning, the error accumulation caused by mask propagation is reduced. To verify the effectiveness of the algorithm, experiments are carried out on the DAVIS dataset. The experiments show that the algorithm is robust during propagation through the video.

2. A key frame extraction strategy is designed. This thesis proposes a mask ranking module, which dynamically selects the most instructive mask to guide segmentation in the middle stage of mask propagation. Through the mask ranking module, the network makes effective use of long-term information in the video, which avoids the rigidity of the hard classification used by matching-based methods. Compared with propagation-based methods, which have no adjustment strategy, the proposed method is more flexible.

3. Considering the particular characteristics of the airport environment, ADS-B, a source of prior information available in airport scenes, is used to address video object segmentation in airport ground surveillance. The position information provided by ADS-B is used to address both the temporal and the spatial misalignment of targets in airport scenes. The effectiveness of the algorithm is verified on the airport ground surveillance video dataset AGVS. Compared with other existing video object segmentation algorithms, the experimental results show that using ADS-B position prior information effectively improves segmentation of airport ground surveillance video.
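The matching-based idea described in the abstract, where first-frame labels are transferred to the current frame through a similarity measure, can be illustrated with a minimal sketch. This is not the thesis's network: it assumes each pixel is already described by a feature vector, uses cosine similarity as the similarity measure, and assigns each current-frame pixel the label of its most similar first-frame pixel. The names `cosine_similarity` and `match_labels` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def match_labels(ref_features, ref_labels, cur_features):
    """Assign each current-frame pixel the label of its nearest reference pixel.

    ref_features: feature vectors of first-frame pixels (assumed given)
    ref_labels:   binary labels of those pixels (1 = foreground, 0 = background)
    cur_features: feature vectors of current-frame pixels
    """
    if len(ref_features) != len(ref_labels):
        raise ValueError("ref_features and ref_labels must have equal length")
    if not ref_features:
        raise ValueError("at least one reference pixel is required")

    labels = []
    for feat in cur_features:
        # Index of the reference pixel most similar to this pixel.
        best = max(
            range(len(ref_features)),
            key=lambda i: cosine_similarity(feat, ref_features[i]),
        )
        labels.append(ref_labels[best])
    return labels


# Toy example: two reference pixels, one foreground and one background.
ref_features = [[1.0, 0.0], [0.0, 1.0]]
ref_labels = [1, 0]
cur_features = [[0.9, 0.1], [0.2, 0.8]]
print(match_labels(ref_features, ref_labels, cur_features))
```

Because every pixel receives the label of a single best match, this sketch is a hard classification of the kind the abstract contrasts with the more flexible mask ranking module.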
Keywords/Search Tags:deep learning, semi-supervised video object segmentation, mask matching, mask propagation, encoder-decoder network