
Research On Video Instance Segmentation Based On Deep Learning

Posted on: 2021-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: M H Dong
Full Text: PDF
GTID: 2518306107960619
Subject: Control Science and Engineering
Abstract/Summary:
Video instance segmentation is a computer vision task proposed in recent years that extends image instance segmentation to video. It combines several computer vision tasks, namely object detection, semantic segmentation, and target tracking, so that video objects can be detected, segmented, and tracked end to end in a single algorithmic framework. Existing video instance segmentation schemes are based on deep learning and have achieved good results, but they do not handle blurring, occlusion, and severe deformation of video targets well.

To address these problems, this thesis proposes a cascade video instance segmentation network based on a temporal feature augmented module. For single-frame inference, the temporal feature augmented module computes the similarity between the candidate objects in the current frame and the objects from other frames of the same video. It then fuses features according to that similarity to assist inference on the current frame. In parallel, the cascade instance segmentation and tracking module refines the positions of the detected objects through a multi-stage progressive structure and generates a tracking vector at each stage. This structure further improves the accuracy of both instance segmentation and object tracking.

In addition, existing video instance segmentation algorithms rely on the Anchor Box mechanism. Although the Anchor mechanism performs well in object detection, it has drawbacks such as numerous hyperparameters and complex calculations. To address these drawbacks, this thesis proposes two Anchor-Free video instance segmentation algorithms. The first is a two-stage Anchor-Free network: an Anchor-Free object detection network first locates the video objects, and the located objects are then segmented and tracked by a segmentation branch and a tracking branch, respectively. The second is a one-stage Anchor-Free network: by introducing branches such as a fully convolutional tracking embedding branch and a pixel embedding branch, it outputs the instance segmentation results and the tracking embeddings of the video objects in parallel. The two Anchor-Free solutions greatly simplify the parameter setting process and the algorithm flow of video instance segmentation, and they also improve the inference speed of the model.

The three proposed algorithms are evaluated on a publicly available international video instance segmentation dataset. The experimental results show that all three exceed the highest accuracy reported in previously published work, and that the two Anchor-Free methods further improve on the inference speed of existing models.
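
The similarity-based feature fusion that the abstract attributes to the temporal feature augmented module can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the weighting, or the fusion rule, so everything here is an assumption made for illustration: cosine similarity, softmax weights, a residual sum, and the hypothetical names `cosine_similarity`, `softmax`, and `fuse_with_reference_features`. Feature vectors are plain Python lists standing in for per-object features.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_reference_features(current, references, temperature=1.0):
    """Hypothetical fusion rule (an assumption, not the thesis's exact method).

    current:     feature vector of one candidate object in the current frame
    references:  feature vectors of objects taken from other frames of the video
    Returns the current feature plus a similarity-weighted average of the
    reference features.
    """
    if not references:
        # No other frames available: leave the feature unchanged.
        return list(current)
    scores = [cosine_similarity(current, ref) / temperature for ref in references]
    weights = softmax(scores)
    aggregated = [
        sum(w * ref[i] for w, ref in zip(weights, references))
        for i in range(len(current))
    ]
    # Residual fusion: the original feature is kept and augmented.
    return [c + a for c, a in zip(current, aggregated)]


if __name__ == "__main__":
    current_feature = [1.0, 0.0]
    reference_features = [[1.0, 0.0], [0.0, 1.0]]
    print(fuse_with_reference_features(current_feature, reference_features))
```

Under these assumptions, reference objects that resemble the current candidate contribute more to the fused feature, which is one way features from clearer frames could compensate for a blurred or occluded frame.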
Keywords/Search Tags: Deep learning, Video understanding, Object detection, Video instance segmentation