Research On Semi-supervised Video Object Segmentation Via Pyramid Network Modulation

Posted on:2021-02-25

Degree:Master

Type:Thesis

Country:China

Candidate:S H Jiang

GTID:2428330647452391

Subject:Control Engineering

Abstract/Summary:

Video object segmentation aims to extract objects of interest from complex video scenes and to segment them quickly and accurately. In real environments, however, it is still hampered by many external interference factors, and it becomes especially challenging when multiple similar targets coexist. To handle both single-object and multi-object segmentation in complex video scenes, this thesis proposes a semi-supervised video object segmentation algorithm based on pyramid network modulation. The main research work is as follows.

For complex video scenes involving target scale change and uneven color, a semi-supervised single-object video segmentation algorithm based on pyramid pooling network modulation is proposed. First, a single forward pass of the modulation network adapts the segmentation model to the appearance of a given object: a modulator learned from the visual and spatial information of the target object modulates the intermediate layers of the segmentation network, so that the network adapts to the appearance changes and displacement of that specific object. Second, global context information is aggregated in the last layer of the segmentation network through a multi-region context fusion method. Finally, features from the high and low layers of the segmentation network are integrated directly to compensate for the target details missing from the last layer. The proposed network can be trained end-to-end. Extensive experiments on the DAVIS 2016 and DAVIS 2017 datasets show that the method is competitive with more advanced methods that rely on online fine-tuning, while running at 0.14 s per frame on a single GPU.

To address the problem that multiple objects in a video are not segmented clearly, this thesis further proposes a semi-supervised multi-object video segmentation algorithm based on dual pyramid network modulation. Gradual fusion of high-level and low-level feature information is added after the last layer of the segmentation network. Specifically, high-level semantic feature maps are constructed at all scales through a laterally connected, left-to-right structure, which fully integrates the target location and detail information in low-level features with the strong semantic information in high-level features and thereby improves the segmentation results. Experiments show that this method improves on the multi-object segmentation accuracy of the preceding method: accuracy on the DAVIS 2016 and DAVIS 2017 datasets increases by 0.9 and 2 percentage points, respectively.

In addition, this thesis studies the real-time performance of the algorithm. By training on additional large-scale data and using a lightweight network, the segmentation model runs at 0.06 s per frame on a single GPU, which makes the segmentation algorithm more practical.
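
The two core operations named in the abstract, modulating intermediate features and aggregating multi-region context, can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. The abstract does not give the form of the modulation, so the sketch assumes a per-channel scale (standing in for the visual information) plus a per-position bias (standing in for the spatial information). The function names `modulate`, `region_average` and `pyramid_context`, and the bin sizes, are hypothetical.

```python
from typing import List, Sequence

Map2D = List[List[float]]   # one feature map, indexed [row][col]
Features = List[Map2D]      # a stack of maps, indexed [channel][row][col]


def modulate(features: Features, scale: Sequence[float], bias: Map2D) -> Features:
    """Scale each channel by its own weight, then add a per-position bias.

    ASSUMPTION: this scale-and-shift form is one common way to modulate a
    layer; the abstract does not say which form the thesis uses.
    """
    return [
        [[scale[c] * value + bias[y][x] for x, value in enumerate(row)]
         for y, row in enumerate(fmap)]
        for c, fmap in enumerate(features)
    ]


def region_average(fmap: Map2D, bins: int) -> Map2D:
    """Average one map over a bins x bins grid of regions and write each
    region mean back to every position it covers.

    Requires the map height and width to be at least `bins`.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(bins):
        y0, y1 = by * h // bins, (by + 1) * h // bins
        for bx in range(bins):
            x0, x1 = bx * w // bins, (bx + 1) * w // bins
            cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(cells) / len(cells)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    out[y][x] = mean
    return out


def pyramid_context(features: Features, bin_sizes: Sequence[int] = (1, 2)) -> Features:
    """Concatenate the input maps with their region averages at each scale."""
    fused = list(features)
    for bins in bin_sizes:
        fused.extend(region_average(fmap, bins) for fmap in features)
    return fused


# Toy input: 1 channel, 2 x 2 positions.
features = [[[1.0, 2.0], [3.0, 4.0]]]
modulated = modulate(features, scale=[2.0], bias=[[0.0, 1.0], [0.0, 1.0]])
context = pyramid_context(modulated)   # 1 input channel + 2 pooled channels
```

In a real network the scale and bias would be predicted by the modulator from the annotated first frame, and the pooled maps would pass through further convolutions before fusion; both steps are omitted here.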

Keywords/Search Tags:

Video object segmentation, Pyramid pooling model, Multi-scale fusion, Fully convolutional network, Deep learning

Related items

1	Research On General Object Detection Method Based On Deep Learning
2	Research On Video Object Segmentation Based On Supervoxel Pooling
3	Research And Application On Semantic Segmentation Based On Multi-scale Context Information
4	Images Semantic Segmentation Based On Deep Convolutional Neural Networks
5	Research On RGB-D Object Recognition Based Deep Learning Algorithms
6	Domain Adaptation For Semantic Segmentation
7	Research On Multi-scale Object Detection Algorithms Based On Deep Neural Network
8	3D Object Recognition Algorithm Based On Deep Learning
9	Video Object Detection Based On Attention Mechanism And Multi-Scale Feature Fusion Convolutional Network
10	Research On Kidney Segmentation Of CT Images Based On Deep Learning