
Research And Application Of Spatio-Temporal Context In Semi-Supervised Video Object Segmentation

Posted on: 2020-12-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Y K Zhang | Full Text: PDF
GTID: 2428330590452374 | Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology and the spread of intelligent hardware devices, video has become an important carrier of information, both online and offline. The generation and sharing of massive amounts of video place higher demands on video processing algorithms. As one of the key tasks in video processing, video object segmentation is the foundation of high-level computer vision applications. However, existing video object segmentation algorithms are not mature. Some ignore vital spatio-temporal context information and cannot achieve satisfactory segmentation accuracy. Others extract spatio-temporal context information with pre-trained models such as optical flow, which achieves satisfactory segmentation accuracy but consumes too many computing resources. Neither kind of algorithm meets the needs of practical applications. To address these problems, this work studies semi-supervised video object segmentation with spatio-temporal context information. Using deep learning, we study and implement a video object segmentation algorithm with weak temporal information and a lightweight video object segmentation algorithm based on ConvGRU. The main work of this thesis is as follows:

(1) To address the neglect of spatio-temporal context information in video object segmentation, this thesis proposes a two-branch convolutional neural network, with an appearance branch and a temporal branch, that introduces spatio-temporal context information into the traditional video object segmentation algorithm. The appearance branch adopts the basic framework of a video object segmentation algorithm without spatio-temporal context information. The added temporal branch takes the mask of the previous frame as input and extracts the weak temporal information between adjacent video frames to guide the segmentation of the current frame. In addition, a multi-mask guidance method is introduced to improve the stability of the proposed algorithm, and through further training the algorithm can keep improving its segmentation accuracy as it acquires new knowledge. Experiments show that this method introduces spatio-temporal context information into video object segmentation effectively while saving computing resources.

(2) To address the computational complexity and suboptimal solutions caused by introducing spatio-temporal context information through pre-trained models such as optical flow, we start from the essence of video object segmentation with spatio-temporal context and propose an end-to-end algorithm that combines convolutional and recurrent neural networks to handle "sequence-to-sequence" computer vision tasks. A convolutional neural network extracts image features, and a recurrent neural network extracts spatio-temporal context information. ConvGRU is used to achieve a deep fusion of visual features and spatio-temporal context information. The MobileNet-based lightweight algorithm can meet the demands of practical applications and solves the problem of high computing-resource consumption. Experiments show that the proposed algorithm effectively fuses visual features and spatio-temporal context information to improve the accuracy of video object segmentation.

(3) For practical application, and building on the results of this thesis, we designed and implemented a prototype video object segmentation system based on the video object segmentation algorithm with weak temporal information and the ConvGRU-based lightweight video object segmentation algorithm.
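
As a rough illustration of the recurrent update that contribution (2) relies on, the following is a minimal single-channel ConvGRU sketch in plain Python. It is not the thesis's implementation. The abstract pairs ConvGRU with a MobileNet feature extractor, whereas this sketch feeds small 2-D maps in directly. The function names, the kernel-dictionary keys, and the gate convention h_new = (1 - z) * h + z * candidate are all assumptions made for illustration.

```python
import math

def conv2d_same(x, k):
    """Cross-correlate a 2-D map x with an odd-sized square kernel k (zero padding)."""
    h, w, r = len(x), len(x[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + r][dj + r]
            out[i][j] = s
    return out

def _zip_map(f, *maps):
    """Apply f element-wise across several 2-D maps of the same shape."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*maps)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def convgru_step(x, h, p):
    """One ConvGRU update (hypothetical helper).

    x: current feature map, h: previous hidden state, p: dict of kernels
    with the assumed keys 'wz', 'uz', 'wr', 'ur', 'wh', 'uh'.
    """
    # Update gate z and reset gate r, both computed with convolutions.
    z = _zip_map(lambda a, b: sigmoid(a + b),
                 conv2d_same(x, p["wz"]), conv2d_same(h, p["uz"]))
    r = _zip_map(lambda a, b: sigmoid(a + b),
                 conv2d_same(x, p["wr"]), conv2d_same(h, p["ur"]))
    # Candidate state uses the reset-gated previous state.
    rh = _zip_map(lambda a, b: a * b, r, h)
    cand = _zip_map(lambda a, b: math.tanh(a + b),
                    conv2d_same(x, p["wh"]), conv2d_same(rh, p["uh"]))
    # Blend previous state and candidate (assumed convention).
    return _zip_map(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h, cand)

def run_sequence(frames, p):
    """Carry the hidden state across a list of frames, one state per frame."""
    h = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    states = []
    for x in frames:
        h = convgru_step(x, h, p)
        states.append(h)
    return states
```

Because the gates are computed by convolution rather than by fully connected layers, the hidden state keeps the spatial layout of the frame, which is what lets it carry context from earlier frames into a per-pixel prediction for the current one.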
Keywords/Search Tags:video object segmentation, spatio-temporal context, two-branch framework, deep learning, lightweight model, ConvGRU