
Research On Video Object Segmentation Method Based On High-order Energy

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zheng
Full Text: PDF
GTID: 2518306539953259
Subject: Software engineering
Abstract/Summary:
With the development of Internet and computer technology, video has become one of the major carriers of information. Video carries richer information than a still image and is more expressive: it contains not only static spatial information but also a large amount of motion information that changes over time. The processing and analysis of video data is therefore an important research field, and video object segmentation, the subject of this thesis, is one of its important research directions. In video object segmentation, the object to be segmented is specified in advance (usually as a segmentation mask on the first frame of the video sequence), and the algorithm then segments the designated object in the remaining frames, guided by the first-frame mask. The technology plays an important role in many fields, for example video effects, intelligent security, driver assistance systems, and intelligent cameras, so it has high application and research value.

The purpose of this thesis is to construct a robust video segmentation model based on the global information of the whole video. In particular, the thesis proposes a solution to the problem that most algorithms are not robust enough to the inherent difficulties of video object segmentation, such as object appearance change, object occlusion, insufficient training data, and object disappearance and reappearance. The main contributions of this study are as follows:

(1) A Markov random field is used to model the video object segmentation task, which transforms the segmentation problem into a node labeling problem on a Markov spatio-temporal graph. A high-order energy function is defined over this graph, and the object segmentation of the video sequence is obtained by minimizing the energy function.

(2) The strength of text classification ideas in semantic classification is used to model the high-order dependence between pixels. Specifically, the high-order terms of the energy function are modeled with a text classification approach to enhance the global consistency of the object segmentation. Experimental results show that, under the high-order energy constraint, the robustness and accuracy of the model improve significantly, and the model is competitive with other algorithms.

(3) To address the scarcity of supervised data in video object segmentation, a deep visual dictionary algorithm based on meta-learning is proposed. A semantic segmentation model cannot be transferred directly to video object segmentation because the object to be segmented is not known in advance and so cannot be trained on. Meta-learning is therefore applied to the task: training on a large number of similar tasks improves the generalization ability of the model. Experimental results show the effectiveness of the proposed algorithm.

(4) Based on the idea of mask learning, the high-order dependencies between pixels are modeled to enhance the robustness of the segmentation model. Compared with the high-order term modeling based on traditional features proposed in (2), the modeling based on mask learning is more robust and carries richer prior information. Experimental results on multiple datasets (DAVIS-2016 and YouTube) show that method (4) significantly improves robustness and accuracy over method (2) and achieves competitive performance in comparative experiments with other algorithms.

To verify the efficiency and effectiveness of the proposed model, qualitative and quantitative evaluations were conducted on multiple datasets (DAVIS-2016 and YouTube). First, an ablation study analyzed the performance of the model under different parameters; the results show that the proposed high-order term constraint significantly improves the accuracy and robustness of the model. Then, on the DAVIS-2016 and YouTube datasets, the algorithm was compared with other algorithms in the field. The results show that the proposed algorithm is competitive in both accuracy and robustness and achieves state-of-the-art results in some very challenging segmentation scenarios.
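The energy minimization described in contribution (1) can be illustrated with a toy example. The sketch below is not the thesis's model. It assumes binary foreground/background labels on a tiny spatio-temporal grid, a Potts pairwise term over spatial and temporal neighbours, and a truncated "minority count" penalty per clique as a stand-in for the high-order term (the abstract does not give the actual form, which is learned from text classification or mask learning). The names `energy`, `icm`, `w_pair`, `w_high`, and `trunc` are hypothetical, and iterated conditional modes (ICM) is used only because it is simple; it finds a local minimum, not necessarily the global one.

```python
from itertools import product

# Offsets to the 4 spatial neighbours and the 2 temporal neighbours of a node.
OFFSETS = ((0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1), (1, 0, 0), (-1, 0, 0))


def neighbors(t, y, x, T, H, W):
    """Yield the in-bounds spatial and temporal neighbours of node (t, y, x)."""
    for dt, dy, dx in OFFSETS:
        nt, ny, nx = t + dt, y + dy, x + dx
        if 0 <= nt < T and 0 <= ny < H and 0 <= nx < W:
            yield nt, ny, nx


def energy(labels, unary, cliques, w_pair=1.0, w_high=1.0, trunc=2):
    """Total energy = unary + pairwise (Potts) + assumed high-order term."""
    T, H, W = len(labels), len(labels[0]), len(labels[0][0])
    e = 0.0
    for t, y, x in product(range(T), range(H), range(W)):
        lab = labels[t][y][x]
        e += unary[t][y][x][lab]
        for n in neighbors(t, y, x, T, H, W):
            # Tuple comparison ensures each undirected edge is counted once.
            if n > (t, y, x) and labels[n[0]][n[1]][n[2]] != lab:
                e += w_pair
    for clique in cliques:
        ones = sum(labels[t][y][x] for t, y, x in clique)
        minority = min(ones, len(clique) - ones)
        # Assumed high-order penalty: truncated count of minority-label nodes.
        e += w_high * min(minority, trunc)
    return e


def icm(unary, cliques, sweeps=5, **weights):
    """Greedy coordinate descent: flip one label at a time if energy drops."""
    T, H, W = len(unary), len(unary[0]), len(unary[0][0])
    # Start from the labeling that minimizes the unary term alone.
    labels = [[[0 if unary[t][y][x][0] <= unary[t][y][x][1] else 1
                for x in range(W)] for y in range(H)] for t in range(T)]
    for _ in range(sweeps):
        changed = False
        for t, y, x in product(range(T), range(H), range(W)):
            before = energy(labels, unary, cliques, **weights)
            labels[t][y][x] = 1 - labels[t][y][x]
            if energy(labels, unary, cliques, **weights) < before:
                changed = True
            else:
                labels[t][y][x] = 1 - labels[t][y][x]  # undo the flip
        if not changed:
            break
    return labels


# Toy video: 2 frames of 2x2 pixels. Every pixel prefers label 1 (foreground)
# except one noisy pixel that weakly prefers label 0.
unary = [[[[1.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
unary[0][0][0] = [0.0, 0.4]
# One hypothetical clique per frame, covering all of that frame's pixels.
cliques = [[(t, y, x) for y in range(2) for x in range(2)] for t in range(2)]

result = icm(unary, cliques)
print(result, energy(result, unary, cliques))
```

In this toy case the pairwise and high-order terms outweigh the weak unary preference of the noisy pixel, so minimization relabels it as foreground. This is the consistency effect the abstract attributes to the high-order constraint, shown here only under the stated assumptions.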
Keywords/Search Tags:Video Object Segmentation, Markov Random Field, High-Order Energy, Text Classification, Mask Learning