
Research On Video Object Segmentation Algorithm Based On Semi-supervised Learning

Posted on: 2023-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Li
GTID: 2558306845989709
Subject: Computer technology
Abstract/Summary:
With the rapid development of mobile intelligent terminals and the Internet in recent years, video data is growing exponentially. To analyze and use this data effectively, segmenting the objects of interest in a video is becoming increasingly important. In semi-supervised video object segmentation, a ground-truth mask is given for the first frame of a video sequence and serves as the reference for segmenting the object in subsequent frames. Compared with static images, video data raises the following issues: the spatio-temporal background is more variable in natural scenes; the visual features of foreground objects may vary across frames and may be affected by rapid motion or occlusion; and the scene may contain distractor objects that are visually or semantically similar to the target object. Consequently, the complexity of the scene and the variability of the object's appearance pose serious challenges for semi-supervised video object segmentation. This thesis focuses on how to effectively mine context information in the spatio-temporal domain to tackle these challenges. The main research contents and results are as follows:

(1) A semi-supervised video object segmentation algorithm based on spatio-temporal structure consistency is proposed. Some detection-based and propagation-based methods are easily disturbed by background features in complex scenes, so the foreground features learned by the model are not representative enough and similar objects are difficult to separate accurately. First, the prediction for the previous frame is used as a guidance mask and fed to the network together with the current frame, so that temporal information between adjacent frames is extracted and the temporal structure consistency of the video sequence is mined more effectively. Then, an attention mechanism is introduced so that the network pays more attention to the important foreground features, which improves its ability to represent the visual features of the target object. Finally, the effectiveness and soundness of the proposed algorithm are verified through ablation and comparison experiments on the DAVIS-2016 dataset, where all three evaluation metrics, region similarity (J mean), contour accuracy (F mean) and temporal stability (T), are better than those of the baseline.

(2) A semi-supervised video object segmentation algorithm integrating inter-frame context information is proposed. It addresses the problem that most algorithms do not make full use of the context information of the video sequence. By introducing an inter-frame propagation module into the network model, the algorithm calculates the similarity between the features of the reference frame and those of the current frame. This strategy models long-term temporal dependence and effectively suppresses the background, so that the foreground object is segmented accurately. Comparative experiments on the DAVIS-2016 dataset verify the effectiveness of the algorithm: the region similarity J mean reaches 84.2%, which is 2.7% higher than that of the baseline. It also outperforms mainstream algorithms such as OSVOS, CTN, RGMP, OSMN, FEELVOS, and FAVOS.

In summary, two semi-supervised video object segmentation algorithms are proposed from the perspective of fully mining the context information of the video spatio-temporal domain. Experimental results show that the proposed algorithms effectively improve segmentation performance and can provide a research basis for related high-level applications.
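The region similarity J used in the experiments is commonly defined as the Jaccard index (intersection over union) between a predicted mask and the ground-truth mask, averaged over frames to give the J mean. The following is a minimal sketch of that definition, not the evaluation code used in the thesis; the function names `region_similarity` and `j_mean` and the convention for two empty masks are assumptions made for illustration.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary mask: rows of 0/1 values


def region_similarity(pred: Mask, gt: Mask) -> float:
    """Jaccard index (intersection over union) between two binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_mean(preds: List[Mask], gts: List[Mask]) -> float:
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [region_similarity(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gt = [[0, 1, 1], [0, 1, 1]]
    pred = [[0, 1, 1], [0, 0, 1]]
    # 3 pixels overlap and the union covers 4 pixels.
    print(region_similarity(pred, gt))
```

Under this definition, a J mean of 84.2% means that, on average, the overlap between predicted and ground-truth masks covers 84.2% of their union.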
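The similarity computation between reference-frame and current-frame features mentioned in (2) can be illustrated with a simple matching scheme. The abstract does not specify the similarity measure or how the result is used, so the sketch below makes its own assumptions: cosine similarity between per-pixel feature vectors, and a nearest-match label transfer from the reference mask. The names `cosine_similarity`, `similarity_matrix` and `propagate_labels` are hypothetical and do not describe the thesis's actual inter-frame propagation module.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(ref_feats: List[Vector], cur_feats: List[Vector]) -> List[List[float]]:
    """Entry [i][j] is the similarity of current pixel i to reference pixel j."""
    return [[cosine_similarity(c, r) for r in ref_feats] for c in cur_feats]


def propagate_labels(ref_feats: List[Vector], ref_mask: List[int],
                     cur_feats: List[Vector]) -> List[int]:
    """Assumed rule: each current pixel copies the label of its best reference match."""
    labels = []
    for row in similarity_matrix(ref_feats, cur_feats):
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(ref_mask[best])
    return labels


if __name__ == "__main__":
    ref_feats = [[1.0, 0.0], [0.0, 1.0]]  # one foreground and one background pixel
    ref_mask = [1, 0]                     # 1 = foreground, 0 = background
    cur_feats = [[0.9, 0.1], [0.2, 0.8]]
    print(propagate_labels(ref_feats, ref_mask, cur_feats))
```

Because every current-frame pixel is compared directly with the annotated reference frame rather than only with the previous frame, this kind of matching does not accumulate drift over time, which is one way to read the claim about long-term temporal dependence.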
Keywords/Search Tags:Video Object Segmentation, Semi-supervised, Attention mechanism, Spatio-temporal structure consistency, Inter-frame context information