Research On Video Object Segmentation Based On Deep Learning

Posted on:2024-01-29

Degree:Master

Type:Thesis

Country:China

Candidate:S S Han

GTID:2568307133461954

Subject:Computer technology

Abstract/Summary:

With the advances of deep learning in computer vision, deep learning-based video object segmentation methods have developed rapidly. As one of the important tasks in computer vision, video object segmentation aims to annotate each frame of a video sequence at the pixel level, assigning each pixel to its corresponding object category, so that each object in the video can be accurately extracted and tracked. The task is difficult because of changes in object appearance, similarity between objects, occlusion, and object movement. At the same time, a method must strike a balance between real-time performance and accuracy in order to segment objects efficiently. Existing video object segmentation methods still suffer from slow processing speed and insufficient accuracy, so designing precise and efficient methods is of real significance. Starting from these segmentation challenges, this dissertation analyzes and studies deep learning-based video object segmentation methods.

(1) A video object segmentation method based on a U-shaped network architecture is proposed to address the limitations of the One-Shot Video Object Segmentation (OSVOS) algorithm, which struggles with scenes in which objects change appearance or resemble one another. The method uses attention mechanisms to establish correlations between feature maps, which strengthens the model’s global semantic information. During training, an imbalance between positive and negative samples can lead to inaccurate predictions; this issue is addressed by optimizing the loss function. Due to the correlation between pixels, segmentation results often have rough edges. To address this, the dissertation applies a fully connected conditional random field to post-process the multi-scale prediction results, which effectively improves the accuracy of boundary segmentation.

(2) Separable Structure Modeling for Semi-Supervised Video Object Segmentation (SSMVOS) has weak modeling capability and cannot effectively segment occluded and fast-moving objects. To address this issue, the dissertation proposes a video object segmentation method based on a hybrid encoder that combines Convolutional Neural Networks (CNN) and a Transformer. The proposed method couples the global convolutional module with the Transformer, which not only alleviates the information loss caused by low resolution but also better models long-term dependencies and global information in the sequence. In addition, because boundaries in low-resolution images are usually blurry, the dissertation proposes an attention feature fusion boundary refinement module to locate boundaries accurately. The proposed method combines the advantages of the Transformer and the CNN and makes significant progress on segmentation problems such as occlusion and fast movement.
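The abstract does not say how the loss function is optimized for imbalanced positive and negative samples in contribution (1). As an illustration only, the sketch below shows one common choice for this situation, a class-balanced binary cross-entropy of the kind used in the original OSVOS work. The function name `class_balanced_bce` and its arguments are hypothetical, and this may not be the loss the dissertation actually uses.

```python
import math


def class_balanced_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over a flat list of pixels.

    ASSUMPTION: illustrative loss only; the dissertation does not
    specify its loss function.

    probs:  predicted foreground probabilities, one per pixel
    labels: ground-truth labels, 1 = foreground, 0 = background
    """
    n = len(labels)
    if n == 0:
        raise ValueError("at least one pixel is required")
    n_neg = n - sum(labels)
    # beta is the background fraction: the rarer the foreground,
    # the more each foreground pixel is up-weighted.
    beta = n_neg / n
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            loss -= beta * math.log(p)
        else:
            loss -= (1.0 - beta) * math.log(1.0 - p)
    return loss / n


# One foreground pixel among ten, with an uninformative prediction of 0.5.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.5] * 10
loss = class_balanced_bce(probs, labels)
```

With these weights, the single foreground pixel contributes as much to the loss as the nine background pixels together, so the model cannot reach a low loss simply by predicting background everywhere.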

Keywords/Search Tags:

video object segmentation, global semantic features, attention mechanism, feature fusion, Transformer encoder

Related items

1	Research On Image Semantic Segmentation Based On Deep Network
2	Research On Real-time Semantic Segmentation Based On Contextual Feature Aggregation Learning
3	Image Semantic Segmentation Based On Self-attention Mechanism And Encoding-decoding Network
4	Research On Image Semantic Segmentation Algorithm Based On Encoder-decoder Structure
5	Image Semantic Segmentation Algorithms Based On Feature Fusion And Non-local Features
6	Image Semantic Segmentation Method Based On Attention Mechanism
7	Research On Image Semantic Segmentation Based On Attention Mechanism And Feature Fusion
8	Image Semantic Segmentation Algorithm Based On Deep Learning And Attention Mechanism
9	Image Semantic Segmentation Method Based On Attention Mechanism
10	Research On Object Detection And Segmentation Algorithms Based On Dynamic Loss And Attention Mechanism