Font Size: a A A

Research On Video Semantic Segmentation Algorithm Based On Deep Learning

Posted on:2020-05-04Degree:MasterType:Thesis
Country:ChinaCandidate:S Y BianFull Text:PDF
GTID:2428330599460491Subject:Engineering
Abstract/Summary:PDF Full Text Request
With the continuous development of technology,computer vision has gradually become an important research branch in the field of artificial intelligence.In computer vision research,image-based semantic segmentation is more common.Relative to the image,the video contains more important spatio-temporal association information.Video Semantic Segmentation is designed to process continuous video sequences and utilize video inter-frame relationships for precise segmentation.It has important application value in the fields of intelligent monitoring,auto-driving,and mobile device development.This paper studies the problem of insufficient segmentation precision and excessive model size in video semantic segmentation.Firstly,considering the difficulty of segmentation of small objects in images and the inaccuracy of segmentation of objects,a multi-scale video semantic segmentation algorithm is proposed.The algorithm is based on the fully-convolutional neural network(FCN)and is based on the Visual Geometry Group Network(VGG)deep convolutional network.The image is foregrounded and background in a semi-supervised manner.The separation is achieved for the purpose of continuously conveying semantic information.The key to the algorithm is to combine online training and offline training to improve the overall segmentation accuracy of the model.In the online training phase,the image and label of the first frame of the video sequence are given and fine-tuned.For the inter-frame relationship of the video sequence,an additional Mask channel is added to the input of the network to simulate the neighboring frames trajectory information of the object motion.Aiming at the problem of sampling loss accuracy in convolutional networks,the introduction of hole convolution in deep feature networks eliminates the maximum pooling operation of the network,and uses multi-scale feature fusion connection to fuse deep features with shallow features.Accelerating the convergence speed of the network and the overall segmentation accuracy of the model,and the effectiveness and feasibility of the algorithm are verified by experiments.Secondly,considering the problem of poor applicability caused by the large size of the generated model,a lightweight video semantic segmentation model is proposed.Alightweight convolutional neural network was designed using the deeply separable convolutional network.Compared with mainstream convolutional networks,this network saves a lot of parameters and calculations,and the memory space occupied by the model is also greatly reduced.This makes the model of the deep convolution network more convenient to be applied to the actual product,and improves the utilization of the deep learning convolution network in practical applications.The effectiveness and efficiency of the algorithm are verified by experiments.
Keywords/Search Tags:video semantic segmentation, FCN model, dilated convolution, mult-scale feature fusion, depthwise separable convolution
PDF Full Text Request
Related items