
Research on a Deep Convolutional Neural Network Based Semantic Segmentation Method

Posted on: 2022-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: P H Li
Full Text: PDF
GTID: 2518306500956919
Subject: Physical Electronics
Abstract/Summary:
Semantic segmentation is the classification of every pixel in an image. It is a very active research topic in computer vision, with a wide range of applications in robot perception, autonomous driving, video surveillance and scene understanding. In recent years, because deep learning methods have proven effective in a variety of visual tasks, much work has been devoted to building semantic segmentation models with convolutional neural networks. However, recent semantic segmentation models generally focus on improving segmentation accuracy, which leads to high computational complexity and large memory usage and makes them difficult to deploy on embedded platforms with limited storage and computing power. Therefore, building on the study of convolutional neural network based semantic segmentation, this thesis proposes a semantic segmentation method that balances segmentation accuracy and efficiency from the perspective of model design. The main contents of this thesis are as follows:

(1) Current leading semantic segmentation algorithms are surveyed, and the structural characteristics of the related convolutional neural networks are analyzed, including atrous spatial pyramid pooling (ASPP), depthwise separable convolution, and lightweight convolutional neural networks. Building on the encoder-decoder approach to semantic segmentation, the thesis combines it with a dilated MobileNetV2 and optimizes the decoding structure, yielding a lightweight MobileNetV2 DeepLabV3+ semantic segmentation model.

(2) To extract high-level semantic features more effectively, the thesis proposes a new parallel multi-branch module that addresses ASPP's lack of awareness of anisotropic context. The module places mix strip pooling (MSP) in parallel within ASPP. It performs multi-scale feature extraction and fusion and strengthens long-distance dependencies between pixels, thereby capturing richer context information. With MobileNetV2 DeepLabV3+ as the baseline model, this method improves mean Intersection-over-Union (mIoU) by 1.07% on the PASCAL VOC 2012 dataset, while adding very few model parameters and calculations.

(3) To restore clearer target boundaries, a channel attention mechanism is introduced in the decoding part of the model. It mines more useful feature channel information and effectively improves the fusion of low-level and high-level features. On the PASCAL VOC 2012 dataset, compared with the baseline model, this method increases mIoU by 0.82% without increasing the amount of parameters and calculations.

After depthwise separable convolution is applied to the ASPP and the decoder to compress the model, the final model has 4.5 M network parameters, 11.13 G floating-point operations (FLOPs), and an mIoU of 72.07%. The results show that the new model obtains better semantic segmentation results while consuming very little memory and computing resources, and that it meets the requirements for deployment on embedded platforms. Finally, in an urban street scene experiment, the model also achieved good segmentation results.
Keywords/Search Tags: Semantic Segmentation, Convolutional Neural Network, Encoder-Decoder, Depthwise Separable Convolution, Strip Pooling
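
The abstract reports accuracy as mean Intersection-over-Union (mIoU). As a minimal sketch of how this metric is commonly computed, and not the thesis's own evaluation code, the Python function below accumulates per-class intersection and union counts over a predicted and a ground-truth label map and averages the per-class ratios. The function name `mean_iou`, the nested-list label maps, and the toy data are assumptions made for illustration. The sketch also assumes there is no "ignore" label, which benchmark evaluation usually handles separately.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union between two label maps.

    pred and target are equally sized 2-D lists of integer class labels
    in the range [0, num_classes). Classes that appear in neither map
    are left out of the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel correctly labeled: counts once in both sets.
                intersection[p] += 1
                union[p] += 1
            else:
                # Mislabeled pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


# Toy 2x3 label maps with two classes (hypothetical data).
pred = [[0, 0, 1],
        [0, 1, 1]]
target = [[0, 0, 1],
          [0, 0, 1]]

# Class 0: intersection 3, union 4. Class 1: intersection 2, union 3.
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {score:.4f}")  # (3/4 + 2/3) / 2
```

In this toy case a single mislabeled pixel lowers the IoU of both classes it touches, which is why mIoU is stricter than plain pixel accuracy (5 of 6 pixels correct here).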