
Research On Image Semantic Segmentation Method Based On Context Information

Posted on: 2022-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Xiong
Full Text: PDF
GTID: 2518306728959479
Subject: Software engineering

Abstract/Summary:
Image semantic segmentation is a very active research direction in computer vision; its goal is to classify every pixel of an input image. It has broad application potential in autonomous driving, video surveillance, scene analysis, human-computer interaction, and behavior analysis. Compared with image classification and object detection, semantic segmentation provides richer image semantic information. For such pixel-level classification tasks, the ability of a convolutional neural network to capture pixel-level contextual information directly affects the accuracy of the segmentation results. To obtain context information, most previous methods either accumulate related pixels with identical weights or consider only local context. This thesis therefore focuses on how to construct effective context information and introduce it into a semantic segmentation model to improve segmentation accuracy. The main innovations of this thesis are the following two aspects:

First, the ASPP (Atrous Spatial Pyramid Pooling) multi-scale module in DeepLab-v2 tends to miss small-scale targets and to segment large-scale targets imprecisely. Drawing on the idea from RFBNet that receptive-field size and eccentricity are positively correlated, we propose an improved multi-scale module. Building on ASPP, this module changes the dilation rates of the four dilated convolution kernels so that it is sensitive to both large and small targets. A conventional convolution layer is added to simulate the receptive-field sizes found in human vision, and dilated convolution is used to simulate eccentricity. Finally, the feature map is fused, via skip connections, with the feature information extracted at the four scales to obtain the final multi-scale feature representation.

Second, this thesis compares and analyzes existing semantic segmentation models from three aspects: multi-scale processing, self-adaptation, and global information guidance, and proposes an improved model on this basis. The improved model incorporates modules from three different models. We improved the global affinity module in one of them, reducing its time complexity from O(n) to O(1). The improved model has a two-branch parallel structure: feature maps extracted by the backbone network are first processed by the pyramid module of DenseASPP (Dense Atrous Spatial Pyramid Pooling) and then fed, in parallel, into the attention module of DANet (Dual Attention Network) and the improved global affinity module of APCNet (Adaptive Pyramid Context Network). These two parallel modules compute the similarity between different regions and then aggregate the features of other locations; finally, the context features produced by the two modules are fused to obtain the final segmentation result. In experiments on datasets such as Cityscapes, the network's segmentation accuracy reached 82.1%, 0.5% higher than before the improvement. Compared with models of a similar nature, the image segmentation speed increased from 0.25 frames per second to 0.37 frames per second, demonstrating the effectiveness of the improved network model.
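As background for the dilation-rate changes described in the first innovation: the effective spatial extent of a dilated (atrous) convolution kernel is k_eff = k + (k − 1)(d − 1) for kernel size k and dilation rate d. A minimal sketch (the thesis does not state its modified rates, so the example below uses the original ASPP rates 6, 12, 18, 24 from DeepLab-v2 for illustration):

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Effective extent of a dilated (atrous) convolution kernel:
    k_eff = k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# The original ASPP in DeepLab-v2 uses 3x3 kernels with dilation
# rates 6, 12, 18, 24; larger rates widen the receptive field
# without adding parameters.
for d in (6, 12, 18, 24):
    print(d, effective_kernel_size(3, d))  # -> 13, 25, 37, 49
```

Smaller rates keep the kernel compact and sensitive to small targets, while larger rates cover large targets, which is why tuning this set of rates trades off the two failure modes described above.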
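The "compute similarity between regions, then aggregate features from other locations" step that the two parallel modules perform is the core of dot-product attention. A minimal pure-Python sketch of that step (the function name and the use of plain lists are illustrative, not from the thesis; real attention modules operate on learned query/key/value projections of feature maps):

```python
import math

def attention_aggregate(features):
    """For each position, compute dot-product similarity to every
    position, normalise with softmax, and return the similarity-weighted
    sum of all position features.

    features: list of N feature vectors (lists of floats), all the same length.
    Returns a list of N aggregated vectors of the same dimensionality.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        # pairwise similarity between position i and every position j
        sims = [sum(a * b for a, b in zip(features[i], features[j]))
                for j in range(n)]
        # softmax normalisation, stabilised by subtracting the max
        m = max(sims)
        exps = [math.exp(s - m) for s in sims]
        z = sum(exps)
        weights = [e / z for e in exps]
        # aggregate: weighted sum over all position features
        out.append([sum(w * f[k] for w, f in zip(weights, features))
                    for k in range(dim)])
    return out
```

Because each output position mixes in features from every other position, the result carries global context; the O(n) vs. O(1) complexity distinction mentioned above concerns how efficiently an affinity module realises this aggregation, not the aggregation itself.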
Keywords/Search Tags: semantic segmentation, contextual information, attention mechanism, multi-scale