Font Size: a A A

Image Semantic Segmentation Based On Global Convolutional Neural Network

Posted on:2021-03-04Degree:MasterType:Thesis
Country:ChinaCandidate:S LiuFull Text:PDF
GTID:2428330620961349Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Image semantic segmentation is the basis of image understanding.The basic idea is to classify various types of objects contained in an image one by one,and mark pixels that belong to the same category as the same color.The emergence of image semantic segmentation can help computers better understand the content expressed in images.This technology is widely used in true three-dimensional display,driverless and assisted medical fields.At present,the method of image semantic segmentation based on deep learning is developing rapidly,but there are still many challenges.For example,the scene image to be segmented is extremely susceptible to different light intensities and category diversity.The existing methods do not affect the geometric features in the image.Obvious and complex textures still have difficulty in segmentation.Specifically,in the scene segmentation task,the difference in pixel values between different objects in the image is too large or too small,resulting in over-segmentation and under-segmentation.Therefore,in view of the above problems,this thesis makes in-depth research on image semantic segmentation methods,and proposes two new types of image semantic segmentation networks.(1)Image semantic segmentation based on multi-scale convolutional neural networkGenerally,the existing image semantic segmentation model performs well for object segmentation with simple texture and obvious boundaries.However,because the image in the actual scene is often affected by the light intensity,it is easy to cause the texture and color features of the objects in the image to be missing,which will cause category confusion,over-segmentation and under-segmentation,and directly affect the results of semantic segmentation.In this thesis,a multi-scale convolutional neural network is used to study the semantic segmentation method for images with uneven illumination.Based on the spatial pyramid pooling module,a pyramid pooling module based on multi-scale context information is proposed.Context information is introduced between different scales so that each branch in the pyramid contains the feature information of the previous branch.Increasing the receptive field to obtain advanced semantic information with more content can effectively improve the accuracy of image segmentation.To verify the effectiveness of this method,the PASCAL VOC 2012 public data set was used to compare with other mainstream methods.(2)Image semantic segmentation method based on multi-scale residual spatial pyramid pooling and global attention mechanismIn the process of image segmentation,due to the large number of object categories contained in each image and the complex image environment,it is difficult to segment objects that do not have prominent geometric structures in the image.For example,a distant object in the entire image is closer to the sky area and has a smaller geometric structure,and is easily misdivided during the segmentation process.Similarly,if the geometric structure of two adjacent object categories is very similar in the whole image,and the difference in pixel values is also small,it will easily cause over-segmentation and under-segmentation to different degrees.In view of the above problems,this thesis uses deep convolutional neural networks to study image semantic segmentation methods based on multi-scale residual spatial pyramid pooling and global attention mechanism.First,a multi-scale residual spatial pyramid pooling module is proposed to obtain more complete high-level features of the image in the network.Second,the network considers global information and proposes a decoder module based on the attention mechanism to fully fuse the low-level features of the image high-level semantic features,so as to effectively capture the texture features,color features and geometric features of the input image,and finally get a complete segmentation result.In order to verify the feasibility of this method,the public datasets of Camvid and Cityscapes were used,and comparative experiments were performed with other methods.
Keywords/Search Tags:semantic segmentation, over-segmentation, under-segmentation, convolutional neural network, attention mechanism
PDF Full Text Request
Related items