
Research on Semantic Segmentation Based on Group Dilation Convolution and Cascade Network

Posted on: 2019-09-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Gu
Full Text: PDF
GTID: 2428330566997575
Subject: Computer Science and Technology
Abstract/Summary:
As a basic problem in computer vision, semantic segmentation is one of the most important components in helping computers understand the world. Traditional semantic segmentation methods are usually designed for specific problems in simple scenes; the overall pipeline is complex, and low-level features are used to separate foreground from background. Recently, the development of deep learning has improved the performance of semantic segmentation and made a general-purpose semantic segmentation system possible. Current state-of-the-art semantic segmentation methods are based on fully convolutional networks. Such a network extracts features from the input image and produces a feature map of the same size as the original image; each pixel in the feature map can be viewed as the predicted label of the corresponding pixel in the original image, which yields the semantic information of the whole image.

Semantic segmentation places high demands on features: they must contain low-level local information to locate object edges, as well as high-level contextual information to recognize objects. Different layers of a deep neural network capture different information, but it is hard to capture all kinds of information in one layer. Existing semantic segmentation methods are therefore a trade-off between local information and contextual information, so there is still room for improvement.

In this paper, we study and analyze semantic segmentation methods, especially algorithms based on dilated convolution. We aim to extend the advantages of dilated convolution in semantic segmentation and extract features more effectively. By analyzing the DeepLab methods, we find that dilated convolution can not only reduce the number of down-sampling operations while maintaining the receptive field, but can also be used to extract multi-scale information in one layer. We propose a group dilated convolution structure to increase the receptive field of the network. The structure adds no additional parameters: it varies the dilation rate along the channel dimension of the convolution and merges the resulting features with a convolution with a 1×1 kernel. To address poor edge detection, we propose a cascade network that up-samples the feature map to a higher resolution. We transform features from different layers of the convolutional neural network, up-sample them to the same size with bilinear interpolation, combine them, and learn the final prediction from the combined features. In this way, we obtain a feature map with higher resolution than before.

The experiments mainly use the PASCAL VOC dataset. The results show that the improved network effectively improves performance on large objects and on object edges, and the mean IoU on VOC2012 increases from 72.99% to 76.46%.
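The group dilated convolution idea in the abstract can be illustrated with a toy sketch. It is not the thesis implementation: it assumes a single input channel, one 3×3 kernel per channel group, and hand-picked fusion weights standing in for the learned 1×1 convolution. The function names (`effective_kernel_size`, `dilated_conv2d`, `group_dilated_conv`) are hypothetical. The sketch shows two things: dilation enlarges the area a kernel covers without changing its number of weights, and branches with different dilation rates can be merged per pixel.

```python
def effective_kernel_size(k, dilation):
    """Span in pixels covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation):
    """Zero-padded, same-size 2D cross-correlation of a single channel."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def group_dilated_conv(image, kernels, dilations, fuse_weights):
    """Run one kernel per channel group, each with its own dilation rate,
    then fuse the group outputs per pixel (the role of a 1x1 convolution)."""
    branches = [dilated_conv2d(image, k, d) for k, d in zip(kernels, dilations)]
    h, w = len(image), len(image[0])
    return [[sum(wt * b[y][x] for wt, b in zip(fuse_weights, branches))
             for x in range(w)] for y in range(h)]


# Toy input (assumption): one bright pixel in the middle of a 7x7 image.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
ones = [[1.0] * 3 for _ in range(3)]

fused = group_dilated_conv(image, [ones, ones],
                           dilations=[1, 2], fuse_weights=[0.5, 0.5])

# Weight count depends only on kernel shapes, not on the dilation rates.
n_params = sum(len(k) * len(k[0]) for k in [ones, ones])
```

In this toy setup the dilation-1 branch responds only within one pixel of the bright pixel, while the dilation-2 branch responds two pixels away, so the fused map mixes both scales with the same number of kernel weights as two ordinary 3×3 convolutions.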
Keywords/Search Tags:semantic segmentation, convolutional neural network, group dilation convolution, cascade network