Font Size: a A A

Research And Application On Semantic Segmentation Based On Multi-scale Contex Information

Posted on:2021-03-12Degree:MasterType:Thesis
Country:ChinaCandidate:Y X ShangFull Text:PDF
GTID:2428330605474873Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Image semantic segmentation is a pixel-level classification task,which classifies all pixels in an image according to the semantic content they represent.It acts a pivotal part in the application scenarios of remote sensing image interpretation,autonomous driving,medical image analysis,and drone navigation.With the rapid development of automatic driving technology and the improvement of image understanding requirements,the existing image recognition can no longer meet the refinement needs because it can only draw certain defined frames,so image semantic segmentation has become an important method for image recognition and understanding in autonomous driving tasks.Because the street scene image has the characteristics of large scale variation,occlusion and manual labeling difficulty,the image semantic segmentation is difficult.Based on spatial pyramid pooling to extract multi-scale context information of images,the problem of image semantic segmentation is mainly studied based on the street scene.The main research contents are as follows:(1)For the semantic segmentation of street scene images based on the spatial pyramid pooling method,it is easy to be affected by lighting and occlusion.When merging multi-scale features,it is easy to ignore the boundary information,and the classification of categories with few target pixels is not accurate.This paper proposes a xiphoid spatial pyramid pooling method which incorporate detailed information,and further propose an encoder-decoder structure model.The model extracts the multi-scale contextual information integrated into the low-level features through xiphoid spatial pyramid pooling method based on atrous convolution and average pooling operations.And it gradually restores the target boundary through the decoder.Experimental results on the public dataset Cityscapes,it is proved that the method performs better in semantic segmentation of images under the conditions of occlusion and fewer target pixels.(2)Aiming at the problem that the performance of the segmentation network model depends on the image generation network when training with generated images,and the parameters of the image generation network will be fixed when training the segmentation model and cannot be effectively updated.And the accuracy is not balanced between categories when segmenting different categories.In order to solve these two problems,this paper proposes a bidirectional domain adaptive learning method based on maximum squared loss.This method effectively update the image generation network when training the segmentation network model through the alternating learning between image generation network and segmentation adaptive network,further improving the performance of image generation and image segmentation.And it uses the maximum squared loss to alleviate the current class imbalance problem in the current domain adaptive learning.Experimental results on the synthetic dataset GTA5 and the real scene dataset Cityscapes,it is proved that this method can improve the class imbalance problem existing in the current bidirectional domain adaptive learning method and show good image segmentation performance.(3)Based on the above research results,the method of semantic segmentation is extended to the field of image recognition,and a street scene recognition application based on the semantic segmentation model is designed.The application mainly includes two modules of model training and street scene image recognition.For the image to be recognized,the image is segmented using a semantic segmentation method,and a corresponding category label is added to the segmented image,and the recognition result is finally output.The practical operation of the system shows that the system can effectively identify street scenes.
Keywords/Search Tags:Semantic Segmentation, Deep Convolution Neural Network, Spatial Pyramid Pooling, Domain Adaptation Learning
PDF Full Text Request
Related items