Font Size: a A A

Research And Application Of Image Semantic Understanding And Segmentation In Limited Scene

Posted on:2021-03-07Degree:MasterType:Thesis
Country:ChinaCandidate:Q WangFull Text:PDF
GTID:2428330611980575Subject:Electronic and communication engineering
Abstract/Summary:PDF Full Text Request
The purpose of image semantic segmentation technology is to assign category labels to each pixel in the image.Because this technology is closely related to various intelligent applications such as unmanned driving,it has become a research hotspot in computer vision.The rapid development of deep learning has promoted breakthrough research in the field of semantic segmentation.The method of using convolutional neural networks for semantic segmentation is far better than other methods.After continuous exploration and research in the field of deep learning algorithms,it was found that: a large number of convolution and pooling operations in the convolutional neural network reduce the resolution of the image,resulting in a certain loss of spatial information in the pixels,and because the neural network performs feature extraction and learning in units of pixels,the overall understanding of the image is lacking.Therefore,the existing network still lacks in the classification of the categories of objects.Based on the above problems,this article uses a scene context encoding module to encode and learn scene information contained in images in natural scenes.In this method,image segmentation is closely related to the scene in the image,and the semantics of the image are understood globally and the semantic information of the image in the image scene is used to guide the task of semantic segmentation.Therefore,the overall pixel classification is more reasonable and the segmentation results are more accurate.The specific research contents are as follows:(1)This paper proposes a semantic segmentation network based on limited scene context encoding,which incorporates a scene encoding module.Using the scene information in the image helps to understand the object category of the pixels in the image,and the scene information can be used to limit the category classification of the image pixels to the category that matches the scene information.(2)Through the understanding of the image scene information,the scene information can not only guide the network to complete the semantic segmentation task of the image,but also contribute to the classification task of the scene.Therefore,this paper uses the scene coding module to design a multi-task convolutional neural network,which can achieve the purposes of scene classification and image segmentation.(3)In addition to the commonly used evaluation indicators for semantic segmentation,this paper proposes to use gradient vector histogram to analyze the segmented images,compare the experimental segmentation results with the analysis results of the real label images,and use the similarity of the analysis results to be more intuitive The performance of the final model.This article uses the ADE20 K data set for network training and verification under the Py Torch framework,and rearranges the classification labels in the data set for learning in the classification task.The trained network model obtained 78.96% pixel accuracy and 42.54% pixel merge ratio,and was able to obtain 70.80% Top-1 accuracy and 87.90% Top-5 accuracy in scene classification tasks.It is proved that the use of scene information can reduce the probability of unreasonable categories in semantic segmentation,make the overall image segmentation effect more ideal,and at the same time can achieve the task of classifying scenes.
Keywords/Search Tags:image semantic segmentation, scene encoding, multi-task network, convolutional neural network
PDF Full Text Request
Related items