Image Semantic Segmentation Based On Encoder-decoder Network And Its Applications

Posted on: 2022-10-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C L Peng
Full Text: PDF
GTID: 1488306497488414
Subject: Communication and Information System
Abstract/Summary:
In recent years, deep learning has achieved great success in computer vision thanks to its powerful feature extraction and model fitting ability. As an important branch of computer vision, image semantic segmentation has attracted much attention, and a considerable body of research addresses it. Improving the accuracy, speed and applicability of semantic segmentation methods has become a focus of the community. However, recent methods still suffer from several problems: low utilization of features, high computational complexity, and a gap between research results and practical application. To overcome these challenges, this thesis develops a series of deep models based on the encoder-decoder network, covering the following four aspects.

First, to address the problem that existing segmentation networks cannot achieve high-quality segmentation results, we combine the encoder-decoder network with the pyramid-structure-based network to improve the spatial segmentation accuracy of the network. In particular, to improve the efficiency of the pyramid structure on high-level feature maps, we develop a novel pyramid structure called stride spatial pyramid pooling (SSPP), which raises the utilization rate of the high-level feature map from 2.24% to 56.25%. Furthermore, we propose a decoder named mutual guidance attention decoder (MGAD) for the efficient fusion of feature maps from different levels. The decoder eliminates the information gap between the high-level and low-level feature maps, which benefits their fusion. The experimental results demonstrate that the proposed network and modules significantly improve segmentation accuracy.

Second, to address the problem that existing real-time segmentation networks cannot output feature maps with accurate information, we refine the MGAD and propose a new decoder named self guidance attention decoder (SGAD). Unlike the MGAD, the SGAD changes the optimization objects of the attention branches, which compensates for the weak information-capturing ability of the hierarchical feature maps caused by the lightweight baseline network. In addition, a pooling fusion module is proposed to eliminate the information gap between the high-level and low-level feature maps, leading to an accurate fusion result. The experimental results verify that the encoder-decoder network based on SGAD attains high segmentation accuracy at high speed.

Third, to deal with aerial images, which are large in resolution and contain a large number of small-scale objects, we introduce the SGAD into the aerial scene and modify the SGAD and the whole network accordingly. On the one hand, considering that aerial images do not have complex detail and boundary information, we remove the spatial branch and pooling layer of the SGAD, which improves the inference speed of the network. On the other hand, to address the problem that the encoder-decoder network overuses information from the high-level feature maps to refine the low-level feature maps, we propose a layered encoder-decoder network (LEDN), which significantly improves the refinement efficiency of the low-level feature maps and the ability of the network to capture small-scale information. The experimental results show that the LEDN captures more accurate small-scale semantic information in the aerial scene than state-of-the-art networks, and that it runs faster than the other methods. It therefore offers a new perspective on the problem of capturing small-scale semantic information in aerial scenes.

Finally, we use the encoder-decoder network based on SGAD to measure microscopic images of wheat stalk cross sections, and we address two problems in this task. On the one hand, we propose a new network to address the insufficient staining of the sclerenchyma. The proposed network consists of two branches. The first is the segmentation branch, which performs the basic segmentation function of the network. The second is the sclerenchyma recovering branch, which refines the segmentation result of the sclerenchyma. On the other hand, when an ordinary image cropping method is used to augment the training dataset, it generates a large number of sub-image pairs whose pixels are all background pixels, leading to a class imbalance problem. To address this problem, we propose a sector ring image cropping method. All the sub-images generated by the proposed method include three different classes, which greatly reduces the influence of class imbalance. Extensive experimental results demonstrate that the proposed network and image cropping method outperform the current state-of-the-art segmentation methods by a large margin, which provides important technical support for phenotypic studies of the wheat stalk.
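
The class imbalance argument for sector ring cropping can be illustrated with a small synthetic example. The following is a minimal sketch, not the method from the thesis: the concentric label map, the radii, the tile size, the sector count and the function names are all assumptions chosen for illustration. It compares ordinary square tiling with angular sectors of an annulus and counts which classes appear in each crop.

```python
import math

SIZE = 64                     # synthetic label map is SIZE x SIZE (assumption)
CENTER = (SIZE - 1) / 2.0     # centre of the synthetic cross section
R_CAVITY, R_MID, R_OUTER = 10.0, 18.0, 26.0  # hypothetical radii in pixels


def make_label_map():
    """Concentric toy labels: 0 = background, 1 = inner tissue, 2 = outer tissue."""
    labels = []
    for y in range(SIZE):
        row = []
        for x in range(SIZE):
            r = math.hypot(x - CENTER, y - CENTER)
            if r < R_CAVITY or r >= R_OUTER:
                row.append(0)   # hollow centre and area outside the stalk
            elif r < R_MID:
                row.append(1)
            else:
                row.append(2)
        labels.append(row)
    return labels


def grid_crops(labels, tile):
    """Ordinary square cropping: return the set of classes found in each tile."""
    crops = []
    for y0 in range(0, SIZE, tile):
        for x0 in range(0, SIZE, tile):
            crops.append({labels[y][x]
                          for y in range(y0, y0 + tile)
                          for x in range(x0, x0 + tile)})
    return crops


def sector_ring_crops(labels, n_sectors, r_in, r_out):
    """Split the annulus r_in <= r < r_out into equal angular sectors and
    return the set of classes found in each sector."""
    crops = [set() for _ in range(n_sectors)]
    width = 2 * math.pi / n_sectors
    for y in range(SIZE):
        for x in range(SIZE):
            dx, dy = x - CENTER, y - CENTER
            r = math.hypot(dx, dy)
            if r_in <= r < r_out:
                theta = math.atan2(dy, dx) % (2 * math.pi)
                idx = min(int(theta / width), n_sectors - 1)
                crops[idx].add(labels[y][x])
    return crops


labels = make_label_map()
grid = grid_crops(labels, tile=8)
# The annulus is chosen to extend past both tissue boundaries (assumption),
# so every sector also picks up some background pixels.
rings = sector_ring_crops(labels, n_sectors=8,
                          r_in=R_CAVITY - 4, r_out=R_OUTER + 4)

background_only = sum(1 for classes in grid if classes == {0})
all_three = sum(1 for classes in rings if classes == {0, 1, 2})
print(background_only, "of", len(grid), "grid tiles are background only")
print(all_three, "of", len(rings), "sector rings contain all three classes")
```

In this toy setting the square tiles near the image corners contain only background, while every sector of the annulus crosses both tissue bands and therefore contains all three labels. The actual cropping geometry and class definitions used in the thesis may differ.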
Keywords/Search Tags: Deep learning, image semantic segmentation, convolutional neural network, encoder-decoder network