Research On Deep Learning Model For Semantically Segmenting High Resolution Remote Sensing Imagery By Considering Efficiency And Unlabeled Samples

Posted on: 2022-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Lin
GTID: 2480306740955719
Subject: Surveying and Mapping project
Abstract/Summary:
High-resolution remote sensing imagery contains rich detail, such as the shape and texture of ground objects, and the ground-object classification derived from it is widely used in agricultural, environmental and disaster applications. However, higher image resolution also increases scene complexity, data volume and intra-class variation. Traditional remote sensing image classification methods, which rely heavily on manually designed features and ignore high-level semantic information, therefore cannot classify ground objects in high-resolution imagery quickly and accurately. Fast and accurate classification methods suited to high-resolution remote sensing images are needed.

Deep learning can extract deep semantic features from images accurately and efficiently, and it has become an important approach to semantic segmentation (i.e. classification) of high-resolution remote sensing images. However, shortcomings remain. Current deep learning models for this task fall mainly into two categories: U-shaped structure models and dilated FCN structure models. The U-shaped structure model reduces computation and speeds up inference by decreasing the resolution of the feature map step by step, but at the cost of accuracy. The dilated FCN structure model is more accurate, but it usually requires more computation and infers more slowly. Neither type of model balances classification speed, accuracy and computational cost. In addition, existing deep learning models need a large number of labeled training samples to achieve good results and cannot make full use of unlabeled samples.

This thesis addresses these issues through the following work:

(1) The structures of the U-shaped structure model and the dilated FCN structure model are studied in depth, the advantages and shortcomings of each are analyzed and summarized, and the reasons why the U-shaped structure model segments less accurately than the dilated FCN structure model are identified.

(2) To address the two problems of feature alignment and limited fusion receptive field in the U-shaped structure model, this thesis proposes the feature alignment module (FAM) and the multi-scale fusion module (MSFM), respectively. On this basis, DFDNet, a semantic segmentation model for high-resolution remote sensing images that balances speed, accuracy and computation, is constructed.

(3) On the UAVid dataset, detailed ablation experiments are conducted for each module of DFDNet, and the model is compared in detail with UNet, PSPNet, DeepLabv3 and OCNet in terms of segmentation accuracy, speed and amount of computation. The effectiveness of DFDNet is then further verified on the Vaihingen and Potsdam datasets.

(4) To address the heavy reliance of deep learning segmentation models on large numbers of labeled training samples, this thesis combines DFDNet with a generative adversarial network and introduces a semi-supervised training method to construct a semi-supervised semantic segmentation model for high-resolution remote sensing images. The effectiveness of this model is verified on the Vaihingen dataset.

Based on the above work, this thesis draws the following conclusions. The computation required by the proposed DFDNet model is 1/4 of that of the dilated FCN structure model, and its inference speed is twice as fast. Its semantic segmentation accuracy on high-resolution remote sensing images is comparable to or even higher than that of the dilated FCN structure model, which verifies the effectiveness of the proposed method. When the semi-supervised semantic segmentation model is trained with 1/2, 3/4 and 7/8 of the training samples unlabeled (the remaining 1/2, 1/4 and 1/8 being labeled), the segmentation accuracy is 76.08%, 74.67% and 69.99%, respectively. When only the labeled 1/2, 1/4 and 1/8 of the samples are used for training, the accuracy is 73.96%, 71.05% and 67.63%, which is 2.12, 3.62 and 2.36 percentage points lower than with the unlabeled samples added. This indicates that the semi-supervised semantic segmentation model can use unlabeled samples to supplement the labeled training samples, thus reducing the reliance of deep learning semantic segmentation of high-resolution remote sensing images on labeled training samples.
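
The gain from adding unlabeled samples can be checked directly from the accuracy figures quoted above. The following is a minimal Python sketch of that arithmetic. The dictionary and function names are illustrative assumptions and do not come from the thesis, and the abstract does not state which accuracy metric is being reported.

```python
# Accuracy figures (percent) as quoted in the abstract, keyed by the
# fraction of training samples that are labeled. The metric itself is
# not named in the abstract, so it is treated as a generic "accuracy".
semi_supervised = {"1/2": 76.08, "1/4": 74.67, "1/8": 69.99}
labeled_only = {"1/2": 73.96, "1/4": 71.05, "1/8": 67.63}


def accuracy_gains(semi, supervised):
    """Return the gain in percentage points for each labeled fraction."""
    return {frac: round(semi[frac] - supervised[frac], 2) for frac in semi}


gains = accuracy_gains(semi_supervised, labeled_only)

for frac, gain in gains.items():
    print(f"labeled fraction {frac}: +{gain:.2f} percentage points")
```

On these figures the gain is positive at every labeled fraction and largest when 1/4 of the samples are labeled.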
Keywords/Search Tags: deep learning, high-resolution remote sensing imagery, semantic segmentation, image classification, convolutional neural network