High spatial resolution remote sensing images possess abundant spectral and geometric features. By thoroughly learning these features through deep learning, efficient and accurate extraction of rooftops can be achieved. However, although deep learning-based methods can automatically learn features for extracting rooftops, they require sufficient high-quality samples. In practical scenarios, obtaining such samples is challenging due to subjective factors and time constraints. Moreover, it is difficult to ensure the effectiveness of the automatically learned features, and manual intervention based on practical situations is cumbersome. Therefore, it is necessary either to select samples in light of actual production activities and use a deep learning model to extract rooftops efficiently, or to learn the samples thoroughly with more complex deep learning models to extract rooftops with high precision. To address these issues, this paper uses the open-source WHU building dataset and the self-made Ganzhou building dataset as data sources, and studies a deep learning-based rooftop extraction method for high spatial resolution remote sensing images combined with sample selection. The specific research work is introduced below.

(1) Rooftop extraction based on Unet improved by ResNet
To achieve efficient rooftop extraction, an improved Unet based on ResNet (Unet(ResNet)) is proposed. First, the structure design of Unet(ResNet) is discussed. Then, the limitations of using only Unet for rooftop extraction are analyzed, as well as the advantages of using Unet(ResNet). Finally, the rooftop extraction performance and accuracy of different ResNet architectures as encoders are compared. It is determined that using the ResNet50 architecture as the encoder to improve Unet achieves high rooftop extraction accuracy while maintaining training efficiency.

(2) Rooftop extraction based on Unet++ improved by ResNeXt
Although Unet(ResNet) can extract rooftops quickly and effectively, a more complex model is needed to fully utilize building samples for effective learning and achieve higher accuracy. Therefore, Unet++, an improved version of Unet, and ResNeXt, an improved version of ResNet, are combined into a ResNeXt-based improved Unet++ (Unet++(ResNeXt)) to extract rooftops. It is compared with Unet(ResNet) and with different networks and encoders in terms of rooftop extraction accuracy and model training efficiency.

(3) Sample selection optimization method based on image similarity
In deep learning-based rooftop extraction, the building dataset contains images with different urban distribution structures and spectral features. Since the similarity between test images and training images varies, image similarity is measured using hash algorithms and histogram algorithms, and the images are ranked by similarity. By setting step sizes based on the image similarity rankings, the influence of different sample selection quantities on rooftop extraction results and accuracy is discussed, and the optimal sample quantity is selected. Finally, rooftop extraction is performed using Unet(ResNet) and Unet++(ResNeXt) combined with sample selection, and the significance of combining these two improved models with sample selection is discussed.

This article presents a study on a deep learning-based rooftop extraction method for high spatial resolution remote sensing images, using open-source datasets, including the aerial imagery dataset and the satellite dataset I (global cities) from the WHU building dataset, as well as a self-made Ganzhou building dataset. Rooftop extraction was evaluated using the Intersection over Union (IoU), Recall, Precision, Accuracy, and F1 score metrics. The training efficiency of the models was evaluated by the Time required to complete one epoch. The F1 score and Time of rooftop extraction using Unet(ResNet) are 0.9878 and 05:22 on the aerial imagery dataset, 0.8995 and 00:14 on the satellite dataset I (global cities), and 0.9106 and 00:35 on the Ganzhou building dataset. The F1 score and Time of rooftop extraction using Unet++(ResNeXt) are 0.9885 and 14:07 on the aerial imagery dataset, 0.9284 and 00:34 on the satellite dataset I (global cities), and 0.9377 and 01:32 on the Ganzhou building dataset. Comparing the two models, Unet(ResNet) showed higher efficiency while maintaining high accuracy. With sample selection, higher efficiency and accuracy can be achieved for rooftop extraction. Unet++(ResNeXt) showed lower efficiency but achieved higher accuracy, and it can achieve high-precision rooftop extraction without the need for sample selection.
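
The evaluation metrics named above can be illustrated with a minimal sketch. It assumes the standard pixel-wise definitions for binary segmentation, with rooftop pixels as the positive class; the abstract does not state the exact formulas used, and the function names `confusion_counts` and `rooftop_metrics` are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized binary masks.

    Masks are nested lists of 0/1, where 1 marks a rooftop pixel (assumption).
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def rooftop_metrics(pred, truth):
    """Return IoU, Precision, Recall, Accuracy and F1 for one mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, truth)

    def ratio(numerator, denominator):
        # Guard against empty classes; returning 0.0 here is a convention,
        # not something the source specifies.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(rooftop_metrics(predicted, reference))
```

In practice the counts would be accumulated over all test tiles before the ratios are taken, so that large and small tiles contribute in proportion to their pixel counts.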
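
The similarity-based sample selection in (3) can be sketched as follows. The abstract names only "hash algorithms and histogram algorithms", so the choices here are assumptions made for illustration: an average hash compared bit by bit, a histogram intersection, an equal-weight combination of the two scores, and small grayscale images (values 0 to 255) that have already been resized to a common shape. All function names are hypothetical.

```python
def average_hash(image):
    """Average hash: one bit per pixel, set when the pixel exceeds the mean."""
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    return [1 if value > mean else 0 for value in pixels]


def hash_similarity(image_a, image_b):
    """Fraction of matching hash bits (1 minus normalized Hamming distance)."""
    hash_a, hash_b = average_hash(image_a), average_hash(image_b)
    matches = sum(1 for a, b in zip(hash_a, hash_b) if a == b)
    return matches / len(hash_a)


def histogram(image, bins=8, max_value=256):
    """Normalized gray-level histogram with equally wide bins."""
    pixels = [value for row in image for value in row]
    counts = [0] * bins
    for value in pixels:
        counts[min(value * bins // max_value, bins - 1)] += 1
    return [count / len(pixels) for count in counts]


def histogram_similarity(image_a, image_b, bins=8):
    """Histogram intersection: 1.0 for identical gray-level distributions."""
    hist_a, hist_b = histogram(image_a, bins), histogram(image_b, bins)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def rank_training_images(test_image, training_images):
    """Rank training images by similarity to the test image, most similar first.

    The equal weighting of the two scores is an assumption of this sketch.
    """
    scored = []
    for name, image in training_images.items():
        score = 0.5 * hash_similarity(test_image, image) \
            + 0.5 * histogram_similarity(test_image, image)
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


def subsets_by_step(ranking, step):
    """Candidate training subsets: the top `step`, `2 * step`, ... images."""
    names = [name for name, _ in ranking]
    return [names[:count] for count in range(step, len(names) + 1, step)]


if __name__ == "__main__":
    test_image = [[10, 200], [200, 10]]
    training_images = {
        "inverted": [[200, 10], [10, 200]],
        "dark": [[10, 20], [20, 10]],
        "same": [[10, 200], [200, 10]],
    }
    ranking = rank_training_images(test_image, training_images)
    print(ranking)
    print(subsets_by_step(ranking, 1))
```

Each candidate subset would then be used to train a model, and the subset size that gives the best extraction accuracy would be kept as the optimal sample quantity.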