Buildings are the main resource of urban and rural real estate, and building information has wide application value in urban planning, land grading, and map navigation. Because buildings are a frequently changing element of geographic data, it is important to monitor them dynamically in cities and suburbs and to extract their locations accurately and in a timely manner. The rapid development of remote sensing technology has made large amounts of remote sensing data available for urban planning, mapping, and disaster monitoring, and interpreting building targets from high-resolution remote sensing images has become the main method for compiling basic regional building statistics. Because high-resolution remote sensing images contain rich feature information and complex detail, accurate building extraction faces three problems: unbalanced building distribution, texture details that are difficult to resolve, and overlapping edges. To address these difficulties, this thesis integrates the improved attention valve and the feature pyramid attention module into a U-shaped network to extract buildings accurately from high-resolution remote sensing images. The main research work of this thesis is as follows:

(1) The basic model framework required for building extraction is highly complex. To balance network accuracy against computational cost, this thesis simplifies the fifth structural block of the U-Net encoder-decoder structure so that the model keeps stable accuracy with fewer parameters. Comparative experiments on a fully convolutional network, an encoder-decoder network, and the U-shaped basic framework evaluate model accuracy, extraction quality, computational cost, and parameter count. The experimental results show that the proposed U-shaped basic framework effectively controls the parameter count and computational cost of the model on the
basis of ensuring the extraction accuracy.

(2) Because high-resolution remote sensing images contain complex scenes and redundant data, networks have low applicability to building extraction from such images and suffer from missed detections, false detections, and blurred interiors of local buildings. This thesis improves the original attention gate module. After the position of the 'resampler' is adjusted to the Sigmoid function, a double activation function improves the expression of regular building information, and the attention valve module is proposed. Adding the attention valve module to the skip connection layers of the U-shaped basic framework highlights effective features and suppresses the expression of invalid information. The improved attention network is called Attention Gates U Network (AGs-Unet). Comparative experiments, ablation experiments, and scene verification experiments were carried out on the WHU and INRIA building datasets. The results show that AG modules of different dimensions extract features of different depths, and that AGs-Unet with a four-layer attention valve module visualizes buildings better than the comparison networks. Compared with several classical models, AGs-Unet improves the applicability of deep learning methods to building extraction tasks, and its accuracy and extraction quality are better than those of the other models. In scenes where buildings occupy a relatively small proportion of the image, however, a large share of the high-dimensional abstract features is lost after the network's upsampling through multi-layer feature transmission, so the network's handling of high-dimensional building detail needs further improvement.

(3) The loss of high-dimensional features and the weak detail processing of abstract features blur local building contour edges and cause squares to be mistakenly detected. Based on the weight-setting
idea in attention, this thesis introduces a global average pooling branch that outputs wide-area features on top of a spatial pyramid, assigns appropriate weights to feature maps of different scales, optimizes global high-dimensional features, and proposes a feature pyramid attention (FPA) module. On the basis of AGs-Unet, the feature pyramid attention module is embedded in the high-dimensional connection layer to enlarge the receptive field of the high-dimensional feature map and reduce the loss of detail during sampling. The optimized network is called Attention U Feature Pyramid Network (AFP-Net). On the WHU and Massachusetts datasets, four experiments were carried out: a segmentation result comparison experiment, an ablation experiment, a large-scale complex scene verification experiment, and a comparison of model parameter quantity and data quantity. The experimental results show that AGs effectively improve the acquisition of building features through attention valves and improve the accuracy of building boundary extraction, and that FPA reduces, to a certain extent, the false detection of protruding parts outside the building boundary. Together, the two modules improve the accuracy of extracted building boundaries and reduce false and missed detections. The AFP-Net proposed in this thesis effectively extracts building contour information and reduces missed and false detections while remaining lightweight. The improved attention network can therefore extract global target information from pixel-level features and avoid the segmentation errors caused by the loss of high-dimensional information, which improves the segmentation performance of the network and allows buildings to be extracted efficiently from high-resolution remote sensing images. It has broad application prospects in tasks such as urban planning, land grading, and map navigation that require dynamic monitoring of buildings in cities and suburbs.
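
The idea of gating skip-connection features, described in (2), can be illustrated with a minimal NumPy sketch of a generic additive attention gate. It is not the improved attention valve module of this thesis: the resampler is left out by assuming that the skip features and the gating signal already share the same spatial size, the 1x1 convolutions are written as plain matrices, and all names (`attention_gate`, `w_x`, `w_g`, `psi`) are hypothetical. It shows only the two activations in sequence (ReLU, then Sigmoid) and the per-pixel weighting of the skip features.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(x, g, w_x, w_g, psi):
    """Generic additive attention gate on one skip connection (illustrative).

    x   : skip-connection features, shape (C_x, H, W)
    g   : gating signal from the deeper layer, shape (C_g, H, W)
          (assumed to be already resampled to the same H and W as x)
    w_x : 1x1 convolution on x as a matrix, shape (C_int, C_x)
    w_g : 1x1 convolution on g as a matrix, shape (C_int, C_g)
    psi : 1x1 convolution to one channel as a vector, shape (C_int,)
    """
    # A 1x1 convolution is a per-pixel linear map over channels.
    q = np.einsum("ic,chw->ihw", w_x, x) + np.einsum("ic,chw->ihw", w_g, g)
    q = np.maximum(q, 0.0)                            # first activation: ReLU
    alpha = sigmoid(np.einsum("i,ihw->hw", psi, q))   # second activation: Sigmoid
    # Each pixel of x is scaled by its attention coefficient in [0, 1].
    return x * alpha[None, :, :], alpha


# Usage with random data (weights are random, not trained).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))
g = rng.normal(size=(16, 16, 16))
w_x = 0.1 * rng.normal(size=(4, 8))
w_g = 0.1 * rng.normal(size=(4, 16))
psi = rng.normal(size=4)

gated, alpha = attention_gate(x, g, w_x, w_g, psi)
print(gated.shape, alpha.shape)
```

Because every coefficient lies between 0 and 1, the gate can only attenuate skip features, which is how such a module suppresses responses from background regions while passing building features to the decoder.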