Semantic segmentation of remote sensing images is an emerging technology that has developed alongside remote sensing itself, and it is an important application of intelligent remote sensing interpretation and computer vision. Its goal is full-scene, pixel-level semantic recognition of remote sensing images. The rapid development of spaceborne and airborne platforms such as satellites and UAVs, together with advances in remote sensing sensors, has greatly enriched the variety and volume of remote sensing imagery. Massive multi-source remote sensing data lays a solid foundation for efficient semantic segmentation research, but it also imposes higher requirements on the accuracy and generalization of segmentation models. In recent years, driven by massive data, improved algorithms, and growing computing power, deep convolutional neural networks have made major breakthroughs in image processing thanks to their powerful nonlinear feature learning capabilities. Inspired by this, and targeting the characteristics of remote sensing images and the limitations of existing methods, we conduct in-depth research on applying deep learning to remote sensing image semantic segmentation from the perspectives of enhanced feature representation, discriminative and complementary feature mining from multi-source data, and lightweight model design. The main contents of this study are as follows:

1. A remote sensing semantic segmentation method based on edge-aware and spectral-spatial information aggregation is proposed to alleviate the edge semantic ambiguity to which remote sensing images are susceptible and the difficulty of representing spectral-spatial features. It improves the semantic awareness of the model by transforming spectral information, spatial context, and edge information into effective feature representations. Specifically, we first design a two-stream spectral-spatial
feature extraction network that enhances discriminative feature representation via a 3D hybrid convolutional network and a multi-stage aggregation network. Second, a Siamese edge-aware network and a multi-level edge loss function are designed to eliminate the effect of edge semantic ambiguity. Experiments on two public datasets demonstrate that the proposed method effectively enhances feature representation and reduces edge semantic ambiguity, while also achieving a balance between speed and accuracy.

2. A remote sensing image semantic segmentation method based on multi-source collaborative enhanced fusion is proposed to alleviate the difficulty of mining complementary features from multi-source remote sensing data and the large scale variation of remote sensing objects. It enhances the model's understanding of complex scenes by building multi-source semantic feature extraction methods and fusion strategies. Specifically, we first design a collaborative enhanced fusion module to mine the complementary features of multi-source remote sensing images. Second, we propose a multi-scale feature decoder that improves the model's ability to represent small objects and features with large scale variation by learning scale invariance. Experiments on two public datasets demonstrate that the proposed method effectively reduces the impact of redundant information on model performance and improves the model's ability to perceive complementary features.

3. A multi-source remote sensing semantic segmentation method based on differential feature fusion is proposed to alleviate two problems: existing upsampling methods ignore pixel-level semantic differences, which degrades the quality of feature reconstruction, and discriminative features are difficult to mine from multi-source data. It enhances the feature representation ability of the model through a multi-source attention fusion mechanism and rich contextual information in the decoding stage. Specifically, we first achieve full
fusion of multi-source remote sensing features through a differential feature fusion module and an unsupervised adversarial loss. Second, we propose a shallow pixel-guided upsampling strategy that better reconstructs decoded features without introducing extra parameters. Experiments on two public datasets demonstrate that the proposed method effectively fuses multi-source data features, enriches contextual information, and improves the quality of feature reconstruction.

4. A remote sensing semantic segmentation method based on multi-granularity semantic alignment distillation learning is proposed to alleviate the problem that large-scale remote sensing semantic segmentation models are difficult to deploy on devices with limited resources and strict real-time requirements. Specifically, we first design a local pixel difference distillation module to mine the consistency of pixel-level and class-level feature distributions, which effectively alleviates the impact of intra-class differences on segmentation performance. Second, we propose a global affinity discrimination module to model higher-order affinity spatial relationships, which enables the student network to learn more of the internal structural information of the teacher network. Experiments on three backbone networks and two public datasets demonstrate that the proposed method maintains segmentation accuracy while reducing the number of model parameters.

The research presented in this thesis not only enriches the study of deep feature learning for remote sensing image semantic segmentation, but also expands the application scope of deep learning algorithms and promotes the research and development of deep learning in the field of intelligent remote sensing interpretation. There are 47 figures, 25 tables, and 157 references in this thesis.
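The abstract does not give the concrete form of the distillation objective used in contribution 4. As a hedged illustration of the kind of pixel-level distillation loss that teacher-student segmentation methods typically build on, the following sketch computes a temperature-scaled KL divergence between teacher and student class distributions at every pixel. The function names, the temperature `T`, and the T-squared scaling are generic conventions from the distillation literature, not details taken from this thesis:

```python
import numpy as np

def softmax(logits, T=1.0, axis=-1):
    # Temperature-scaled softmax over the class axis.
    z = logits / T
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pixel_distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between teacher and student class distributions,
    averaged over all pixels. Logit shapes: (H, W, num_classes)."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # Per-pixel KL(teacher || student), summed over the class axis.
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=-1)
    return float(kl.mean()) * T * T  # conventional T^2 gradient scaling

# Toy example: 4x4 logit maps with 3 classes.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 4, 3))
student = rng.normal(size=(4, 4, 3))
loss_match = pixel_distillation_loss(teacher, teacher)      # identical logits -> zero loss
loss_mismatch = pixel_distillation_loss(student, teacher)   # mismatched logits -> positive loss
print(loss_match, loss_mismatch)
```

A matching pair of logit maps yields zero loss, so the quantity behaves as an alignment objective; methods like those described above would augment such a per-pixel term with local difference and global affinity constraints.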