In recent years, with the continuous development of satellite remote sensing technology, the volume of available high-resolution remote sensing imagery has grown rapidly. Compared with medium- and low-resolution imagery, high-resolution remote sensing images offer finer spatial detail, such as higher ground resolution and richer geometric and texture information, and therefore call for more efficient interpretation methods to make full use of the data. Traditional remote sensing interpretation methods built on bottom-up visual features cannot effectively extract the high-level semantic information contained in high-resolution imagery, whereas deep-learning-based semantic segmentation can effectively bridge the gap between low-level visual features and high-level semantics. Fully convolutional neural network models have been widely studied and applied to semantic segmentation of high-resolution remote sensing images. However, owing to the inherently limited receptive field of the convolutional kernel, they still fall short in capturing global pixel dependencies and multi-scale features. Moreover, simply deepening the network often loses shallow details and introduces a large number of parameters, reducing both the accuracy and the efficiency of segmentation. To address these issues, this paper studies deep neural networks from the perspectives of enhancing global semantic features, reducing redundant parameters, and fusing multi-scale features across layers, and makes the following main contributions:

(1) To reduce redundant parameters and enhance global semantic features, this paper proposes an ultra-lightweight convolutional-neural-network-based semantic segmentation method for high-resolution remote sensing images. The method adopts cross-layer multi-scale feature interaction and fusion together with efficient spatial pyramid convolution to extract multi-scale features while keeping the model lightweight. Combined with a cyclic cross-attention mechanism, it further strengthens the model's global features and improves segmentation accuracy in complex scenes. The effectiveness of the proposed method is verified on public datasets.

(2) To further enhance the model's global semantic perception and to account for the contextual relationships among semantic categories during multi-scale feature extraction, this paper proposes a multi-scale fully Transformer-based semantic segmentation method for high-resolution remote sensing images. The method uses a multi-level pyramid Vision Transformer to model global semantics and applies semantic category masks so that the correlation between features and semantic categories is considered when generating the final prediction. In addition, multi-scale features from different levels are enhanced and fused in the token space to further refine the prediction, and the model's effectiveness is verified on public datasets.

(3) To promote the practical application of intelligent remote sensing interpretation, this paper designs and implements a prototype system for semantic segmentation of high-resolution remote sensing images. Built on the global feature enhancement network proposed in this paper, the system performs multi-class semantic segmentation of remote sensing images and can be applied to land use/land cover mapping, urban planning, disaster area extraction, environmental monitoring, and related fields.
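The cyclic cross-attention mechanism in contribution (1) can be illustrated with a minimal sketch, assuming it operates like recurrent criss-cross attention: each position attends only to the H + W - 1 positions in its own row and column rather than all H x W positions, and stacking two passes lets information propagate across the entire feature map. The identity query/key/value projections and the toy feature map below are simplifications for illustration, not the thesis implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def criss_cross_attention(feat):
    """Each position attends to its row and column only (H + W - 1 neighbors).

    feat: (H, W, C) feature map; identity projections assumed for brevity.
    """
    H, W, C = feat.shape
    out = np.zeros_like(feat)
    for i in range(H):
        for j in range(W):
            # Gather the criss-cross neighborhood: full row i plus column j
            # (the center pixel is removed from the column to avoid counting it twice).
            row = feat[i, :, :]                                   # (W, C)
            col = np.delete(feat[:, j, :], i, axis=0)             # (H - 1, C)
            neigh = np.concatenate([row, col], axis=0)            # (W + H - 1, C)
            w = softmax(neigh @ feat[i, j])                       # attention weights
            out[i, j] = w @ neigh                                 # weighted aggregation
    return out

# Two stacked ("cyclic") passes spread each pixel's influence to the full map.
x = np.random.rand(4, 5, 3)
y = criss_cross_attention(criss_cross_attention(x))
print(y.shape)  # (4, 5, 3)
```

Restricting attention to rows and columns is what makes this kind of mechanism attractive for lightweight models: the per-pixel cost drops from O(HW) to O(H + W), while the second pass recovers full-image context.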