
Semantic Segmentation Of Remote Sensing Images Based On Coupled Dual-path Features

Posted on: 2024-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: D D Feng
GTID: 2542306929973749
Subject: Resources and environment
Abstract/Summary:
Semantic segmentation of remote sensing images is becoming increasingly vital in fields such as urban planning, autonomous driving, disaster monitoring, and land cover classification. Traditional remote sensing image segmentation methods are easy to understand and apply, have low data and hardware requirements, and run quickly, but they generalize poorly and, because of their limited data processing capacity, are not suitable for high-precision segmentation of complex samples. Deep learning methods, on the other hand, can process a wide range of images at low economic cost and with high processing efficiency, but they have poor interpretability and require large datasets and high-end hardware. To overcome the limitations of both families of methods, this thesis proposes a hybrid approach that uses them simultaneously. The features extracted along the two paths are combined so that they complement each other, taking the low-level features of the image into account while optimizing the neural network. This approach addresses three key problems in semantic segmentation of remote sensing images: semantic segmentation of data with unbalanced targets and backgrounds, multi-scale segmentation of high-resolution remote sensing images, and instance segmentation with multivariate data fusion. The specific contributions of this thesis are as follows:

(1) It is challenging for a neural network to extract features effectively from datasets with limited data and unbalanced targets and backgrounds. To address this issue, this thesis proposes a semantic segmentation method called GLCM-U-Net, which couples the U-Net architecture with the Grey Level Co-occurrence Matrix (GLCM). GLCM-U-Net feeds the data into a small parameter prediction network that predicts the GLCM parameters adaptively. Texture features are then extracted from the image using the GLCM with the predicted parameters, and the resulting feature maps are transformed and unified in dimension, direction, and channel. The transformed features are then passed to a texture supervision module for target feature correction. The corrected feature map is coupled to each stage of the U-Net decoder to strengthen the construction of feature information and weaken the influence of unbalanced targets and backgrounds during segmentation. Results show that, on a typical pavement crack dataset with unbalanced targets and backgrounds, GLCM-U-Net achieves a 4.05% higher mIoU than U-Net. On a remote sensing dataset of buildings, the segmentation accuracy is 70.05%.

(2) High-resolution remote sensing images are challenging for multi-scale, large-scale, and high-precision semantic segmentation because of their high intraclass diversity and low interclass separability. To address this issue, this thesis proposes Swin-S-GF, a remote sensing image semantic segmentation method based on the Swin Transformer and the Gabor filter. First, the Swin-S backbone network extracts image information at different levels. Then, the texture and edge features of the input image are extracted by Gabor filtering, and the multilevel features are fused using a feature aggregation module and an attention embedding module. Finally, the segmentation results are refined using a fully connected conditional random field. The method achieves mIoU values of 80.14%, 66.50%, and 70.61% on the large-scale land classification set, the fine land cover classification set, and the "AI+RS" semantic segmentation dataset, respectively, which differ in scale. Compared with the second-best method, DeepLabV3, the mIoU of the proposed method is higher by 0.67%, 3.43%, and 3.80% on these respective datasets.

(3) Image instance segmentation often relies solely or excessively on deep learning networks, with little fusion of multivariate data or combination of approaches. Given the lack of data support and quantitative analysis in current research on urban villages, this thesis takes the main urban area of Guangzhou as the study area and, based on multivariate spatial data such as high-resolution remote sensing images (GF-1), building contours, and points of interest (POI), uses the deep learning tool in ENVI to extract urban village boundaries. The correct identification rate of urban villages is 64.31%. Because the extraction results confuse urban villages with some old residential areas and industrial areas, the high-resolution remote sensing images were further divided by the road network to produce label data for urban villages. Combined with the support vector machine (SVM) method from machine learning, the extraction accuracy of this approach reaches 90.19%.
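As a rough illustration of the texture path described in (1), the sketch below computes a GLCM for one pixel offset and derives a few common texture statistics from it. It is not the thesis's implementation: the function names `glcm` and `texture_features`, the choice of statistics (contrast, homogeneity, energy), and the fixed offset and grey-level count are all assumptions made for this example. In GLCM-U-Net, parameters of this kind are said to be predicted adaptively by a small network rather than fixed by hand.

```python
import numpy as np


def glcm(image, dx, dy, levels):
    """Normalized, symmetric grey level co-occurrence matrix.

    image  : 2-D array of integer grey levels in [0, levels)
    dx, dy : column and row offset between the two pixels of a pair
    levels : number of grey levels (hypothetical parameter name)
    """
    img = np.asarray(image).astype(np.intp)
    if img.ndim != 2:
        raise ValueError("image must be 2-D")
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("pixel values must lie in [0, levels)")
    h, w = img.shape
    # Select every reference pixel whose neighbour at (dy, dx) is inside the image.
    r0, r1 = max(0, -dy), min(h, h - dy)
    c0, c1 = max(0, -dx), min(w, w - dx)
    ref = img[r0:r1, c0:c1]
    nbr = img[r0 + dy:r1 + dy, c0 + dx:c1 + dx]
    counts = np.zeros((levels, levels), dtype=np.float64)
    # Accumulate one count per (reference level, neighbour level) pair.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    counts += counts.T  # count each pair in both directions
    total = counts.sum()
    return counts / total if total > 0 else counts


def texture_features(p):
    """Contrast, homogeneity and energy of a normalized GLCM."""
    i, j = np.indices(p.shape)
    diff2 = (i - j) ** 2
    return {
        "contrast": float(np.sum(p * diff2)),
        "homogeneity": float(np.sum(p / (1.0 + diff2))),
        "energy": float(np.sum(p ** 2)),
    }


if __name__ == "__main__":
    patch = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [0, 2, 2, 2],
                      [2, 2, 3, 3]])
    p = glcm(patch, dx=1, dy=0, levels=4)  # horizontal neighbours, distance 1
    print(texture_features(p))
```

In a setup like the one the abstract describes, such statistics would be computed per window to form texture feature maps; how those maps are transformed and coupled to the U-Net decoder is specific to the thesis and is not reproduced here.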
Keywords/Search Tags:Semantic Segmentation of Remote Sensing Images, Grey Level Co-occurrence Matrix(GLCM), U-Net, Swin Transformer, Gabor Filter