Research On Semantic Segmentation Of Urban Remote Sensing Image Based On Attention Mechanism And Feature Fusion

Posted on: 2024-01-29 | Degree: Master | Type: Thesis
Country: China | Candidate: D B Ren
GTID: 2542307079992769 | Subject: Electronic Information (Communication Engineering) (Professional Degree)
Abstract/Summary:
Semantic segmentation of urban remote sensing images is one of the most crucial tasks in the field of remote sensing. High-resolution remote sensing images contain rich information on ground objects, such as shape, location, and boundary. Interpreting these images is exceedingly challenging because ground objects exhibit large intraclass variance and low interclass variance. In addition, small-scale objects in urban scenes are easily missed. These two factors greatly interfere with the correct interpretation of ground objects in the image. To address these problems in the semantic segmentation of high-resolution urban remote sensing images, this thesis covers image acquisition, quality improvement, semantic segmentation network design, training and optimization, and interpretation software design. The main research contents are as follows:

This thesis proposes a multi-scale channel attention fusion network (MCAFNet) based on a Transformer and a CNN, which mitigates blurred segmentation boundaries and the loss of small-target features. MCAFNet uses ResNet-50 and ViT-B/16 to learn the global-local context, which strengthens the semantic feature representation. Specifically, a global-local transformer block (GLTB) is deployed in the encoder stage. This design handles image details at low resolution and extracts global image features better than previous methods. In the decoder module, a channel attention optimization module and a fusion module are added to better integrate high- and low-dimensional feature maps, which enhances the network's ability to obtain small-scale semantic information. The proposed method is evaluated on the ISPRS Vaihingen and Potsdam datasets. Both quantitative and qualitative evaluations show that MCAFNet performs competitively against mainstream methods. In addition, extensive ablation experiments were performed on the Vaihingen dataset to test the effectiveness of the individual network components. The experimental results show that MCAFNet effectively improves segmentation accuracy, reaching 90.8% accuracy.

This thesis proposes a hybrid class attention network (HCANet) based on a Swin Transformer and a CNN, which addresses intraclass dissimilarity and interclass similarity in object segmentation and reduces the computational load of the model. The model uses the Swin Transformer in the encoder to extract features and model long-range spatial dependencies. In the decoder, a class channel attention (CCA) module and a class augmented attention (CAA) module are designed to calculate class-based correlations and recalibrate class-level information, and to reduce feature redundancy through region representation, which improves the efficiency of the self-attention mechanism. Extensive ablation experiments were carried out on the ISPRS Vaihingen and Potsdam datasets to test the effectiveness of the network modules, and the proposed method was compared with existing methods. The experimental results show that the proposed model achieves a balance between interpretation accuracy and computational complexity.

This thesis develops a smart-city remote sensing interpretation software using machine learning and deep learning technology. The software encapsulates the proposed semantic segmentation networks and mainstream interpretation models, and integrates remote sensing image processing functions. It also implements grayscale conversion, threshold segmentation, geometric transformation, image enhancement, blurring, noise processing, and other operations on remote sensing images to meet users' common image processing needs. The experimental results show that the interpretation software improves the segmentation accuracy and speed for high-resolution urban remote sensing images.
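The abstract does not give the internals of the channel attention optimization module or the fusion module in the MCAFNet decoder. As a rough illustration of the general idea only, the sketch below fuses a low-level and a high-level feature map and then reweights the channels with a squeeze-and-excitation style gate. The function names `channel_attention` and `fuse`, the weight matrices `w1` and `w2`, the reduction ratio, and the additive fusion are all assumptions made for this example, not the thesis's actual design.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map (SE-style gate, an assumption)."""
    # Squeeze: global average pooling over the spatial dimensions -> shape (C,)
    squeezed = features.mean(axis=(1, 2))
    # Excitation: bottleneck with ReLU, then sigmoid -> per-channel weights in (0, 1)
    hidden = np.maximum(w1 @ squeezed, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Scale each channel by its weight (broadcast over H and W)
    return features * weights[:, None, None], weights

def fuse(low_level, high_level, w1, w2):
    """Upsample the high-level map, add it to the low-level map, then apply the gate.

    Assumes both maps have the same channel count and that the low-level
    spatial size is an integer multiple of the high-level one.
    """
    scale = low_level.shape[1] // high_level.shape[1]
    # Nearest-neighbour upsampling by repeating rows and columns
    upsampled = high_level.repeat(scale, axis=1).repeat(scale, axis=2)
    return channel_attention(low_level + upsampled, w1, w2)

# Hypothetical sizes: 8 channels, reduction ratio 2
rng = np.random.default_rng(0)
C, r = 8, 2
low = rng.standard_normal((C, 16, 16))    # higher-resolution, low-level features
high = rng.standard_normal((C, 8, 8))     # lower-resolution, high-level features
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

out, weights = fuse(low, high, w1, w2)
print(out.shape, weights.shape)  # (8, 16, 16) (8,)
```

In a trained network `w1` and `w2` would be learned, and the upsampling and fusion operators would typically be learned or bilinear layers; random weights are used here only to make the sketch runnable.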
Keywords/Search Tags: remote sensing image, deep learning, convolutional neural network, self-attention mechanism, semantic segmentation