
Improved Coder-decoder Network Based On Attention Mechanism For Hyperspectral And LIDAR Data Fusion Classification Study

Posted on: 2024-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhang
GTID: 2542307058481944
Subject: Engineering
Abstract/Summary:
Advances in sensor technology have made it possible to acquire vast amounts of remote sensing data from multiple platforms and in multiple modalities, including hyperspectral images (HSI) and Light Detection and Ranging (LiDAR) data. HSI are rich in spectral information encoded in hundreds of narrow, contiguous spectral bands, and the strong discriminative power of spectral features helps capture subtle spectral differences. Classification and detection techniques based on hyperspectral data have been applied successfully in agriculture, medicine, the military and other fields. However, hyperspectral data carry strong spectral information but weak spatial information, so relying on HSI alone limits applications that depend on spatial resolution and creates a performance bottleneck when processing and analyzing remote sensing data acquired with a single sensor. LiDAR data provide information about the height and shape of the surface relative to the sensor and therefore complement HSI data; LiDAR data are also used for feature detection and extraction tasks. Used together, HSI and LiDAR data can effectively address the problem that HSI alone cannot reliably distinguish two complex land covers that produce similar spectral responses.

Deep learning models have been successful in land use and land cover classification and outperform traditional methods, owing to their ability to learn low-level and high-level features automatically with trainable kernels. Inspired by this success in extracting more discriminative features from data, some preliminary multimodal networks have been proposed, alongside the traditional methods, to exploit the complementary information between HSI and LiDAR data for remote sensing image classification. Because the data sources differ, the features of objects differ greatly between HSI and LiDAR data, including in data size and data structure. It is therefore difficult for classification models that simply sum or concatenate (tandem) features to achieve good performance on multimodal data. Several classification methods based on feature fusion and decision fusion have been developed for the joint classification of HSI and LiDAR data, and they have proven effective in fusing multiple remote sensing data sources. In remote sensing data, the use of structural information is very important, and the key to establishing complementary links between different data sources is to establish structural relationships between them. The fusion strategy is likewise one of the key factors that determine the performance of multimodal networks. How to establish complementary associations between multiple data sources effectively remains an open problem. Inspired by these works, this thesis proposes a fusion network model with a cross-channel reconfiguration feature structure, through which the connections between multiple data sources can be established and mined effectively, providing a better basis for subsequent accurate classification.

The contributions of this thesis are as follows.

(1) Inspired by the encoder-decoder structure, this thesis proposes a novel multimodal remote sensing data fusion classification framework: a hyperspectral and LiDAR data fusion classification method based on an improved encoder-decoder network, named CED-Net. The University of Houston 2013 dataset and the Trento dataset are used for the experiments. The encoder-decoder architecture is a relatively new fusion approach that tends to produce more compact feature representations. The cross-channel, cross-feature fusion network proposed here follows the common encoder-decoder architecture and extends it structurally. Cross-channel prediction allows features extracted from the different modalities to be fused more thoroughly. More specifically, the cross-reconfiguration model used in the feature fusion network enables effective information exchange and more compact fusion at the feature level. The experimental results are evaluated with overall accuracy (OA), average accuracy (AA) and the kappa coefficient (κ). Compared with other state-of-the-art methods, the proposed approach achieves the best results, demonstrating its effectiveness in improving the fusion classification accuracy of multimodal remote sensing data.

(2) This thesis also proposes a second framework for multimodal remote sensing data fusion classification: an improved encoder-decoder network based on a dual-attention mechanism. The University of Houston 2013 dataset and the Trento dataset are again used for the experiments, and the results are again evaluated with OA, AA and κ. The proposed method achieves superior scores on these metrics and, compared with CED-Net, delivers better fusion classification performance. This shows that the method can effectively improve the fusion classification of multimodal remote sensing data.
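
The three evaluation metrics named in the abstract, overall accuracy (OA), average accuracy (AA) and the kappa coefficient, can all be derived from a confusion matrix. The following is a minimal sketch in plain Python, not the thesis's own evaluation code. It uses the standard definitions (OA as the fraction of correctly classified samples, AA as the mean of per-class recalls, and Cohen's kappa as chance-corrected agreement). The function names `confusion_matrix` and `classification_scores` and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Build a confusion matrix: rows are reference classes, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def classification_scores(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (OA, AA, kappa) computed from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]                       # reference counts
    col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # predicted counts

    # Overall accuracy: correctly classified samples over all samples.
    oa = sum(cm[i][i] for i in range(n)) / total

    # Average accuracy: mean per-class recall, skipping classes with no reference samples.
    recalls = [cm[i][i] / row_sums[i] for i in range(n) if row_sums[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    pe = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    kappa = (oa - pe) / (1.0 - pe) if pe < 1.0 else float("nan")  # undefined when pe == 1
    return oa, aa, kappa


if __name__ == "__main__":
    # Toy labels for three hypothetical land-cover classes (illustrative only).
    y_true = [0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 1, 1, 1, 2]
    oa, aa, kappa = classification_scores(confusion_matrix(y_true, y_pred, 3))
    print(f"OA={oa:.4f}  AA={aa:.4f}  kappa={kappa:.4f}")
```

OA can be dominated by the largest classes, which is why AA and kappa are usually reported alongside it for land-cover datasets with unbalanced class sizes.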
Keywords/Search Tags:Hyperspectral Images, LIDAR, Attention Mechanism, Encoder-Decoder, Fusion Classification