The rich, continuous spectral information of hyperspectral remote sensing images reflects the detailed spectral characteristics of ground features, but their relatively low spatial resolution limits the expression of spatial information. Conversely, the few discrete bands of multispectral remote sensing images are insufficient for expressing spectral information, while their high spatial resolution supports a more detailed description of spatial morphology and feature distribution. Fusing hyperspectral and multispectral remote sensing images integrates the advantages of the two sources and can produce improved hyperspectral images with high spatial resolution, expanding the scope of remote sensing applications. This study reviews the current state of research on hyperspectral and multispectral remote sensing image fusion and, addressing the main problems of existing deep learning fusion methods, proposes four new deep learning fusion algorithms. Extensive experiments show that the proposed algorithms effectively improve fusion quality and significantly outperform state-of-the-art algorithms. The main contributions of this study are as follows.

1) MGDIN, a detail injection network based on multi-scale and global contextual features, is proposed. It takes the detail injection framework as a physical constraint, extracts multi-scale contextual features with residual multi-scale convolutions, and captures long-range dependencies of the image through a global context module. For model training, a new loss function integrating content, spectral, and edge losses is proposed to improve the spatial and spectral quality of the fused images. Experimental results on five datasets show that MGDIN performs well in terms of both fused-image quality and learning ability, and attains higher fusion accuracy with fewer iterations than the compared popular fusion algorithms. Comparative experiments also show that the proposed loss function outperforms commonly used loss functions.

2) Swin GAN, a hyperspectral and multispectral remote sensing image fusion network integrating Swin Transformer, CNN, and GAN architectures, is proposed to remedy the shortcoming of current fusion algorithms that focus on local information while ignoring long-range dependencies between pixels. Because parameter tuning in current algorithms mainly targets global consistency without considering spatial and spectral sub-constraints, this study proposes a new adaptive loss function that uses the L1 loss as the content loss and introduces adaptive spatial and spectral gradient losses to improve the spatial representation and spectral fidelity of the fused images. The generator of Swin GAN extracts the features of the hyperspectral and multispectral images separately through a two-branch input, fuses them in the feature domain, and injects the generated spatial residuals into the upsampled hyperspectral image to produce the fused image. The discriminator employs a pure CNN architecture to enhance the realism of the generated images. Fusion experiments at different scale factors on several datasets show that Swin GAN reconstructs both spatial and spectral information better than currently popular algorithms.

3) AMGSGAN, an adaptive multi-receptive-field guided implicit sampling generative adversarial network, is proposed; it improves interpolation accuracy by accounting for the correlations among pixels within different receptive fields. The generator of AMGSGAN consists of three modules: pre-encoding, multi-receptive-field feature extraction, and adaptive guided implicit sampling. The pre-encoding module encodes the features of the input images to enhance feature expression; the multi-receptive-field feature extraction module extracts features at multiple receptive fields by setting different convolutional dilation rates; and the adaptive guided implicit sampling module interpolates features using pixels at different distances from the pixel to be interpolated, adaptively generating the fused high-spatial- and high-spectral-resolution image from the interpolation results. The discriminator of AMGSGAN employs a pure CNN architecture and introduces batch normalization and adaptive average pooling layers to maintain network stability and improve convergence speed. Comparison experiments on several datasets show that AMGSGAN significantly outperforms currently popular algorithms at 4x, 8x, and 16x resolution ratios.

4) QIS-GAN, a new quadtree implicit sampling algorithm, is proposed; it performs hierarchical sampling in a quadtree manner and employs a lightweight structure as a shallow encoder to reduce the network burden. In addition, a generative adversarial framework is introduced to enhance the spatial and spectral representation of the fused images. QIS-GAN reduces the excessive computational complexity that current deep learning algorithms incur by stacking large numbers of CNNs and Transformers in the encoding stage, achieving a balance between lightweight design and fusion accuracy. Fusion experiments across several spatial-resolution differences show that QIS-GAN combines excellent fusion performance with a lightweight structure.

In summary, this study proposes four deep learning methods for hyperspectral and multispectral remote sensing image fusion that outperform currently popular fusion algorithms on several datasets and significantly improve the detailed representation of spatial and spectral features in the fused images. The methods can be extended to multi-source image fusion and applied to remote sensing image preprocessing in various fields.
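The detail-injection constraint (spatial residuals added to the upsampled hyperspectral image) and the composite content/spectral/edge loss described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the study's implementation: the loss weights and the spectral-angle form of the spectral loss are hypothetical choices for illustration.

```python
import numpy as np

def detail_injection(upsampled_hsi, residual):
    """Detail-injection framework: the network predicts spatial
    residuals that are added to the upsampled hyperspectral image."""
    return upsampled_hsi + residual

def content_loss(pred, target):
    # L1 content loss averaged over all pixels and bands
    return np.mean(np.abs(pred - target))

def spectral_loss(pred, target, eps=1e-8):
    # Per-pixel spectral angle between predicted and reference spectra
    # (an assumed form of the spectral term; scale-invariant)
    dot = np.sum(pred * target, axis=-1)
    norm = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    return np.mean(np.arccos(np.clip(dot / (norm + eps), -1.0, 1.0)))

def edge_loss(pred, target):
    # Compare horizontal and vertical gradient magnitudes to preserve edges
    gx = np.abs(np.diff(pred, axis=1)) - np.abs(np.diff(target, axis=1))
    gy = np.abs(np.diff(pred, axis=0)) - np.abs(np.diff(target, axis=0))
    return np.mean(np.abs(gx)) + np.mean(np.abs(gy))

def total_loss(pred, target, w_content=1.0, w_spec=0.1, w_edge=0.1):
    # Weighted sum of content, spectral, and edge terms (weights assumed)
    return (w_content * content_loss(pred, target)
            + w_spec * spectral_loss(pred, target)
            + w_edge * edge_loss(pred, target))

# Toy example: an 8x8 image with 4 spectral bands
rng = np.random.default_rng(0)
target = rng.random((8, 8, 4))
up = target * 0.9          # stand-in for the upsampled hyperspectral image
res = target - up          # a perfect residual, for illustration only
fused = detail_injection(up, res)
print(total_loss(fused, target))  # near zero for a perfect reconstruction
```

The spectral-angle term penalizes distortions of each pixel's spectral shape independently of brightness, which is why `up` alone (a uniformly scaled copy of `target`) incurs a content and edge penalty but essentially no spectral penalty.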