
Research on Infrared and Visible Image Fusion Methods Based on Deep Convolutional Neural Networks

Posted on: 2024-07-01  Degree: Master  Type: Thesis
Country: China  Candidate: J X Lu  Full Text: PDF
GTID: 2568307106482854  Subject: Signal and Information Processing
Abstract/Summary:
Image fusion is an important image-processing technique. It aims to generate an image containing the complementary information of the source images through specific feature-extraction and feature-fusion algorithms. The fusion of infrared and visible images is a popular research topic in this field. Infrared images contain thermal-radiation information, but owing to the limitations of infrared imaging sensors they are poor in texture detail and often fail to provide sufficient detail information. Visible images, in contrast, contain rich texture detail, and a fused image combines the complementary information of both, which benefits human visual perception. Infrared and visible image fusion can therefore generate images that carry more information than either source image, better match human visual perception, and benefit downstream tasks. Traditional signal-processing-based fusion methods generalize poorly and their performance degrades on complex fusion tasks, whereas deep learning has strong feature-extraction capability and produces better results. This thesis therefore studies infrared and visible image fusion based on deep learning.

First, this study improves existing methods by exploring two attention mechanisms in deep convolutional neural networks. Channel attention is combined with a deep convolutional network to filter multi-scale features, and the resulting method outperforms existing methods on publicly available datasets. Axial attention is also combined with a multi-scale deep network; experimental results show that this model retains more detail and edge information.

Second, this study proposes a fusion network for infrared and visible images based on a multi-scale Swin-transformer and an attention mechanism. The Swin-transformer extracts long-range semantic information at multiple scales, while the attention mechanism suppresses unimportant features and retains the main information. In addition, this thesis proposes a new hybrid feature-aggregation module, whose brightness-enhancement and detail-retention submodules are designed for the respective characteristics of infrared and visible images so as to retain more texture detail and infrared target information. The fusion method has three parts: an encoder, feature aggregation, and a decoder. The source images are first fed into the encoder to extract multi-scale depth features; the feature-aggregation module then fuses the depth features at each scale; finally, a decoder based on nested connections reconstructs the fused image. Experiments on publicly available datasets show that the proposed method achieves better fusion performance than other advanced methods and is optimal on several objective evaluation metrics; subjectively, it preserves more edge detail in its results.

Third, existing image fusion methods suffer from loss of detail, artifacts, and inconsistencies. To address these problems, this study further explores a feature-extraction network based on axial attention, which captures long-range semantic dependencies while extracting multi-scale features with strong representational capability. Because existing fusion strategies also lose detail, a new fusion strategy is proposed that builds a new attention mechanism by applying entropy features to the aggregation of edge and detail features; ablation experiments on the proposed entropy-attention model demonstrate its effectiveness. Meanwhile, a new loss function is designed to constrain the network. Finally, this study explores image feature interaction and proposes a new cross-domain activation interaction module. Unlike existing interaction schemes, this module uses an activation function to route the pixels that would otherwise be discarded into the other path, thus avoiding information loss; it is combined with a Transformer to improve model performance and generate images containing more information. Validation experiments on a public dataset show that the method has clear advantages over existing fusion methods in both subjective and objective evaluation, and an ablation study further demonstrates its merit.
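The cross-domain activation interaction described above can be illustrated with a minimal sketch. The thesis does not publish code, so the helper below is a hypothetical NumPy reconstruction under one stated assumption: each path keeps the part of its response that a ReLU-style activation passes, and the part the activation would discard (the negative response) is routed into the other path instead of being thrown away.

```python
import numpy as np

def cross_activation_interaction(x, y):
    """Hypothetical sketch of a cross-domain activation interaction.

    x, y: feature maps of the same shape from two paths
    (e.g. infrared and visible). Each path keeps its
    ReLU-activated response; the response ReLU would discard
    is exchanged into the other path rather than dropped.
    """
    x_keep = np.maximum(x, 0.0)   # what ReLU keeps on path x
    y_keep = np.maximum(y, 0.0)   # what ReLU keeps on path y
    x_drop = x - x_keep           # what ReLU would discard on path x
    y_drop = y - y_keep           # what ReLU would discard on path y
    # exchange the discarded responses across paths
    return x_keep + y_drop, y_keep + x_drop

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
y = rng.standard_normal((4, 8, 8))
x_out, y_out = cross_activation_interaction(x, y)
```

Under this reading, `x_out + y_out` equals `x + y` elementwise, so no activation energy is lost, whereas a plain ReLU on each path discards the negative responses entirely. How the thesis combines this module with its Transformer branch is not specified in the abstract.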
Keywords/Search Tags:Image fusion, Infrared and visible light images, Swin-transformer, Feature aggregation, Entropy attention mechanism