Multi-modal image fusion has been widely studied because it synthesizes complementary information and removes redundant information across modalities, producing a fused image that carries more information, offers better visual quality, and better supports downstream vision tasks. Infrared and visible image fusion is an important research direction within multi-modal image fusion. Infrared imaging is based on thermal radiation, so heat-emitting targets appear in infrared images as regions of high grayscale value that are easy to observe and identify. Moreover, the quality of infrared images is only minimally affected by environmental factors such as rain, fog, and lighting. However, owing to sensor limitations, most infrared images suffer from low contrast and poor texture. By contrast, visible images contain abundant detail but are susceptible to environmental interference and may lose important target information in the scene. Fusing the valuable information of these two modalities therefore describes the image scene more comprehensively and accurately.

This thesis studies infrared and visible image fusion, addressing three problems: insufficient representation of high-frequency details in image features, naïve feature fusion strategies, and the poor performance of most current fusion algorithms in harsh environments. The main contributions are as follows:

(1) To drive the model to learn and represent textural details at different scales more comprehensively, classical multi-scale wavelet decomposition is combined with auto-encoder-based feature extraction, so that deep coarse-scale high-frequency structural information is explicitly extracted from low-frequency features (a sketch of this idea follows the abstract). Ablation studies validate the proposed multi-scale wavelet neural network model in both subjective and objective evaluations, and comparative experiments show that the proposed fusion algorithm outperforms most state-of-the-art fusion algorithms, especially in terms of human visual perception.

(2) A new feature fusion strategy, SFMD, is proposed for infrared and visible image fusion from the perspective of signal decomposition. Based on structural patch decomposition (SPD) theory, feature map patches are decomposed into three independent components: mean intensity, signal structure, and signal strength. Considering the distinct characteristics of infrared and visible images, a fusion rule is carefully designed for each component, and the final fused feature patches are recovered by the inverse SPD transform (see the second sketch below). Experiments demonstrate that the proposed SFMD strategy outperforms current feature fusion strategies and generalizes well.

(3) A multi-scale neural network, EFMN, is proposed for joint low-light image enhancement and image fusion. EFMN consists of two main parts, a low-light image enhancement network and a multi-scale feature fusion module, so that enhancement and fusion are performed jointly at the feature level. In addition, its processing paradigm of coarse-scale feature fusion followed by fine-scale refinement (see the third sketch below) and the design of its loss function further improve the visual quality and color fidelity of the fused images. Extensive experiments demonstrate that EFMN outperforms both most existing fusion algorithms and the alternative of first enhancing the low-light visible image and then fusing it with the infrared image. Finally, the thesis discusses the performance gains that EFMN's fused images bring to downstream vision tasks such as object detection.
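As an illustration of contribution (1), the following is a minimal sketch, not the thesis's implementation, of applying a classical 2-D wavelet decomposition channel-wise to a deep encoder feature map so that high-frequency structure is separated explicitly from the low-frequency content. The Haar basis, the feature shape, and the function name wavelet_split are assumptions for illustration; the sketch uses the PyWavelets package.

    # Minimal sketch: channel-wise 2-D DWT on a deep feature map (assumed shapes).
    import numpy as np
    import pywt

    def wavelet_split(features: np.ndarray, wavelet: str = "haar"):
        """Split a (C, H, W) feature map into low- and high-frequency subbands."""
        lows, highs = [], []
        for ch in features:                       # channel-wise 2-D DWT
            cA, (cH, cV, cD) = pywt.dwt2(ch, wavelet)
            lows.append(cA)                       # low-frequency approximation
            highs.append(np.stack([cH, cV, cD]))  # horizontal/vertical/diagonal detail
        return np.stack(lows), np.stack(highs)

    # Example with a hypothetical 64-channel encoder feature map.
    feat = np.random.randn(64, 128, 128).astype(np.float32)
    low, high = wavelet_split(feat)
    print(low.shape, high.shape)  # (64, 64, 64) (64, 3, 64, 64)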
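As an illustration of contribution (2), the sketch below shows the structural patch decomposition and its inverse on a single pair of patches. The specific rules used here (maximum signal strength, strength-weighted structure, averaged mean intensity) are placeholder assumptions for illustration only; the thesis designs its own rule for each component.

    # Minimal sketch: SPD-style decomposition, fusion, and inverse transform.
    import numpy as np

    def spd_decompose(patch: np.ndarray):
        """Decompose a patch into mean intensity, signal strength and structure."""
        mean = patch.mean()
        residual = patch - mean
        strength = np.linalg.norm(residual) + 1e-8  # signal strength
        structure = residual / strength             # unit-norm signal structure
        return mean, strength, structure

    def spd_fuse(patch_ir: np.ndarray, patch_vis: np.ndarray) -> np.ndarray:
        m_i, c_i, s_i = spd_decompose(patch_ir)
        m_v, c_v, s_v = spd_decompose(patch_vis)
        c_f = max(c_i, c_v)                         # keep the stronger contrast
        s_f = c_i * s_i + c_v * s_v
        s_f /= np.linalg.norm(s_f) + 1e-8           # strength-weighted structure
        m_f = 0.5 * (m_i + m_v)                     # simple intensity average
        return c_f * s_f + m_f                      # inverse SPD transform

    # Example on hypothetical 8x8 feature patches.
    fused = spd_fuse(np.random.rand(8, 8), np.random.rand(8, 8))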
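As an illustration of contribution (3), the toy PyTorch module below captures only the coarse-to-fine paradigm: fuse the two modalities at the coarse scale first, then upsample and refine with fine-scale features. The layer choices, channel width, and the name CoarseToFineFusion are assumptions; EFMN's actual enhancement network and fusion module are specified in the thesis.

    # Toy sketch of "coarse-scale fusion first, then fine-scale refinement".
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CoarseToFineFusion(nn.Module):
        def __init__(self, channels: int = 64):
            super().__init__()
            self.fuse_coarse = nn.Conv2d(2 * channels, channels, 3, padding=1)
            self.refine_fine = nn.Conv2d(3 * channels, channels, 3, padding=1)

        def forward(self, ir_feats, vis_feats):
            """ir_feats/vis_feats: [fine, coarse] feature maps per modality."""
            ir_fine, ir_coarse = ir_feats
            vis_fine, vis_coarse = vis_feats
            # 1) fuse the two modalities at the coarse scale first
            coarse = self.fuse_coarse(torch.cat([ir_coarse, vis_coarse], dim=1))
            # 2) upsample and fine-tune with fine-scale details
            up = F.interpolate(coarse, size=ir_fine.shape[-2:], mode="bilinear",
                               align_corners=False)
            return self.refine_fine(torch.cat([up, ir_fine, vis_fine], dim=1))

    # Example with hypothetical feature sizes.
    m = CoarseToFineFusion()
    fine, coarse = torch.randn(1, 64, 128, 128), torch.randn(1, 64, 64, 64)
    out = m([fine, coarse], [fine.clone(), coarse.clone()])
    print(out.shape)  # torch.Size([1, 64, 128, 128])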