Image super-resolution (SR), which aims to restore a high-resolution (HR) image from its low-resolution (LR) counterpart, is a pivotal research topic in image processing and computer vision. It has a wide range of applications in medical treatment, security, daily life, and other fields. Over the past decade, with the rapid development of deep learning, the performance of image SR methods has improved significantly. As a result, deep learning-based approaches have come to receive the most attention in the field. However, existing methods encounter challenges when applied to real-world SR tasks: large-scale paired datasets are difficult to obtain directly, image degradations are unknown and variable, and the perceptual quality of reconstructed images needs improvement. In deep learning-based image SR methods, the dataset serves as the foundation, the network structure acts as the core, and the loss function serves as the driver. To tackle these challenges, this thesis studies real-world SR methods from these three perspectives. Through the introduction of dynamic filtering, the design of implicit degradation estimation, and the use of multi-layer features, this thesis presents a more practical paired dataset generation method, a network structure that adapts better to unknown and variable degradations, and a loss function that quantifies perceptual differences more accurately. The specific research content and contributions are as follows:

(1) A large-scale paired dataset generation method based on dynamic degradation learning is proposed. To cope with the difficulty of acquiring paired datasets directly for real-world image SR, this thesis explores deep learning-based methods for generating large-scale paired datasets. To fully leverage the information in real low-resolution images for degradation learning, the input of the down-sampling network is modified by incorporating
unpaired low- and high-resolution images, rather than using only high-resolution images as input. Consequently, a dual-path network structure is designed. Furthermore, to capture more real degradation information, dynamic filtering is introduced to merge the information from the dual outputs. Compared with existing learning-based paired dataset generation methods, this method can introduce image-specific and position-specific degradation information during paired data generation in a single step, effectively reducing the domain (distribution) gap between the generated images and real images. Experiments conducted on the simulated dataset DF2K and the real dataset DPED show that this method improves the reconstruction performance of commonly used image SR networks, with an improvement of at least 5.7‰ in the average Learned Perceptual Image Patch Similarity (LPIPS) metric. This method approaches real-world image SR from the dataset side by generating more practical paired datasets, and it offers valuable insights into the challenge of obtaining large-scale paired datasets in real-world scenarios.

(2) A conditional SR network architecture that incorporates implicit degradation information is proposed. Existing end-to-end image SR networks struggle to handle unknown and variable image degradations efficiently. To address this, and inspired by previous conditional super-resolution networks, this thesis proposes a novel SR network with implicit degradation estimation. Instead of explicitly estimating degradation information with an additional degradation estimation network, the proposed network obtains an implicit estimate through an internal conditional branch. This implicit degradation information is then introduced into the backbone network using an affine transformation. Specifically, a conditional version of ESRGAN (C-ESRGAN) is designed by taking the commonly used
end-to-end network ESRGAN as a basis. Experiments conducted on multiple datasets confirm that the proposed method makes SR networks more adaptable to unknown and variable degradations and improves reconstruction performance in real-world scenarios, with an improvement of at least 6.5‰ in the average LPIPS. By focusing on the network structure, this method offers a novel approach to dealing with the unknown and variable degradations in real-world image SR.

(3) A perceptual loss function that combines features from multiple levels is proposed. Commonly used loss functions have difficulty achieving high perceptual quality in the reconstructed image. In response, this thesis proposes a new perceptual loss function. This approach extends the single-layer image features used to construct the perceptual loss to multiple layers, and uses normalization, weighted summation, and other techniques to fuse the features from these layers, which provides richer perceptual supervision for SR networks. Experiments conducted on several benchmark test sets with different SR networks demonstrate that the proposed method not only enhances texture details in the reconstructed images but also effectively reduces grid-like artifacts. The reconstructed images therefore achieve higher perceptual quality while maintaining relatively high quantitative scores. Specifically, the average Peak Signal-to-Noise Ratio (PSNR) improved by at least 1.5% compared with the perceptual loss function based on single-layer image features. Building on the idea of changing the space in which distances are measured, this method investigates how to select and fuse the elements of that metric space. It presents a novel research perspective for designing loss functions that measure perceptual differences more accurately.

The proposed methods for generating large-scale paired datasets and for designing a conditional SR network and a perceptual loss function enhance
the performance of real-world image SR comprehensively. This underscores the importance of exploring SR methods from diverse perspectives and offers useful insights for the modular analysis of deep learning-based image SR methods.
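
As a rough illustration of contribution (3), the sketch below fuses feature distances from several layers by normalizing each feature vector and taking a weighted sum of the per-layer distances. It is not the implementation used in this thesis: the function names, the unit-length normalization, the squared-difference distance, the layer weights, and the random stand-in features are all assumptions made for illustration. In practice the features would be extracted from a pretrained network.

```python
import math
import random


def unit_normalize(vec, eps=1e-10):
    """Scale one channel vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def layer_distance(feat_a, feat_b):
    """Mean squared difference between the normalized features of one layer.

    feat_a and feat_b are lists of channel vectors, one per spatial position.
    """
    total, count = 0.0, 0
    for vec_a, vec_b in zip(feat_a, feat_b):
        norm_a, norm_b = unit_normalize(vec_a), unit_normalize(vec_b)
        total += sum((x - y) ** 2 for x, y in zip(norm_a, norm_b))
        count += len(norm_a)
    return total / count


def multi_layer_perceptual_loss(sr_feats, hr_feats, weights):
    """Weighted sum of per-layer distances (weights are hypothetical)."""
    if not (len(sr_feats) == len(hr_feats) == len(weights)):
        raise ValueError("need one feature pair and one weight per layer")
    total = 0.0
    for f_sr, f_hr, w in zip(sr_feats, hr_feats, weights):
        total += w * layer_distance(f_sr, f_hr)
    return total


def make_layer(rng, positions, channels):
    """Random stand-in for one layer's feature map (illustration only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(channels)]
            for _ in range(positions)]


rng = random.Random(0)
# Three hypothetical layers: (spatial positions, channels).
layer_shapes = [(16, 8), (4, 16), (1, 32)]
hr_feats = [make_layer(rng, p, c) for p, c in layer_shapes]
# "Reconstructed" features: the reference features plus small noise.
sr_feats = [[[v + 0.1 * rng.gauss(0.0, 1.0) for v in vec] for vec in layer]
            for layer in hr_feats]
weights = [0.2, 0.3, 0.5]

loss = multi_layer_perceptual_loss(sr_feats, hr_feats, weights)
identical = multi_layer_perceptual_loss(hr_feats, hr_feats, weights)
```

Normalizing before comparison keeps layers with large activations from dominating the sum, so the weights alone control how much each layer contributes. Setting all weights but one to zero recovers a single-layer perceptual loss.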
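
The affine transformation mentioned in contribution (2) can be pictured with the minimal sketch below, in which a condition vector is mapped to a per-channel scale and shift that modulate backbone features. This is an assumption-laden illustration, not the C-ESRGAN architecture: the linear mappings, their weights, the two-channel feature map, and all names are hypothetical.

```python
def linear(weights, bias, vec):
    """Plain linear layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def affine_modulate(features, scale, shift):
    """Apply a per-channel affine transform: out = scale * x + shift.

    features is a list of channels, each a flat list of values.
    """
    return [[s * x + t for x in channel]
            for channel, s, t in zip(features, scale, shift)]


# Hypothetical condition vector, standing in for the output of a
# conditional branch that implicitly encodes the degradation.
condition = [0.5, -1.0]

# Hypothetical weights mapping the condition to per-channel parameters.
scale_w, scale_b = [[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0]
shift_w, shift_b = [[0.0, 0.0], [2.0, 0.0]], [0.0, 0.0]

scale = linear(scale_w, scale_b, condition)
shift = linear(shift_w, shift_b, condition)

# Two-channel backbone feature map, flattened per channel.
features = [[1.0, 2.0], [3.0, 4.0]]
modulated = affine_modulate(features, scale, shift)
```

Because the scale and shift depend on the condition vector, the same backbone weights can respond differently to differently degraded inputs; a scale of 1 and a shift of 0 leave the features unchanged.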