| Single image super-resolution (SISR), a crucial technology in low-level computer vision, aims to reconstruct a high-resolution image with rich texture details from a low-resolution image. It has broad application prospects and significant research value in fields such as remote sensing, medical imaging, video reconstruction, and object detection. SISR is essentially an ill-posed problem, because a given low-resolution image can be the degraded result of countless different high-resolution images. Therefore, how to reasonably constrain the super-resolution (SR) mapping from low-resolution to high-resolution images has become a core research problem in the field. In recent years, with the rapid development of data-driven deep learning and its widespread application to image reconstruction tasks, SR algorithms have achieved remarkable performance on multiple public benchmarks. However, the practical application of these algorithms still faces several unsolved problems. On the one hand, the performance of an SR algorithm is usually tied to its model size. Most mainstream SR algorithms explore deeper and wider neural network structures to obtain better SR performance while neglecting the need for lightweight models. This limits their application in resource-constrained scenarios such as edge devices. On the other hand, most current SR algorithms are trained and tested on ideal datasets generated by bicubic down-sampling. Although the evaluation metrics on these benchmarks reflect the reconstruction performance of different SR algorithms relatively objectively and accurately, algorithms trained on datasets with a single, specific degradation mode suffer a significant decline in reconstruction performance when the down-sampling method changes. Moreover, the degradation mode of images in real-world
scenarios is complex and difficult to simulate, which makes it difficult for SR algorithms trained on ideal datasets to handle SR in real-world scenarios. This thesis focuses on the pain points and difficulties of SR tasks in real-world scenarios and conducts research from two aspects: mining self-similarity information and estimating degradation mode information from low-resolution images. Its main work and contributions are as follows. (1) To balance SR performance against model size, this thesis proposes a lightweight wavelet-based Transformer for image SR. First, to address the high computational burden of the standard Transformer in pixel-level vision tasks, this thesis proposes a lightweight Transformer backbone network, which includes an efficient Transformer encoder and a global similarity matching module designed for lightweight use. Its core idea is to efficiently mine and exploit local and global self-similarity information in the low-resolution image using the modeling mechanism of the Transformer, alleviating the ill-posedness of image SR and achieving a better balance between parameter count and model performance. Second, to ensure that the Transformer-based model can learn a suitable inductive bias from limited data, this thesis proposes using the stationary wavelet transform to characterize image features. To account for the structural features and channel redundancy of the wavelet coefficients, an asymmetric convolution-based wavelet coefficient enhancement backbone network is proposed. Comparisons with other advanced algorithms in terms of objective metrics, subjective visual quality, and parameter count verify the effectiveness of the proposed SR network based on the Transformer and the wavelet transform. (2) This thesis proposes an iterative collaborative modeling framework for super-resolution and degradation pattern estimation aimed at real-world scene images. It
further extends SR algorithms built on ideal bicubic down-sampling datasets to SR of real-world scene images with complex degradation patterns. The framework mainly comprises an image super-resolution backbone network, an image degradation pattern estimation network based on contrastive learning, and an information fusion module based on attention mechanisms. The degradation pattern estimation network uses contrastive learning to obtain high-dimensional degradation features from the low-resolution image in a self-supervised manner, while the information fusion module integrates these high-dimensional degradation features with the features of the SR backbone network. The framework mainly addresses the difficulty of obtaining image degradation features and their inefficient utilization, adaptively learns the correlation between image features and the degradation pattern, and improves the performance of the relevant algorithms on SR of real-world scenes. Meanwhile, to objectively evaluate the SR performance of the framework on real-world scene images, this thesis constructs an SR dataset of real-world remote sensing images and conducts a detailed and fair comparative evaluation of the proposed framework on this dataset, verifying the effectiveness of the iterative collaborative modeling framework for super-resolution reconstruction and degradation pattern estimation on real-world scene images. |
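
The ill-posedness described in the abstract can be illustrated with a minimal sketch. It assumes 2x2 average pooling as a simple stand-in for the degradation (the benchmarks discussed in the abstract use bicubic down-sampling instead), and the function name `downsample_2x` and all pixel values are hypothetical. Two clearly different high-resolution images collapse to the same low-resolution image, so the low-resolution input alone cannot determine which one to reconstruct.

```python
def downsample_2x(image):
    """Average each non-overlapping 2x2 block of a 2D list.

    Assumption: average pooling is used here only as a simple stand-in
    for the bicubic down-sampling mentioned in the abstract.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Two different hypothetical 4x4 "high-resolution" images.
hr_flat = [
    [4, 4, 8, 8],
    [4, 4, 8, 8],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
hr_textured = [
    [0, 8, 16, 0],
    [8, 0, 0, 16],
    [4, 0, 12, 0],
    [0, 4, 0, 12],
]

lr_flat = downsample_2x(hr_flat)
lr_textured = downsample_2x(hr_textured)

# Both inputs yield the same 2x2 low-resolution image.
print(lr_flat == lr_textured)
```

Any constraint an SR method adds, such as self-similarity priors or an estimate of the degradation, serves to choose among the many high-resolution candidates that are consistent with the same low-resolution input.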