Research On Image And Video Super-resolution

Posted on: 2022-12-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Zhang
Full Text: PDF
GTID: 1488306764960099
Subject: Computer Science and Technology
Abstract/Summary:
Image and video super-resolution (SR) is an important problem in computer vision: it recovers the resolution lost when the original data is degraded by long-distance transmission and compression. By applying SR reconstruction to low-resolution (LR) data, high-resolution (HR) content can be obtained and the visual quality significantly improved. In the multimedia era, images and videos are the most frequently encountered data. To improve the visual quality of degraded data, industry and academia are committed to developing new SR algorithms that take advantage of big data. Thanks to the rapid development of deep learning, deep models built on convolutional neural networks (CNNs) are widely used in the latest SR algorithms. Compared with traditional algorithms based on hand-crafted features, CNN-based SR models avoid the complex parameter-tuning process, recover better details simply by feeding data into the model, and have strong adaptability. Although CNNs have brought significant benefits to SR algorithms, deep learning models still suffer from poor dynamic reasoning ability, high algorithmic complexity, and unclear attention mechanisms, so improving both the running efficiency and the reconstruction quality of SR models remains a great challenge. Building on deep learning SR models and targeting the problems faced by existing algorithms, this dissertation conducts in-depth research on the theories and methods of SR models from four perspectives: attention mechanisms in CNNs, lightweight model design, model adaptability in real-world scenarios, and the application of 3-dimensional convolution. The main contributions are summarized as follows:

(1) From the perspective of attention mechanisms in CNNs, this dissertation proposes a novel kernel attention module (KAM) for single image super-resolution (SISR). This module enables the network to adjust its receptive field size according to the scale of the input by dynamically selecting the appropriate kernel. Based on this, multiple kernel attention modules are stacked with group and residual connections to constitute a novel architecture for SISR, which enables the network to learn more discriminative representations by filtering the information under different receptive fields. The proposed network is thus more sensitive to multi-scale features, which enables a single network to handle multi-scale SR tasks by pre-defining the upscaling modules. In addition, other attention mechanisms in SR, namely channel attention (CA) and spatial attention (SA), are also investigated and illustrated in detail in this dissertation. Extensive benchmark evaluation shows that the proposed method outperforms other state-of-the-art methods, leading to a new understanding of architectural design for the SR task.

(2) From the perspective of lightweight models, this dissertation proposes a novel lightweight SR model termed the progressive feature fusion network (PFFN). Specifically, to fully exploit the feature maps, a novel progressive attention block (PAB) is proposed as the main building block of PFFN. The proposed PAB adopts several parallel but connected paths with pixel attention, which significantly increases the receptive field of each layer, distills useful information, and ultimately learns more discriminative feature representations. In addition, this dissertation constructs a concise and effective upsampling module based on multi-scale pixel attention, named MPAU. Together, these modules ensure that the network benefits from the attention mechanism while remaining lightweight. Furthermore, a novel training strategy following the cosine annealing learning scheme is proposed to maximize the representation ability of the model, which improves the quality of the restored image without changing the model structure.

(3) From the perspective of application in real-world scenarios, an SR model for remote sensing images is built. Adapting SR models to real-world scenarios remains difficult at present. Targeting remote sensing SR, this dissertation proposes an innovative mixed high-order attention network (MHAN). The proposed model comprises two components: a shallow network for feature extraction and a deep network with a high-order attention mechanism for detail restoration. In the shallow network, element-wise addition is replaced by weighted channel-wise concatenation in all skip connections, which greatly facilitates the information flow. In the deep network, rather than exploring first-order statistics (spatial or channel attention), this dissertation introduces the high-order attention (HOA) module to restore the missing details. Finally, to fully exploit hierarchical features, this dissertation introduces a frequency-aware connection to bridge the shallow and deep networks. Experiments demonstrate that the proposed MHAN outperforms existing methods.

(4) From the perspective of 3-dimensional (3D) convolution, an efficient video SR model is studied. Unlike previous methods that rely on optical flow for motion compensation, this dissertation provides an efficient 3D convolution block (E3DB), based on the principle of convolution decomposition, to capture temporal relations. After decomposition, three groups of one-dimensional (1D) convolutions replace the traditional 3D convolution, allowing the model to fully utilize the spatiotemporal information of the image sequence while keeping the computing load small. In addition, this dissertation proposes a novel dynamic multiple branch network (DMBN), which adopts a novel dynamic reconstruction strategy (DRS) in the feature fusion stage so that the network adaptively fuses the optimal temporal-dependence information from each branch. Compared with simple feature addition or concatenation, the proposed DRS can greatly improve the performance of video SR.
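
The training strategy in contribution (2) is described only as following the cosine annealing learning scheme; the abstract gives no formula or hyperparameters. As a minimal sketch, the widely used cosine annealing schedule decays the learning rate from a maximum to a minimum over a fixed number of steps. The function name and the values of `lr_max`, `lr_min`, and `total_steps` below are illustrative assumptions, not the dissertation's settings.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=2e-4, lr_min=1e-7):
    """Standard cosine annealing: lr_max at step 0, lr_min at total_steps.

    The default values are illustrative assumptions, not taken from the source.
    """
    # Clamp training progress to [0, 1] so out-of-range steps stay bounded.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_annealing_lr(step, total_steps=1000))
```

The schedule changes only the optimizer's step size over time, which is consistent with the claim that restoration quality improves without any change to the model structure.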
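
Contribution (4) replaces a full 3D convolution with three groups of 1D convolutions. A rough weight count illustrates why this keeps the computing load small. The sketch below assumes a cubic kernel of size `k`, one 1D convolution along each of the time, height, and width axes, no bias terms, and no channel grouping. The actual E3DB layout is not specified in this abstract, so the function names and the layer arrangement are hypothetical.

```python
def conv3d_weights(c_in, c_out, k):
    """Weights in one full k x k x k 3D convolution (no bias)."""
    return c_in * c_out * k ** 3


def decomposed_weights(c_in, c_out, k):
    """Weights in three stacked 1D convolutions (k x 1 x 1, 1 x k x 1, 1 x 1 x k).

    Assumed layout: the first layer maps c_in -> c_out channels and the
    remaining two keep c_out channels. This is an illustrative arrangement,
    not the dissertation's confirmed design.
    """
    return k * c_in * c_out + 2 * k * c_out * c_out


if __name__ == "__main__":
    full = conv3d_weights(64, 64, 3)
    split = decomposed_weights(64, 64, 3)
    print(full, split, full / split)
```

With 64 input and output channels and `k = 3`, the full kernel has 110,592 weights and the decomposed version 36,864, a threefold reduction under these assumptions. In general the ratio is `k**2 / 3` when the channel widths match, so the saving grows with kernel size.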
Keywords/Search Tags:super-resolution, attention mechanism, convolutional neural network, feature representation