Video Denoising Based On Deep Prior Learning

Posted on: 2024-09-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Sun
Full Text: PDF
GTID: 1528307340953729
Subject: Circuits and Systems
Abstract/Summary:
High-quality video is in demand in national defense, industrial production, and audio-visual entertainment. However, due to limitations of camera sensors and shooting environments, captured videos usually contain various types of noise. The noise not only reduces video quality but also significantly lowers the accuracy of subsequent high-level vision tasks such as object detection, classification, and tracking. It is therefore crucial to study effective video denoising methods that improve video quality.

The key to video denoising is how to exploit spatial-temporal information to improve denoising performance. Existing video denoising methods fall into two categories: model-based methods and deep learning-based methods. Model-based methods treat video denoising as an optimization problem: they model the noise and the prior/regularization terms of natural images and obtain an iterative solution. However, because natural images are so diverse, hand-crafted priors cannot accurately characterize their properties, which limits the denoising performance of model-based methods. Deep learning-based methods instead use deep convolutional neural networks to learn the mapping from noisy frames to clean frames from large external datasets. Although they achieve excellent denoising performance, their network structures are designed without theoretical guidance, which results in insufficient interpretability.

To address these problems, this dissertation combines the two kinds of methods and studies video denoising methods that integrate a deep denoising prior. First, domain knowledge is introduced to guide network design, and the correlation between video frames is characterized within the maximum a posteriori (MAP) estimation framework. Second, mixed structured noise is handled to address the
challenge of more complex noise scenarios. Third, to better handle real-world noise, the distribution of real-world noise is learned with a deep network. Finally, considering the computational limitations of mobile devices, a lightweight video denoising method is proposed. In summary, this dissertation improves the quality of video frame restoration by incorporating domain knowledge, accurately learning prior knowledge, and efficiently utilizing spatial-temporal information. The main research contents and innovations are as follows:

1. We propose a maximum a posteriori (MAP) estimation-based deep video denoising method. To the best of our knowledge, this is the first work to introduce the Deep Denoising Prior (DDP) into the video denoising problem. We propose a MAP estimation-based iterative video denoising algorithm and unfold it into a network to construct an effective multi-stage video denoising network. In addition, because of the relative motion between adjacent frames and the current frame to be restored, motion estimation and motion compensation are generally applied to the adjacent frames first. We observe that this alignment is imperfect: aligned adjacent frames contain non-uniform artifacts. To deal with this problem, we develop a kernel prediction network that estimates pixel-wise fusion weights and adaptively integrates temporal correlations. Extensive experimental results demonstrate that the proposed method achieves significantly better denoising performance than other video denoising algorithms. Experiments on video super-resolution and compressed video enhancement further verify the generalization ability of the proposed method.

2. We propose a deep unfolding network-based mixed video noise removal method. Most existing video denoising methods aim at removing simulated additive Gaussian noise, whereas the distribution of real-world noise is non-uniform and more complicated, resulting in most
current video denoising methods being difficult to apply in practical scenarios. To address this issue, this dissertation studies the mixed video noise removal problem. To enhance the interpretability of the model, we first model the components of mixed video noise and develop a robust iterative optimization algorithm, based on the MAP estimation framework, that learns the denoising prior and removes the mixed noise. To enable end-to-end training, we unfold the proposed iterative algorithm into a deep network. Furthermore, we propose a multi-stage progressive fusion strategy to reduce computational complexity. In particular, the network predicts weighted filters to adaptively fuse temporal information. The experimental results show that the proposed method performs significantly better than other video denoising methods, and experiments on video decaptioning also demonstrate its robustness.

3. We propose a non-parametric noise modeling method based on a variance estimation network. Because existing real-world noisy datasets have limitations, such as heavy reliance on post-processing to generate clean images and a lack of outdoor scenes, we construct a real-world noisy benchmark dataset that covers more indoor and outdoor shooting scenarios and different lighting conditions. In addition, since parametric noise models struggle to capture the complex characteristics of real-world noise, we propose a pixel-wise variance estimation method. We first develop a variance estimation method based on non-local block matching and then use the estimated variance maps as labels for training an effective variance estimation network. The experimental results on noise modeling and real-world denoising demonstrate that the simulated noisy data generated by the proposed method accurately models real sensor noise.

4. We propose a multi-scale spatial-temporal memory network-based lightweight video denoising method. Existing deep learning-based denoising
methods design sophisticated network structures to obtain large performance gains, which significantly increases the computational burden. To tackle this problem, we propose a lightweight multi-scale video denoising algorithm. Based on MAP estimation, we first design a variance estimation module and a fusion module at each scale of the pyramid to learn the denoising prior and achieve fast video denoising. Second, to make effective use of global temporal information, we propose a robust memory enhancement module that adaptively searches for useful temporal information. Unlike current time-consuming motion estimation and motion compensation methods, we exploit a patch-based non-local network to reconstruct the clean frame at the current time step. Real-world denoising experiments demonstrate that, compared with existing lightweight video denoising methods, the proposed method achieves better denoising performance with a smaller computational burden.

To summarize, to address the poor interpretability, limited noise types, and high computational complexity of most learning-based video denoising methods, this dissertation proposes a video denoising framework based on maximum a posteriori estimation and effectively solves denoising problems in complex noise environments and real-world noisy scenarios. Finally, a lightweight video denoising method is developed for practical applications. Extensive experimental results demonstrate the effectiveness and robustness of the proposed algorithms.
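
The two recurring ideas in the abstract, pixel-wise fusion of aligned frames and an iterative MAP solver unfolded into a fixed number of stages, can be illustrated with a toy sketch. The abstract does not specify the actual formulation, so everything here is an assumption made for illustration: frames are 1-D lists that are already aligned, the hand-crafted softmax weights in `fuse_frames` stand in for the learned kernel prediction network, the moving average in `box_denoiser` stands in for the learned deep denoising prior, and the names `temperature`, `stages`, `step`, and `lam` are hypothetical.

```python
import math
import random


def fuse_frames(frames, ref_index, temperature=0.5):
    """Fuse aligned frames with pixel-wise weights.

    Assumption: weights are a softmax over the negative squared difference
    to the reference frame, so pixels that disagree strongly with the
    reference (e.g. alignment artifacts) receive almost no weight.
    """
    ref = frames[ref_index]
    fused = []
    for i, ref_value in enumerate(ref):
        scores = [-((frame[i] - ref_value) ** 2) / temperature for frame in frames]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        fused.append(sum(w * frame[i] for w, frame in zip(weights, frames)) / total)
    return fused


def box_denoiser(x, radius=1):
    """Hypothetical stand-in for a learned denoising prior: a moving average."""
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - radius), min(len(x), i + radius + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def unfolded_map_denoise(y, stages=5, step=0.5, lam=1.0):
    """Run a fixed number of stages of an iterative MAP-style update.

    Each stage takes one gradient-like step that pulls the estimate toward
    the observation y (data fidelity) and toward the denoiser output (prior).
    """
    x = list(y)
    for _ in range(stages):
        d = box_denoiser(x)
        x = [xi - step * ((xi - yi) + lam * (xi - di))
             for xi, yi, di in zip(x, y, d)]
    return x


def mse(a, b):
    """Mean squared error between two equally long signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    clean = [math.sin(2 * math.pi * i / 64) for i in range(256)]
    frames = [[c + random.gauss(0.0, 0.2) for c in clean] for _ in range(3)]
    fused = fuse_frames(frames, ref_index=1)
    restored = unfolded_map_denoise(fused)
    print("noisy reference MSE:", mse(frames[1], clean))
    print("after fusion MSE:   ", mse(fused, clean))
    print("after unfolding MSE:", mse(restored, clean))
```

In a learned version of this scheme, each stage would presumably carry its own trainable denoiser and step size, which is what makes end-to-end training of the unfolded network possible; the fixed filter above only shows the control flow.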
Keywords/Search Tags: video denoising, deep learning, denoising prior, maximum a posteriori estimation, mixed video noise removal, noise modeling, lightweight denoising network