Deep Learning Methods For Multi-frame Quality Enhancement On Compressed Video

Posted on: 2024-06-17 | Degree: Master | Type: Thesis
Country: China | Candidate: D Y Luo | Full Text: PDF
GTID: 2558307079476314 | Subject: Electronic information
Abstract/Summary:
With the rapid development of the Internet, digital video has become one of the most important multimedia carriers through which people acquire information and perceive the world. Uncompressed video, however, produces a huge amount of data, so transmitting video over bandwidth-limited networks requires video coding algorithms to compress it. Because of lossy quantization and block-based encoding, compression inevitably introduces artifacts, which seriously degrade the viewing experience. Compression artifacts also reduce the accuracy of downstream vision tasks such as object detection and motion recognition. It is therefore necessary to study quality enhancement for compressed video.

Against this background, this thesis uses deep learning to study compressed video quality enhancement from two perspectives: alignment-free and alignment-based. Three methods are proposed: an alignment-free quality enhancement method based on multi-scale features and an attention mechanism; a quality enhancement algorithm based on 3D convolution and 2D deformable convolution (DCN) alignment; and a quality enhancement algorithm based on multi-path DCN alignment. Without modifying the internals of the video coding software, these methods improve both the objective quality and the subjective (perceived) quality of compressed video. The main research results are as follows:

1. To address the high computational cost and large parameter count of alignment-free methods based on long short-term memory (LSTM) networks, this thesis proposes a new alignment-free compressed video quality enhancement algorithm. First, multi-scale feature extraction and an early fusion strategy are used to extract spatial information from video frames and integrate temporal information across the sequence. Then, a two-stream enhancement module based on the attention mechanism further strengthens the network's ability to extract spatial information and gather useful temporal information from the fused features. Compared with LSTM-based methods, the proposed algorithm requires less computation and fewer parameters while enhancing the quality of compressed video more effectively.

2. To address the difficulty of predicting accurate alignment offsets with 2D DCN alone, this thesis proposes a compressed video quality enhancement algorithm based on 3D convolution and 2D DCN alignment. First, several 3D convolution layers coarsely fuse the spatio-temporal information in the video sequence. A multi-level residual fusion module then generates fine fused features, both global and local, from different levels for predicting the deformable offsets. Finally, the aligned feature map is passed through the reconstruction module to obtain the enhancement residual, which is added pixel by pixel to the original reference frame to produce the reconstructed frame. This algorithm outperforms alignment based on 2D DCN alone, both subjectively and objectively.

3. To address the difficulty of training deformable convolution kernels and the inaccurate recovery of spatial details, this thesis proposes a compressed video quality enhancement algorithm that focuses on spatio-temporal detail recovery. First, a multi-path deformable alignment module generates more accurate offsets and thereby recovers more accurate temporal details. Second, a residual dense connection block with an attention mechanism recovers more valuable spatial details. In addition, a compensation loss is developed to counter the over-smoothing of restored spatial details that results from training with a pixel-level loss alone. This algorithm not only increases the objective quality of compressed video at quantization parameter (QP) 37 by 0.98 dB on average, but also effectively reduces various compression artifacts in the distorted video.
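
The pixel-by-pixel residual addition and the dB gain described in the abstract can be illustrated with a minimal sketch. It assumes that objective quality is measured as PSNR against the uncompressed frame, which the abstract does not state explicitly. The function names `psnr` and `add_residual` are hypothetical, and the residual values are toy data standing in for a network's output, not results from the thesis.

```python
import math


def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def add_residual(compressed, residual, peak=255.0):
    """Add a predicted residual to the compressed frame pixel by pixel,
    clipping each result to the valid range [0, peak]."""
    return [
        [min(peak, max(0.0, c + d)) for c, d in zip(c_row, d_row)]
        for c_row, d_row in zip(compressed, residual)
    ]


# Toy 2x2 luma frames (assumed values, for illustration only).
raw = [[100, 102], [98, 101]]        # uncompressed ground truth
compressed = [[96, 105], [99, 97]]   # decoded frame with compression error
residual = [[3, -2], [-1, 3]]        # hypothetical enhancement residual

enhanced = add_residual(compressed, residual)

# Quality gain of the enhanced frame over the compressed frame, in dB.
delta_psnr = psnr(raw, enhanced) - psnr(raw, compressed)
print(f"PSNR gain: {delta_psnr:.2f} dB")
```

In this sketch the gain is reported for a single frame; an average such as the one quoted in the abstract would be taken over all frames of the test sequences.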
Keywords/Search Tags: Deep learning, video coding, video quality enhancement, convolutional neural network, deformable convolution, attention mechanism