
Research On Temporal Wavelet-Based Video Coding Methods

Posted on: 2024-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: C H Dong
Full Text: PDF
GTID: 2568306932955839
Subject: Information and Communication Engineering
Abstract/Summary:
Compared with text and images, video conveys information more effectively. However, raw video data is so large that storing and transmitting it directly is impractical, so video must be compressed. Video coding reduces the amount of video data and thereby eases the burden on storage and transmission. To raise the compression ratio, quantization discards part of the information in the video, which introduces distortion into the reconstructed video. The goal of video coding is therefore to lower the bit rate as far as possible while keeping the distortion of the reconstructed video to a minimum, that is, to improve rate-distortion (RD) performance.

The most common form is fixed-bit-rate video coding, in which a video is compressed into a single bit stream and the decoder reconstructs the video from the whole stream. Typical applications include video on demand. The other form is scalable video coding, in which the portion of the bit stream received by the decoder depends on the network bandwidth. When bandwidth is limited, only part of the bit stream is received to reconstruct the video. When bandwidth is sufficient, the whole stream is received and the reconstructed video has less distortion. Scalability matters greatly in streaming applications such as live broadcasting, so improving the RD performance of scalable video coding is also important, but the additional scalability constraints make the research more difficult.

The wavelet transform is a powerful signal processing tool that many image coding methods, such as JPEG2000, use to improve RD performance. To handle video, it can be extended to the temporal wavelet transform, which retains the prediction and update operations of the wavelet transform along with its frequency decomposition and multi-resolution analysis properties, and therefore also plays an important role in video coding. In this thesis, we first propose a temporal wavelet-based, low-complexity, perceptual quality-oriented post-processing method for fixed-bit-rate video coding. Then, for the more complex case of scalable video coding, we propose a temporal wavelet-based learnable scalable video coding method that improves RD performance by combining deep learning with traditional modules. The main work and contributions of this thesis are as follows:

(1) For fixed-bit-rate video coding, we propose a low-complexity, perceptual quality-oriented post-processing method based on the temporal wavelet transform that improves RD performance. We perform temporal frequency analysis on the video to obtain temporal high-frequency frames and temporal low-frequency frames. Because the temporal low-frequency frames carry the main content of the video, enhancing only these frames can improve the perceptual quality of the video while reducing computational complexity. Based on this idea, the proposed post-processing method consists mainly of a temporal wavelet transform module for temporal frequency analysis and synthesis, a motion estimation module for motion alignment, and a neural network-based enhancement module for enhancing the temporal low-frequency frames. The method significantly improves the perceptual quality of compressed video and the RD performance of video coding. In addition, it has lower computational complexity than other perceptual quality-oriented, neural network-based post-processing methods for compressed video.

(2) To improve the RD performance of scalable video coding, we propose a temporal wavelet-based learnable scalable video coding method that combines deep learning with traditional modules. Traditional temporal wavelet-based scalable video coding usually contains multiple modules, such as the forward and inverse temporal wavelet transforms and wavelet subband coding, but it has two problems. First, the temporal wavelet subband coding module is inefficient for scattered non-zero coefficients. Second, the inverse temporal wavelet transform does not account for quantization distortion. For the first problem, this thesis builds a learnable temporal wavelet subband coding module with a convolutional neural network. A context model constructed by the neural network learns the correlations within and between wavelet subbands and between bit planes, and probabilities are estimated with a Gaussian mixture model, which enables efficient coding. For the second problem, this thesis constructs a learnable inverse temporal wavelet transform with a neural network that is trained in a data-driven manner and can therefore take quantization distortion into account. By combining these two modules with the traditional temporal wavelet transform module, the proposed temporal wavelet-based learnable scalable video coding method achieves high RD performance.
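To make the prediction and update operations of the temporal wavelet transform concrete, here is a minimal sketch of a one-level temporal Haar lifting transform in Python with NumPy. It is an illustration only, not the thesis's implementation: the Haar filter, the absence of motion alignment, and the function names `temporal_haar_forward` and `temporal_haar_inverse` are all assumptions made for this example.

```python
import numpy as np


def temporal_haar_forward(frames):
    """One level of a temporal Haar lifting transform (no motion alignment).

    frames: array of shape (T, H, W) with an even number of frames T.
    Returns (low, high), each of shape (T // 2, H, W).
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.shape[0] % 2 != 0:
        raise ValueError("the number of frames must be even")
    even, odd = frames[0::2], frames[1::2]
    # Prediction step: each odd frame is predicted from its even neighbour;
    # the residual forms the temporal high-frequency frames.
    high = odd - even
    # Update step: the even frames are updated with half the residual,
    # giving the temporal low-frequency frames (the pairwise averages).
    low = even + 0.5 * high
    return low, high


def temporal_haar_inverse(low, high):
    """Invert temporal_haar_forward by undoing the lifting steps in reverse."""
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    even = low - 0.5 * high  # undo the update step
    odd = high + even        # undo the prediction step
    frames = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=np.float64)
    frames[0::2] = even
    frames[1::2] = odd
    return frames


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((8, 4, 4))  # 8 synthetic frames of 4x4 pixels
    low, high = temporal_haar_forward(video)
    restored = temporal_haar_inverse(low, high)
    print("perfect reconstruction:", np.allclose(restored, video))
```

Without quantization the lifting steps invert exactly, so the frames are recovered. In a codec the subbands are quantized before the inverse transform, and that is the distortion the learnable inverse transform described in (2) is meant to account for.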
Keywords/Search Tags: Video Coding, Temporal Wavelet Transform, Convolutional Neural Network, Scalable Video Coding, Post-Processing