Font Size: a A A

Deep Learning-Based Texture Analysis And Synthesis For Video Coding

Posted on:2022-09-26Degree:DoctorType:Dissertation
Country:ChinaCandidate:K YangFull Text:PDF
GTID:1488306323462824Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
With the increasing progress of electronic technology,high-quality multimedia equipments have gradually stepped into our daily life,high-definition and ultra-high-definition digital video contents have gradually become popular.However,the emer-gence of mass video has presented tremendous challenges to the storage and transmis-sion of the videos.As the state-of-the-art of video coding standards,the High Efficiency Video Coding(HEVC)adopts a large number of novel coding technologies.As the re-sult,it achieves about 50%bitrate savings at the same subjective quality compared with the last generation of the video coding standard(H.264/AVC).Even so,there still exists a huge gap between the video compression rate and the demand for video compression.Therefore,how to design a more efficient video coding method is a serious challenge in the video coding field.The traditional coding method adopts the hybrid coding framework and divides the image into tree-shaped structural units using flexible block partition techniques.Based on the assumption that the signal is stationary within the unit,it selects the best coding mode from multiple coding modes at the expense of complexity.This method is still based on Shannon's information theory and has been continuously fine-tuned from the perspective of signal processing for decades.However,the statistical characteristics of natural video are very complex.It often exhibits non-stationary signal characteristics in textures,edges and so on.The traditional coding methods cannot compress these con-tents effectively,which consumes a large number of bits.However,on the other hand,the human eyes are not sensitive to these contents,even if there are slight differences,it will not be noticed.At the same time,deep learning technologies have showed power-ful non-linear learning capabilities in computer vision tasks,it also has achieved great success in the field of video coding.Therefore,this dissertation focus on how to use the deep learning technology combined with the classical video coding technologies to further improve the coding efficiency of textures.The main innovations and contributions of this dissertation are listed as follows.(1)This dissertation proposes a deep learning based nonlinear transform coding scheme for the static texture contents.First,this dissertation designs a neural network for non-linear transform and integrates it into the intra coding framework successfully.Secondly,this dissertation proposes that the intra prediction information can be used to remove the directional information in the intra prediction residuals to improve the per-formance of the transformation.Thirdly,this paper proposes to use the transformation gain as a loss function during the training of the model to enhance the energy compact of the transform coefficients,the TopK training strategy is also adapted to reduce the number of bits to encode the transform coefficients.The experimental results show that the proposed deep learning based non-linear transform coding scheme can significantly improve the compression performance of the texture contents in the video compared with the traditional methods.(2)This dissertation proposes a dynamic texture detection method for the dynamic texture contents in video.First,this dissertation proposes a dynamic texture detection algorithm based on the histogram of the direction of motions.This method is simple and fast,and it has been successfully integrated into the video coding framework.Sec-ondly,based on the spatio-temporal correlation within the dynamic texture contents,this dissertation proposes a postprocessing method to refine the results of dynamic texture detection which improves the detecting accuracy of the dynamic textures.Experimental results demonstrate that the proposed dynamic texture detection method can detect the dynamic texture content in the natural videos quickly and accurately which satisfies our encoding requirements.(3)This dissertation proposes a dynamic texture detection and synthesis based video coding scheme for dynamic texture contents in the natural video.First,this dis-sertation designs a generative adversarial network that uses spatio-temporal information for dynamic texture synthesis,and presents a method that uses a spatial discriminative network and a temporal discriminative network to enhance the spatial fidelity and tem-poral coherence of the synthesis results.Secondly,this dissertation proposes a video coding scheme that combines dynamic texture detection and dynamic texture synthe-sis,and integrates it into inter coding successfully,which greatly improves the coding efficiency of existing coding frameworks.Finally,this dissertation collects and builds a dynamic texture video training(validation)dataset for neural network training,and establishes a dataset of dynamic texture coding test sequences for dynamic texture re-search.Experimental results show that the proposed dynamic texture detection and synthesis based video coding scheme can significantly improve the coding efficiency of dynamic texture videos.
Keywords/Search Tags:Video Coding, Deep Learning, Transform, Static Texture, Dynamic Texture, Texture Detection, Texture Synthesis
PDF Full Text Request
Related items