Research On Learning Based Image Coding And Soft Video Broadcast Technology

Posted on: 2021-04-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W B Yin
GTID: 1368330614450731
Subject: Computer application technology
Abstract/Summary:
With the continuous advancement of information technology, new applications supported by the internet and mobile terminals, such as online video, social networking and video surveillance, are emerging, marking humanity's entry into the era of big data. Multimedia data of all types have grown dramatically, and image and video data have grown by an even higher order of magnitude. Research on efficient image/video storage and transmission methods has therefore become a core issue in big data processing, and image coding and wireless soft video broadcast, as typical applications of image storage and video transmission, have important research significance. The transform is the core technology in the traditional image coding and soft video broadcast frameworks. However, a fixed linear transform does not adapt to the characteristics of the image/video, so it cannot achieve a compact representation; meanwhile, the decoding structure based on the inverse transform limits the use of prior knowledge, resulting in poor reconstruction quality and coding efficiency. Sparse representation and convolutional neural networks employ a redundant basis and a nonlinear model, respectively, to adapt to image/video characteristics and achieve an efficient and compact representation of the signal; their flexible representation structure can efficiently integrate prior knowledge and further improve the quality of image/video reconstruction. To this end, for the problems of wireless soft video broadcast and image compression, this thesis introduces sparse representation theory and convolutional neural networks into the traditional image coding and soft video broadcast frameworks. Aiming to improve the performance of image coding and soft video broadcast with learning-based methods, the thesis comprises the
following sections:

First, a dictionary learning based soft video broadcast framework is proposed. In the conventional soft video broadcast framework, fixed transforms cannot adapt to video signal characteristics and cannot represent the video efficiently, so performance is clearly limited under low bandwidth; meanwhile, the inverse transform based decoding structure cannot use prior knowledge to further improve decoding quality. Compressive sensing can exploit signal sparsity and achieve nearly exact reconstruction of the signal from a small amount of observation data by means of an optimization method. To this end, this thesis uses sparse coding and reconstruction to replace the transform and inverse transform of the traditional wireless soft video broadcast scheme, and proposes a wireless soft video broadcast scheme based on compressed sensing and a hierarchical frame structure. At the encoder side, under limited bandwidth, the hierarchical frame structure is used to allocate the observation rate reasonably so as to capture the maximum effective information of the video signal. In particular, the equal importance and natural resilience to packet loss of the observation data are exploited to reduce the channel protection cost. At the decoder side, based on the prior information of local sparsity, non-local self-similarity and temporal correlation of video signals, a group sparse representation based optimal reconstruction model and an efficient algorithm for solving it are proposed to achieve high-quality and highly robust video reconstruction. The experimental results show that the proposed framework has good video broadcast scalability and significantly improves compression efficiency.

Second, a low-rank approximation based line-scan soft video broadcast framework is proposed. In the traditional soft video broadcast scheme, the lack of prediction leads to low coding efficiency. If closed-loop prediction is added, an error
drift problem arises. Meanwhile, using a block or frame as the coding unit means the encoder must collect a sufficient number of lines of data before encoding and transmission can start, which causes high delay. According to distributed coding theory, a source can be compressed efficiently when the decoder has side information, and such coding is robust to error propagation. Low-rank representation is essentially a structured sparse representation. Under the low-rank representation framework, the correlation in images and video is implicit in the form of a low-rank matrix, and low-rank representation provides a way to approximate images and video with high precision. To this end, this thesis uses coset coding, a typical distributed coding technique, instead of transform coding in wireless soft video transmission, and proposes a line-scan soft video broadcast method based on low-rank matrix completion. At the encoder side, there is no need to wait for several rows of data to be collected; the video line data is compressed by coset coding in real time, which reduces the computational cost of the encoder. At the decoder side, based on an analysis of the temporal and spatial correlation of the video signal, a low-rank approximation based side information generation algorithm using template matching is proposed. The high-precision side information improves the accuracy of coset decoding and the quality of the reconstructed video. The experimental results show that the proposed scheme outperforms the traditional soft video transmission scheme under various channel conditions.

Third, a convolutional neural network based soft video broadcast post-processing framework is proposed. The traditional soft video broadcast scheme uses the block-based discrete cosine transform to compress redundant information in the video. When the bandwidth is limited, a certain number of transform coefficients need to be discarded, and the signal is
inevitably subject to noise interference during transmission, so the data received at the decoder side contains both coding noise and channel noise. This coding and channel noise remains in the frame decoded by the inverse transform. In fact, sparse representation and deep neural networks can adapt to video characteristics and achieve high-precision reconstruction of degraded video frames. For this purpose, based on sparse representation and deep neural networks, this thesis proposes a convolutional neural network based robust soft video broadcast scheme. The proposed framework regards encoding as an image degradation process, and the encoder controls the degree of degradation according to bandwidth conditions. At the decoder side, decoding is recast as the problem of restoring a degraded image. The local sparsity and non-local self-similarity of the video frames are used to establish a group based sparse representation model, and a sparse representation based image restoration method is proposed. On this basis, the strong nonlinear modeling capability of the convolutional neural network is used to further reduce the impact of encoding and transmission noise. The experimental results show that the proposed scheme not only has good video broadcast scalability but also removes coding and transmission noise, and provides better subjective and objective quality than traditional methods.

Fourth, a deep neural network based end-to-end correlated image compression scheme is proposed. Conventional image coding methods adopt a fixed linear transform, which cannot adapt to image characteristics and has difficulty describing the complex texture and structure of an image. Meanwhile, optimizing the encoder and decoder separately limits the improvement of compression performance. The autoencoder uses a multi-layer network to represent high-dimensional data as low-dimensional features and to restore the original data as faithfully as possible from those low-dimensional
features; it uses a nonlinear model to obtain a compact feature representation that is superior to that of a linear transform. In addition, the flexible codec structure and the joint optimization of encoder and decoder make efficient image compression possible. Therefore, based on a multi-stream autoencoder, this thesis applies co-reference constraints at each feature level and proposes a correlated image compression scheme based on a co-prediction structure. A binarizer is used to quantize the image features, and the image features are also used to generate an importance map that guides the rate allocation of the quantized features. The rate-distortion optimization cost is taken as the optimization objective of the network, and the differentiable binarizer ensures that the network can be trained. The proposed method realizes joint optimization of the image codec through the end-to-end autoencoder structure, and realizes efficient compression of correlated images through the multi-stream co-reference structure. The experimental results show that, for the correlated image compression problem, the proposed method achieves subjective and objective quality superior to that of traditional image coding methods.
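
As an illustration of the coset coding idea used in the second contribution, the following is a minimal sketch and not the thesis's actual implementation. It assumes a scalar modulo coset with a hypothetical step size `q`, and the function names `coset_encode` and `coset_decode` are chosen here for illustration only. The encoder keeps only each sample's position inside its coset, and the decoder picks the coset member closest to its side information, which recovers the sample whenever the side information is off by less than `q / 2`.

```python
import numpy as np

def coset_encode(x, q):
    """Keep only the position of each sample inside its coset: x mod q.

    q is a hypothetical coset step size (an assumption of this sketch).
    """
    return np.mod(x, q)

def coset_decode(residue, side_info, q):
    """From the coset {residue + k*q}, pick the member nearest to side_info."""
    k = np.round((side_info - residue) / q)
    return residue + k * q

# One line of samples and decoder side information that is off by less than q/2.
q = 16.0
line = np.array([12.3, 200.7, 57.0, 131.4])
side_info = line + np.array([3.0, -5.5, 7.0, -2.0])

decoded = coset_decode(coset_encode(line, q), side_info, q)
recovered = bool(np.allclose(decoded, line))
```

In this sketch, more accurate side information allows a smaller `q`, and therefore a smaller transmitted value range, without decoding errors. This is consistent with the abstract's point that high-precision side information improves the accuracy of coset decoding.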
Keywords/Search Tags: wireless soft video broadcast, image coding, joint source channel coding, sparse representation, deep learning, distributed coding