Research on a Video Compression Coding Communication System Based on Rate-Distortion Optimization of the Region of Interest

Posted on: 2020-05-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z W Zhang
GTID: 1368330575495016
Subject: Communication and Information System
Abstract/Summary:
Video compression coding based on regions of interest (ROI) has become a hot topic in video compression and computer vision. In a broad sense, a region of interest is the portion of the pixels in a video frame that draws the viewer's attention. It usually includes moving targets and areas of color change. The key idea of ROI video coding is to encode this region with a smaller quantization step to obtain higher coding precision, and to encode the non-ROI part coarsely with a larger quantization step to reduce the total number of output bits. In terms of the human visual system, the purpose of ROI video coding is to present the specific region of interest clearly at the decoder. Viewers pay little attention to non-ROI regions, so the coding quality of those regions need not be fully guaranteed. In other words, in a specific application scenario, the coding rate is reduced as far as possible and only the coding precision of the ROI is preserved, so that the understanding of the video content is not affected.

This dissertation designs a video coding communication system based on ROI rate-distortion optimization. The system consists of an ROI extraction module, an ROI video coding rate-distortion module, and an ROI video stream transmission module. Around this system, the dissertation studies the core technology of each module: ROI extraction, rate-distortion optimization for ROI video coding, and transmission of ROI video streams in a wireless network environment. These three technologies are chained in sequence to form a video communication system operating in the ROI coding mode.

ROI extraction studies how to extract the region of interest from video frame data, which mainly means moving regions and certain specific target objects. This region is the foreground of the video frame, and the remaining areas are treated as the background.

Rate-distortion optimization for ROI video coding addresses the trade-off between rate and distortion: given a group of video frames under a rate constraint, minimize the distortion of that group. Establishing an accurate rate-distortion model is the key to solving this optimization problem. The model describes mathematically the rate and the distortion in the ROI coding mode. The objective function and constraints of the optimization problem are formulated from the model and the problem is solved, which yields a bit allocation for each frame of the group and, from it, a rate control strategy.

ROI video stream transmission takes the heterogeneous wireless network as its setting. The coding information of the coding units in a video frame is encapsulated into network-layer transmission units, which are distributed to wireless channels with different attributes. Transmission in a heterogeneous wireless network is still end to end, but the terminal is multihomed: it has several network access interfaces and can connect to wireless networks with different attributes at the same time. In the ROI coding mode, the transmission scheme ensures that transmission units carrying ROI information suffer less transmission and decoding distortion. The video stream must also meet real-time requirements, so packets that exceed the delay deadline are discarded to conserve network resources. In addition, channel error control coding is introduced into the transmission process: additional check bits reduce the error rate, so that the ROI information can be decoded and reconstructed as completely as possible.

On this basis, the dissertation carries out detailed research on the key technologies of ROI video coding. The main contents are as follows.

1. ROI extraction is studied. Combining traditional digital image processing theory with currently popular deep learning theory, we propose two new methods for extracting and detecting regions of interest: a cascade model algorithm and a bounding-box correction algorithm based on a text topic model. The cascade detection algorithm has four cascaded steps: global motion compensation, motion block extraction, multi-layer pixel segmentation, and model update. The first two steps extract the foreground motion blocks and form a motion mask. The last two steps remove the background pixels inside the motion mask and update the color distribution of the background model. A block-to-pixel detection approach is also proposed to make detection more flexible. A further benefit is that the method can be embedded in a video codec for real-time ROI detection and encoding. Experimental results show that the method improves both detection accuracy and time consumption. The bounding-box correction algorithm based on the text topic model is a machine learning algorithm with two phases: model training and validation. In the training phase, it converts the feature point information of the detected target image into text information. Based on Latent Dirichlet Allocation (LDA), we propose a topic model with a word co-occurrence prior, which makes full use of the co-occurrence information between image features. In the validation phase, we propose an anchor-box-based correction algorithm that quickly corrects the detection results produced by conventional algorithms against the pre-trained topic model. Experiments on various data sets show that the method improves detection performance in terms of efficiency and computational cost, and that it is robust to variations in object color, illumination, and scale. The method can also be combined with many fast but inaccurate ROI extraction algorithms, which makes the system model more flexible.

2. Rate-distortion optimization and rate control for ROI video compression are studied. We propose a rate-distortion model for the ROI coding mode based on a mixture distribution of DCT residual coefficients and a radial basis function neural network. Rate and distortion are modeled by classifying coding units according to their depth and texture features. Using this model, the objective function and constraints of the rate-distortion optimization problem are formulated and solved with convex optimization theory, and a rate control strategy for the ROI coding mode is designed. Experiments show that the method improves the visual quality of the decoded and reconstructed video, the rate-distortion performance, and the bit rate accuracy. It achieves higher coding accuracy for the ROI while keeping the output of the coding buffer stable and the distortion within a controllable range.

3. Video stream transmission in a heterogeneous wireless network environment under the ROI coding mode is studied. We propose an ROI-based video transmission framework for heterogeneous wireless networks with multihomed access terminals. It comprises an ROI extraction module and a frame separator, in which coding units are classified and encapsulated into network transmission units. The framework also includes a channel monitor that tracks the status of each communication path and sends a feedback signal to the video stream controller for packet scheduling control. We propose a deep learning method for channel state prediction. To solve the video stream packet transmission problem, we design a rate-distortion model for video stream transmission in the ROI coding mode and formulate a transmission scheduling strategy. This strategy seeks a balance between transmission delay and distortion, and it guarantees that packets with ROI content are transmitted on paths with sufficient bandwidth and low loss. Simulation comparisons with other transmission methods verify that the proposed scheme achieves good results in video transmission quality, end-to-end delay, and playback fluency.
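The key idea stated in the abstract, a smaller quantization step for the region of interest and a larger one elsewhere, can be illustrated with a minimal sketch. It is not the dissertation's method: the function names, the 16x16 block size, the QP offsets, and the coverage threshold are all hypothetical, and the mapping from QP to step size assumes the H.264/HEVC-style convention in which the step doubles every 6 QP values, since the abstract does not name a codec.

```python
import numpy as np

def roi_qp_map(roi_mask, block=16, base_qp=32, roi_offset=-6,
               bg_offset=6, roi_fraction=0.25):
    """Return one QP per block: lower for ROI blocks, higher for background.

    All parameters are illustrative assumptions, not values from the source.
    """
    h, w = roi_mask.shape
    rows, cols = h // block, w // block
    # Crop to whole blocks, then measure the share of ROI pixels per block.
    m = roi_mask[:rows * block, :cols * block].astype(np.float64)
    coverage = m.reshape(rows, block, cols, block).mean(axis=(1, 3))
    # A block counts as ROI when enough of its pixels are inside the mask.
    qp = np.where(coverage >= roi_fraction,
                  base_qp + roi_offset, base_qp + bg_offset)
    return np.clip(qp, 0, 51).astype(np.int32)

def qstep(qp):
    """Approximate quantization step, assuming it doubles every 6 QP."""
    return 2.0 ** ((np.asarray(qp, dtype=np.float64) - 4.0) / 6.0)

# Toy 64x64 frame with a rectangular ROI covering two 16x16 blocks.
mask = np.zeros((64, 64), dtype=bool)
mask[16:32, 16:48] = True
qp = roi_qp_map(mask)
print(qp)
print(qstep(qp))
```

With these assumed offsets the ROI blocks are quantized with a step four times smaller than the background blocks. A real encoder would derive the offsets from the rate-distortion model and the rate budget rather than fix them by hand.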
Keywords/Search Tags: region-of-interest video compression coding, deep learning, machine learning, computer vision, rate-distortion optimization, video streaming strategy, target detection, foreground-background separation, multihomed terminals