
A Study Of Scalable Video Coding For Traffic Surveillance

Posted on: 2013-09-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Liu
GTID: 1228330395489259
Subject: Computer Science and Technology
Abstract/Summary:
Scalable video coding (SVC, which in this paper refers to H.264/SVC) produces a multilayer stream with temporal, spatial and SNR (quality) scalability from a single high-compression encoding pass, and the required layers of the bit stream can then be extracted according to user needs, network conditions and terminal capabilities. This advantage makes SVC well suited to video surveillance systems, where it provides strong support for the compression, storage and transmission of surveillance video. Within video surveillance, traffic surveillance is a primary means of addressing traffic problems, because it allows a wide range of traffic information to be extracted. Most research has focused on the segmentation, analysis, recognition and tracking of traffic objects, on the extraction and analysis of traffic parameters, and on the recognition and understanding of traffic events; comparatively little work addresses the compression coding of traffic surveillance video. In practice, the massive volume of traffic surveillance video calls for faster and more efficient compression, and problems such as video retrieval in the compressed and uncompressed domains must be considered at the encoding stage. Video compression for traffic surveillance is therefore not an ordinary, isolated problem but a system-level problem that involves pre-processing for, and interaction with, the video/image analysis module. It has become an emerging research area with significant practical value in fields such as traffic safety.

In view of this, scalable video coding for traffic surveillance is studied in this paper. The main contents and contributions are as follows:

1) Fast inter prediction algorithm of temporal scalability for traffic surveillance video

Scalable video coding uses hierarchical B frames to achieve temporal scalability, and their computational complexity significantly increases the coding time. The latest improved algorithm saves about 45% of the coding time, but it is not optimized for the characteristics of traffic surveillance video.
In this paper, a fast algorithm for hierarchical B frames based on the background difference method is proposed. First, the background image is obtained by an improved single Gaussian method based on a spatio-temporal model. Second, the difference image is obtained by subtracting the background image from the current image; to remove the effect of jitter, the variances of the difference are computed for each macroblock and for sliding windows in four directions, and the motion region is acquired by comparing the minimal variance with a threshold. Then, this analysis is combined with statistics of the key parameters that affect macroblock mode selection for hierarchical B frames in traffic surveillance, in order to reduce the number of candidate coding modes and thereby improve the coding speed. The experimental results show that, compared with the standard algorithm, the proposed algorithm saves about 85% of the coding time without degrading coding efficiency or decoded video quality.

2) Key frame extraction algorithm based on visual attention model

With the rapid growth of massive traffic surveillance video, key frame extraction is an important technology for video retrieval, summarization, browsing and compression. In this paper, a key frame extraction algorithm based on a visual attention model is proposed for lane surveillance video. First, a top-down method is used to detect moving objects, whose position saliency is determined by the position at which license plates and vehicles are clearest. Then, within the moving objects, a bottom-up method is used to calculate the saliency of their motion orientation and motion intensity. Next, the visual attention curve is obtained by fusing these saliencies with a simple adaptive linear model. Finally, a derivative curve is generated, and among the points where the derivative curve crosses zero from positive to negative, the frame with the most salient value is selected as the key frame.
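The final selection step described above can be illustrated with a minimal sketch. It assumes the fused visual attention curve is already available as one saliency value per frame; the function name `select_key_frame` and the use of a simple forward difference as the derivative are assumptions made for illustration, not details taken from the dissertation.

```python
def select_key_frame(attention):
    """Return the index of the most salient local peak, or None if there is none.

    `attention` holds one fused saliency value per frame (hypothetical input).
    """
    # Forward difference as a stand-in for the derivative curve:
    # d[i] = attention[i + 1] - attention[i].
    d = [b - a for a, b in zip(attention, attention[1:])]

    # Frame i + 1 is a candidate when the derivative changes sign
    # from strictly positive to strictly negative around it.
    peaks = [i + 1 for i in range(len(d) - 1) if d[i] > 0 and d[i + 1] < 0]
    if not peaks:
        return None

    # Among the candidates, keep the frame with the highest saliency.
    return max(peaks, key=lambda i: attention[i])
```

This sketch ignores plateaus (runs of equal values at a peak) and picks a single frame per curve; a practical system would presumably apply the selection per vehicle pass or per video segment.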
Experiments show that the key frames extracted by the proposed algorithm include not only the optimal or suboptimal positions of all passing vehicles but also traffic incidents such as on-street parking, speeding and reverse driving. The results are consistent with the visual perception of traffic observers and are conducive to extracting static vehicle features for building a traffic video feature database.

3) Tracking aware based proportionate GOP adaptation coding for fast retrieval

Based on the key frame extraction, I-frames are formed at temporal level 0, in which traffic events and traffic objects are encoded as surveillance information using extended syntax and semantics of H.264/SVC. The uniformly defined syntax and semantics standardize the retrieval interface, which greatly improves retrieval speed in the compressed domain and meets the needs of post-analysis. However, encoding key frames as I-frames makes the I-frame positions uncertain, so the inter-frame correlation and the temporal scalability of hierarchical B-pictures are damaged to varying degrees. In this paper, a proportionate GOP adaptation structure is proposed, and temporal scalability for a GOP of any size is realized by the proposed binary tree algorithm. This adaptive structure favors video retrieval and video summary generation, but the inserted I-frames cause some loss of rate-distortion performance. Traffic video differs from generic audiovisual services and broadcast television applications: its main aim is high-level semantic analysis based on accurate tracking of traffic objects. Hence, tracking accuracy rather than PSNR is used as the compression criterion, and the bit rate spent on content of little interest for tracking is reduced by optimizing the quantization of frequency coefficients.
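To make the idea of temporal scalability for a GOP of arbitrary size more concrete, the following sketch assigns temporal levels by recursive bisection of the interval between two key frames. The abstract does not specify the binary tree algorithm, so this is only one plausible construction; the function name `assign_temporal_levels` and the midpoint rule are assumptions.

```python
def assign_temporal_levels(gop_size):
    """Map each frame offset 0..gop_size to a temporal level by bisection.

    Offsets 0 and gop_size are the bounding key frames (level 0).
    """
    if gop_size < 1:
        raise ValueError("gop_size must be at least 1")
    levels = {0: 0, gop_size: 0}

    def split(lo, hi, level):
        # Stop when no frame lies strictly between lo and hi.
        if hi - lo < 2:
            return
        # The middle frame is predicted from lo and hi, which sit at lower levels.
        mid = (lo + hi) // 2
        levels[mid] = level
        split(lo, mid, level + 1)
        split(mid, hi, level + 1)

    split(0, gop_size, 1)
    return [levels[i] for i in range(gop_size + 1)]
```

For a dyadic GOP of 8 frames this reproduces the familiar hierarchical B pattern, and for a non-dyadic size such as 6 it yields an unbalanced tree. In both cases, dropping all frames above a given level leaves a decodable lower frame rate sequence, since every frame references only lower-level frames.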
Experiments show that the method saves about 60% of the bit rate compared with the conventional method while maintaining comparable tracking accuracy.

4) Content-adaptive traffic surveillance video coding with extended spatial scalability

Regions of interest (ROI) or visually salient regions are rarely considered in spatially scalable video coding, so visually important content cannot be well adapted to lower display resolutions. In this paper, a content-adaptive spatially scalable coding scheme for traffic surveillance video is proposed. First, the background image is extracted by an improved single Gaussian method. Then, a background subtraction algorithm is presented for detecting and tracking vehicles; the motion window of the leading vehicle, which is commonly treated as the ROI in traffic surveillance, is used as the cropping window in the extended spatial scalability (ESS) of SVC. Moreover, an ROI-based quantization strategy and a frequency coefficient suppression technique are employed to improve the rate-distortion performance of the spatial enhancement layer. The experimental results show that, compared with conventional scalable coding, the proposed algorithm greatly improves the visual perception of the decoded base layer video with limited loss in rate-distortion performance. In addition, tracking accuracy rather than PSNR can be used as the compression criterion, which further improves the encoding performance.

5) Error resilience and concealment algorithms for traffic surveillance video

With the development of 3G and 4G/LTE technology, mobile and wireless video services have increased significantly. Because mobile radio channels are time-varying, have high bit error rates and offer limited bandwidth, transmission errors affect the decoded quality of the SVC bit stream.
In this paper, feedback-based error tracking (ET) and reference picture selection (RPS) algorithms for spatial scalability are proposed, which exploit the inter-layer prediction features. The visually salient regions of traffic video are also considered in the choice of intra refresh. Moreover, an adaptive error concealment algorithm for frame loss in the spatial enhancement layer is proposed. Based on a statistical analysis of the macroblock coding modes in the enhancement layer, the optimal error concealment method is adaptively chosen using base layer and inter-frame information. If tracking accuracy rather than PSNR is used as the compression criterion, the compression ratio is greatly improved. Part of the saved bit rate is used to introduce redundancy into the transmitted bit stream for error resilience, so that, at a given bit rate, the tracking accuracy of the received video is greatly improved. The above error control techniques can be combined as required, and they provide error resilience that is conducive to high-level semantic analysis of bit streams transmitted over error-prone channels.
Keywords/Search Tags: Traffic surveillance, Scalable video coding, Fast encoding, Adaptive GOP, Surveillance extended encoding, Tracking aware, Error resilience, Content adaptive