
Moving Object Detection And Fast Retrieval In Intelligent Video Surveillance

Posted on: 2016-10-26; Degree: Doctor; Type: Dissertation
Country: China; Candidate: W G Feng; Full Text: PDF
GTID: 1228330470457962; Subject: Information and Communication Engineering
Abstract/Summary:
With advances in technology and growing public-security challenges, China has launched its "Safe City" construction program, and video surveillance has been widely deployed and has become the principal means of security monitoring. Efficient analysis of massive volumes of surveillance video is a problem that many multimedia applications need solved. The open problems in smart video surveillance analysis include: 1) moving object extraction, 2) intelligent analysis of specific events, and 3) large-scale object retrieval. This dissertation addresses these problems; its main contributions and innovations are summarized as follows.

1) In video surveillance applications, people are usually most interested in moving objects, but dynamic backgrounds and illumination changes can greatly degrade extraction results. To address this, the dissertation proposes a robust moving object detection method based on local frequency patterns. It extracts pixel-wise local frequency patterns from video frames and uses a non-parametric model to build a background model that adapts in real time. Experimental results on the I2R dataset show that the proposed method outperforms existing methods, improving the F value by 5.46%.

2) To monitor specific events in video surveillance applications, the dissertation proposes a vision-based fall detection method for moving objects. The foreground human silhouette is extracted via background modeling and tracked throughout the video sequence. The human body is represented by a fitted ellipse, and the silhouette motion is modeled by an integrated normalized motion energy image computed over a short-term video sequence. The shape deformation quantified from the fitted silhouettes is then used as the feature for distinguishing different human postures.
Finally, the postures are classified with a multi-class support vector machine, and detected falls are verified by a context-free-grammar-based method that imposes longer-range temporal constraints. Extensive experiments show that the proposed method is reliable compared with other common methods, achieving 95.2% sensitivity.

3) Because traditional hash learning methods support only a single feature description and therefore cannot provide sufficient information, the dissertation proposes a multiple-feature hash learning method that integrates different kinds of feature descriptions. Multiple features are implicitly mapped into a nonlinear combined kernel space using the kernel trick. A two-step procedure then learns hyperplane-projection-based hash functions in the kernel space: first, the hash codes of the training set are learned from semantic supervision; then, given these target codes, the projection hyperplanes are learned. Experimental results on YouTube Faces and manually crawled facial images show that the proposed method improves retrieval precision by up to 7.6%, and that hashing with multiple integrated features outperforms single-feature hashing.

Traditional hashing methods usually suffer from the gap between the high-level semantic description of visual content and low-level visual descriptors, i.e., the "Semantic Gap". To address this challenge, the dissertation proposes a deep-learning-based hash learning scheme consisting of two methods: a stacked-RBM-based deep hash (DH) method and a CNN-based deep perceptual hash (DPH) method. (1) DH method: it couples traditional hash learning with a deep architecture and shows the natural relation between the smoothed, relaxed hash function and neural networks.
By further employing saturation and orthogonality regularizers, it produces the final compact binary codes. (2) DPH method: it uses a CNN to generate hash codes directly from visual content and introduces an orthogonality-constrained learning method to ensure the compactness of the hash codes. Experimental results on CIFAR-10 show that the proposed methods outperform traditional "shallow" hash learning methods and achieve 5.72% (DH method) and 8.17% (DPH method) in mean average precision when generating 48-bit hash codes.

Part of the research results described above have been applied in the following projects: the National Natural Science Foundation of China project "Large-scale Similar Video P2P Search based on Multi-granularity semantic cues" (No. 60975045), the National Key Technologies R&D Program of China project "The Research on the Architecture, Key Technologies and Test Specifications of Enhanced Search System" (No. 2011BAH11B01), and the "Strategic Priority Research Program" of the Chinese Academy of Sciences project "Network Video Transmission and Control" (No. XDA06030900).
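
To illustrate the hyperplane-projection hash functions and binary-code retrieval mentioned in the abstract, the following is a minimal Python sketch. It is not the dissertation's method: the hyperplanes here are drawn at random as a stand-in for the learned projections, no kernel mapping or deep network is involved, and the names make_hyperplanes, hash_code, hamming, and retrieve are hypothetical helpers introduced only for this example.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (w, b).

    Assumption: random Gaussian weights stand in for the hyperplanes
    that a hash learning method would fit from training data.
    """
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], 0.0)
            for _ in range(n_bits)]


def hash_code(x, hyperplanes):
    """Pack one bit per hyperplane (1 if w . x + b >= 0) into an integer."""
    code = 0
    for w, b in hyperplanes:
        projection = sum(wi * xi for wi, xi in zip(w, x)) + b
        code = (code << 1) | (1 if projection >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def retrieve(query, database_codes, hyperplanes, k=3):
    """Return indices of the k database codes nearest to the query in Hamming distance."""
    q = hash_code(query, hyperplanes)
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(q, database_codes[i]))
    return ranked[:k]


# Toy usage with 48-bit codes, the code length reported in the abstract.
planes = make_hyperplanes(dim=4, n_bits=48)
database = [[0.9, 0.1, 0.0, 0.2], [-0.8, 0.3, 0.5, -0.1], [0.85, 0.15, 0.05, 0.25]]
codes = [hash_code(v, planes) for v in database]
print(retrieve([0.9, 0.1, 0.0, 0.2], codes, planes, k=2))
```

Because retrieval only compares short binary codes by XOR and bit counting, ranking a large database is cheap; the quality of the ranking depends entirely on how well the hyperplanes preserve semantic similarity, which is what the learning methods in the abstract aim to improve.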
Keywords/Search Tags: Video Surveillance, Moving Object Detection, Fall Detection, Fast Retrieval, Hash Learning, Deep Learning