
Study On Algorithm Of Moving Object Detection And Tracking In Complicated And Dynamic Scenes

Posted on: 2011-05-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B N Zhong
Full Text: PDF
GTID: 1118330338489434
Subject: Computer application technology
Abstract/Summary:
In recent years, intelligent video surveillance has been widely deployed in response to increasingly serious public safety concerns. Its goal is to automatically analyze captured video in order to find potential dangers, illegal behaviors, or suspicious targets in the scene, and thereby to provide real-time alarms, early warning, storage, and later retrieval. Intelligent video surveillance draws on several subfields, including signal processing, image processing, pattern recognition, machine learning, computer vision, artificial intelligence, data mining, and multimedia retrieval. From low level to high level, an intelligent video surveillance system mainly comprises the following key components: image preprocessing, background modeling, moving object detection, visual tracking, object recognition and classification, semantic scene understanding, behavior analysis, and continuous tracking of multiple objects across multiple cameras. Among these components, the two most fundamental algorithms are moving object detection and tracking, and both have attracted significant attention in the literature. However, practical experience has shown that moving object detection and tracking technologies are currently far from mature. Many challenges must be solved before a robust visual tracking system can be deployed commercially, including detection of fast-moving objects, pose variations, scale variations, appearance variations of the object, illumination changes, non-rigid shape variations, occlusions, cluttered scenes, and dynamic scenes. This dissertation proposes several algorithms to address these problems: moving object detection and segmentation in dynamic scenes, adaptive tracking via a patch-based appearance model and local background estimation, and visual tracking via weakly supervised learning from multiple imperfect oracles. The main work and contributions of this thesis are as follows.

Firstly, moving object detection in dynamic scenes is addressed. The challenges in dynamic scenes include waving trees, rippling water, moving shadows, illumination changes, camera jitter, clouds, smoke, fog, and rain. After analyzing video data from dynamic scenes, we found that neighboring pixels tend to be similarly affected by these environmental effects, so the correlation of image variations at neighboring pixels (i.e., their co-occurrence statistics) can be explicitly exploited to achieve robust detection. We therefore systematically study how to model the co-occurrence statistics at neighboring pixels in dynamic scenes and propose three algorithms: texture and motion pattern fusion for moving object detection, a standard-variance feature for moving object detection, and a local histogram of figure/ground segmentation for moving object detection. Experimental results verify that the proposed algorithms achieve robust and effective moving object detection by explicitly exploiting the co-occurrence statistics at neighboring pixels.
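To make the co-occurrence idea concrete, here is a minimal sketch (an illustration only, not one of the three proposed algorithms) of a background model that keeps running statistics of the intensity difference between each pixel and a neighbor at a fixed offset: dynamic-background motion such as waving trees or rippling water tends to affect both pixels together, so the difference stays stable, while a foreground object breaks the learned relation and is flagged. The neighbor offset, learning rate, and threshold are illustrative assumptions.

```python
import numpy as np

class CooccurrenceBackgroundModel:
    """Sketch of a neighboring-pixel co-occurrence background model.

    Rather than modeling each pixel's intensity on its own, keep a running
    mean/variance of the difference between a pixel and a neighbor at a fixed
    offset. Dynamic-background motion moves both pixels together, so the
    difference stays stable; a foreground object breaks the learned relation.
    The offset, learning rate and threshold are illustrative choices only.
    """

    def __init__(self, shape, offset=(0, 2), alpha=0.02, k=3.0):
        self.offset, self.alpha, self.k = offset, alpha, k
        self.mean = np.zeros(shape, dtype=np.float32)
        self.var = np.full(shape, 25.0, dtype=np.float32)  # generous initial variance
        self.initialized = False

    def _pair_difference(self, gray):
        dy, dx = self.offset
        gray = gray.astype(np.float32)
        neighbor = np.roll(np.roll(gray, dy, axis=0), dx, axis=1)
        return gray - neighbor

    def apply(self, gray):
        """Return a binary foreground mask (uint8, 0 or 255) for one grayscale frame."""
        d = self._pair_difference(gray)
        if not self.initialized:
            self.mean[:] = d              # bootstrap the statistics from the first frame
            self.initialized = True
        deviation = np.abs(d - self.mean)
        foreground = deviation > self.k * np.sqrt(self.var)
        background = ~foreground
        # Update the co-occurrence statistics only where the frame looks like background.
        self.mean[background] += self.alpha * (d[background] - self.mean[background])
        self.var[background] += self.alpha * (deviation[background] ** 2 - self.var[background])
        return foreground.astype(np.uint8) * 255
```

In use, the model would be constructed once from the frame shape and `apply` called on every subsequent grayscale frame; the resulting mask feeds the detection stages described above.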
Secondly, we propose a background subtraction (BGS) driven seed selection method for moving object segmentation. The proposed method consists of three main steps. First, a novel BGS method is used as an attention mechanism, generating candidate foreground pixels while being tuned to keep false positives and false negatives as low as possible. Second, a connected-components algorithm is used to obtain bounding boxes of the labeled foreground pixels. Finally, matting of the object associated with a given bounding box is performed using a heuristic seed selection scheme. The segmentation and matting task is guided by top-down knowledge. Experimental results demonstrate the efficiency and effectiveness of the proposed method.

Thirdly, we propose a robust visual tracking algorithm based on a patch-based adaptive appearance model driven by local background estimation. Long-term persistent tracking in ever-changing environments is a challenging task that often requires solving difficult appearance-update problems. Most top-performing methods rely on online learning, but an inherent problem of online learning-based trackers is drift, the gradual adaptation of the tracker to non-targets. The proposed algorithm addresses tracker drift and occlusion simultaneously. First, the object is represented with a patch-based appearance model in which each patch outputs a confidence map during tracking; these confidence maps are then combined via a robust estimator to obtain more robust and accurate tracking results. Moreover, we present a local spatial co-occurrence based background modeling approach that automatically estimates the local context background of the object of interest, captured by a single camera that may be stationary or moving. Finally, the local background estimate provides supervision for analyzing possible occlusions and for adapting the patch-based appearance model of the object. Qualitative and quantitative experimental results on challenging videos demonstrate the robustness of the proposed method.
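To illustrate how per-patch confidence maps can be combined through a robust estimator, the sketch below scores candidate displacements for each patch with a simple negative-SSD similarity and fuses the per-patch maps with a per-displacement median, so that patches corrupted by partial occlusion behave as outliers and are suppressed. The grid size, search radius, and SSD score are illustrative stand-ins for the appearance model actually used in the dissertation.

```python
import numpy as np

def patch_confidence_maps(frame, template, top_left, grid=(3, 3), search=8):
    """Build one confidence map per template patch over candidate displacements
    in a (2*search+1)^2 window around the previous position top_left = (y, x).
    Confidence is negative SSD, an illustrative stand-in for richer features.
    Assumes the whole search window lies inside the frame."""
    th, tw = template.shape
    ph, pw = th // grid[0], tw // grid[1]
    y0, x0 = top_left
    offsets = range(-search, search + 1)
    template = template.astype(np.float64)
    frame = frame.astype(np.float64)
    maps = np.zeros((grid[0] * grid[1], len(offsets), len(offsets)))
    for gi in range(grid[0]):
        for gj in range(grid[1]):
            patch = template[gi * ph:(gi + 1) * ph, gj * pw:(gj + 1) * pw]
            for a, dy in enumerate(offsets):
                for b, dx in enumerate(offsets):
                    yy, xx = y0 + gi * ph + dy, x0 + gj * pw + dx
                    candidate = frame[yy:yy + ph, xx:xx + pw]
                    maps[gi * grid[1] + gj, a, b] = -np.mean((candidate - patch) ** 2)
    return maps

def fuse_confidence(maps):
    """Fuse per-patch confidence maps with a per-displacement median, a simple
    robust estimator that keeps occluded patches (outliers) from dominating,
    and return the winning displacement (dy, dx)."""
    fused = np.median(maps, axis=0)
    a, b = np.unravel_index(np.argmax(fused), fused.shape)
    search = (fused.shape[0] - 1) // 2
    return int(a) - search, int(b) - search
```

The median is only one possible robust estimator; a trimmed mean or an M-estimator could be substituted without changing the structure of the fusion step.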
Fourthly, we systematically study long-term persistent tracking in ever-changing environments and propose a general tracking framework based on weakly supervised learning from multiple imperfect oracles. Within this framework, visual tracking is cast as a novel weakly supervised learning scenario in which multiple imperfect oracles (i.e., trackers), some of which may be mediocre, provide possibly noisy labels but no ground truth. Learning from multiple labeling sources differs from unsupervised, supervised, semi-supervised, and transductive learning: each training instance is given a set of candidate class labels by labelers of varying accuracy, and the ground-truth label of each instance is unknown. Without ground truth, the main issues addressed by learning from multiple labeling sources are how to learn classifiers, evaluate the labelers, infer the ground-truth label of each data point, and estimate the difficulty of each data point. Our method has the following advantages: (1) It fuses multiple imperfect oracles in a natural way to obtain a reliable and accurate final tracking result; the imperfect oracles can be arbitrary tracking algorithms from the literature, which avoids the pitfalls of relying on a single tracker. (2) It estimates the ground-truth labeling of the training data during tracking through robust probabilistic inference and thus alleviates the tracker drift problem. (3) It can evaluate tracking algorithms online in the absence of ground truth, an important and challenging problem for visual tracking systems. (4) It can handle missing labels (i.e., each tracker is not required to label all the image patches). (5) It is a scalable, off-the-shelf tracking framework in which the imperfect oracles need not be primitive trackers but may be powerful and perhaps problem-specific trackers.
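As a toy illustration of fusing multiple imperfect oracles without ground truth, the sketch below alternates between forming a consensus bounding box as a reliability-weighted average of the trackers' outputs and re-estimating each tracker's reliability from its agreement (IoU) with that consensus. This EM-flavored loop only conveys the spirit of learning from multiple noisy labelers; the function names, weighting scheme, and iteration count are assumptions and do not reproduce the dissertation's probabilistic inference.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = np.maximum(a[:2], b[:2])
    x2, y2 = np.minimum(a[2:], b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_trackers(boxes, n_iters=10):
    """Toy EM-style fusion of several trackers' boxes for one frame.

    boxes: array-like of shape (num_trackers, 4) in (x1, y1, x2, y2) form.
    Returns the consensus box and the per-tracker reliability weights.
    """
    boxes = np.asarray(boxes, dtype=float)
    weights = np.ones(len(boxes)) / len(boxes)       # start by trusting everyone equally
    for _ in range(n_iters):
        # E-step: consensus box as a reliability-weighted average of the oracles.
        consensus = (weights[:, None] * boxes).sum(axis=0) / (weights.sum() + 1e-9)
        # M-step: a tracker's reliability is its agreement with the consensus.
        agreement = np.array([iou(b, consensus) for b in boxes])
        weights = (agreement + 1e-6) / (agreement + 1e-6).sum()
    return consensus, weights
```

Running `fuse_trackers` on, say, three boxes where one tracker has drifted far from the other two assigns the drifted tracker a small weight, so the consensus follows the agreeing majority; the same weights hint at how trackers can be evaluated online without ground truth.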
Keywords/Search Tags: moving object detection, moving object segmentation, visual tracking, weakly supervised learning