
Moving Object Classification In Video Surveillance

Posted on: 2009-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: G Dong
Full Text: PDF
GTID: 2178360242480516
Subject: Computational Mathematics
Abstract/Summary:
As an active research topic in computer vision, visual surveillance in dynamic scenes attempts to detect, recognize and track certain objects from image sequences, and more generally to understand and describe object behaviors. The aim is to develop intelligent visual surveillance to replace traditional passive video surveillance, which is proving ineffective as the number of cameras exceeds the capability of human operators to monitor them. In short, the goal of visual surveillance is not only to put cameras in the place of human eyes, but also to accomplish the entire surveillance task as automatically as possible.

Visual surveillance in dynamic scenes has a wide range of potential applications, such as security for communities and important buildings, traffic surveillance in cities and on expressways, and detection of military targets. We focus in this paper on applications involving the surveillance of people or vehicles, as they are typical of surveillance applications in general and involve the full range of surveillance methods. Surveillance applications involving people or vehicles include the following.

1) Access control in special areas. In some security-sensitive locations such as military bases and important governmental units, only people with a special identity are allowed to enter. A biometric feature database of authorized visitors is built beforehand using biometric techniques. When somebody is about to enter, the system can automatically obtain the visitor's features, such as height, facial appearance and walking gait, from images taken in real time, and then decide whether the visitor can be cleared for entry.

2) Person-specific identification in certain scenes. Personal identification at a distance by a smart surveillance system can help the police to catch suspects.
The police may build a biometric feature database of suspects, and place visual surveillance systems at locations where the suspects usually appear, e.g., subway stations and casinos. The systems automatically recognize and judge whether or not the people in view are suspects. If so, an alarm is raised immediately. Such systems with face recognition have already been used at public sites, but their reliability is too low for police requirements.

3) Crowd flux statistics and congestion analysis. Using techniques for human detection, visual surveillance systems can automatically compute the flux of people at important public areas such as stores and travel sites, and then provide congestion analysis to assist in crowd management. In the same way, visual surveillance systems can monitor expressways and junctions of the road network, and further analyze traffic flow and the status of road congestion, which are of great importance for traffic management.

4) Anomaly detection and alarming. In some circumstances, it is necessary to analyze the behaviors of people and vehicles and determine whether these behaviors are normal or abnormal. For example, visual surveillance systems installed in parking lots and supermarkets could analyze abnormal behaviors indicative of theft. Normally, there are two ways of giving an alarm. One is to automatically make a recorded public announcement whenever any abnormal behavior is detected. The other is to contact the police automatically.

5) Interactive surveillance using multiple cameras. For social security, cooperative surveillance using multiple cameras could be used to ensure the security of an entire community, for example by tracking suspects over a wide area through the cooperation of multiple cameras.
For traffic management, interactive surveillance using multiple cameras can help the traffic police discover, track, and catch vehicles involved in traffic offences.

It is this broad range of applications that motivates the interest of researchers worldwide. For example, the IEEE has sponsored the IEEE International Workshop on Visual Surveillance on three occasions, in India (1998), the U.S. (1999), and Ireland (2000). Special sections on visual surveillance were published in [1] and [2] in June and August 2000, respectively. In [3], a special issue on visual analysis of human motion was published in March 2001. In [4], a special issue on third-generation surveillance systems was published in October 2001. In [5], a special issue on understanding visual behavior was published in October 2002. Recent developments in human motion analysis are briefly introduced in our previous paper [6]. Notably, after the events of 9/11, visual surveillance has received more attention not only from the academic community, but also from industry and governments.

Visual surveillance has been investigated worldwide under several large research projects. For example, the Defense Advanced Research Projects Agency (DARPA) supported the Visual Surveillance and Monitoring (VSAM) project [11] in 1997, whose purpose was to develop automatic video understanding technologies that enable a single human operator to monitor behaviors over complex areas such as battlefields and civilian scenes. Furthermore, to enhance protection from terrorist attacks, the Human Identification at a Distance (HID) program sponsored by DARPA in 2000 aims to develop a full range of multimodal surveillance technologies for successfully detecting, classifying, and identifying humans at great distances. The European Union's Framework V Programme sponsored Advisor, a core project on visual surveillance in metro stations.

There have been a number of well-known visual surveillance systems.
The real-time visual surveillance system W4 [7] employs a combination of shape analysis and tracking, and constructs models of people's appearances in order to detect and track groups of people as well as monitor their behaviors, even in the presence of occlusion and in outdoor environments. This system uses a single camera and a grayscale sensor. The VIEWS system at the University of Reading is a three-dimensional (3-D) model-based vehicle tracking system. The Pfinder system developed by Wren et al. [8] is used to recover a 3-D description of a person in a large room. It tracks a single nonoccluded person in complex scenes, and has been used in many applications. As a single-person tracking system, TI, developed by Olsen et al. [9], detects moving objects in indoor scenes using motion detection, tracks them using first-order prediction, and recognizes behaviors by applying predicates to a graph formed by linking corresponding objects in successive frames. This system cannot handle small motions of background objects. The system at CMU [10] can monitor activities over a large area using multiple cameras that are connected into a network. It can detect and track multiple persons and vehicles within cluttered scenes and monitor their activities over long periods of time.

As far as hardware is concerned, companies like Sony and Intel have designed equipment suitable for visual surveillance, e.g., active cameras, smart cameras, and omni-directional cameras.

All of the above activities are evidence of a great and growing interest in visual surveillance in dynamic scenes. The primary purpose of this paper is to give a general review of the overall process of a visual surveillance system, and to briefly present methods for moving object classification.

Different moving regions may correspond to different moving targets in natural scenes.
For instance, the image sequences captured by surveillance cameras mounted in road traffic scenes probably include humans, vehicles and other moving objects such as flying birds and moving clouds. To further track objects and analyze their behaviors, it is essential to classify moving objects correctly. Object classification can be considered a standard pattern recognition problem. At present, there are two main categories of approaches for classifying moving objects: shape-based classification and motion-based classification.

Different descriptions of the shape information of motion regions, such as points, boxes, silhouettes and blobs, are available for classifying moving objects. VSAM [11] takes image blob dispersedness, image blob area, apparent aspect ratio of the blob bounding box, etc., as key features, and classifies moving-object blobs into four classes (single human, vehicles, human groups, and clutter) using a viewpoint-specific three-layer neural network classifier. Lipton et al. [10] use the dispersedness and area of image blobs as classification metrics to classify all moving-object blobs into humans, vehicles and clutter. Temporal consistency constraints are applied to make the classification results more precise. Kuno et al. [12] use simple shape parameters of human silhouette patterns to separate humans from other moving objects.

In general, nonrigid articulated human motion shows a periodic property, so this has been used as a strong cue for classification of moving objects. Cutler et al. [13] describe a similarity-based technique to detect and analyze periodic motion. By tracking a moving object of interest, its self-similarity is computed as it evolves over time. For periodic motion, the self-similarity measure is also periodic. Therefore time-frequency analysis is applied to detect and characterize the periodic motion, and tracking and classification of moving objects are implemented using periodicity.
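
The self-similarity idea described above can be illustrated with a minimal sketch. It assumes the tracked object has already been cropped and aligned into a stack of equally sized frames, uses mean absolute pixel difference as the similarity measure, and replaces the time-frequency analysis with a plain Fourier transform of the average dissimilarity per time lag. The function names `self_similarity` and `dominant_period` are hypothetical, and this is a simplification for illustration, not the method of [13] itself.

```python
import numpy as np

def self_similarity(frames):
    """Return S where S[i, j] is the mean absolute difference of frames i and j.

    frames: array of shape (T, H, W), assumed to be aligned crops of one object.
    """
    flat = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    return np.abs(flat[:, None, :] - flat[None, :, :]).mean(axis=2)

def dominant_period(S):
    """Estimate the period (in frames) from a self-similarity matrix.

    Returns None when no periodic component is found.
    """
    T = S.shape[0]
    # Average dissimilarity at each time lag k is the mean of the k-th diagonal.
    lag_profile = np.array([np.diagonal(S, offset=k).mean() for k in range(T // 2)])
    signal = lag_profile - lag_profile.mean()
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum[0] = 0.0  # ignore the constant component
    k = int(np.argmax(spectrum))
    if spectrum[k] <= 1e-12:
        return None  # flat profile: no periodicity detected
    return len(signal) / k

if __name__ == "__main__":
    # Synthetic "object" whose appearance oscillates every 8 frames.
    rng = np.random.default_rng(0)
    base = rng.random((16, 16))
    t = np.arange(64)
    frames = base[None] * np.sin(2 * np.pi * t / 8)[:, None, None]
    print("estimated period:", dominant_period(self_similarity(frames)))
```

A periodic object produces a lag profile that repeats with the motion period, whereas an object whose appearance does not change yields a flat profile and no detected period.
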
In Lipton's work [14], residual flow is used to analyze the rigidity and periodicity of moving objects. It is expected that rigid objects present little residual flow, whereas a nonrigid moving object such as a human being has a higher average residual flow and even displays a periodic component. Based on this useful cue, human motion is distinguished from the motion of other objects, such as vehicles. The two common approaches mentioned above, namely shape-based and motion-based classification, can also be effectively combined for classification of moving objects. Furthermore, Stauffer [15] proposes a novel method based on a time co-occurrence matrix to hierarchically classify both objects and behaviors. It is expected that more precise classification results can be obtained by using extra features such as color and velocity.

The outline of the paper is as follows. In Section 2, we introduce the framework of visual surveillance in dynamic scenes. In Section 3, we give an overview of moving object classification algorithms in video surveillance. In Section 4, we discuss shape-based classification algorithms. In Section 5, we detail some motion-based classification algorithms. In the last section, we present some new classification algorithms.
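
As a small illustration of the shape-based metrics mentioned in the abstract, the sketch below computes the area and dispersedness of a binary blob and applies a simple threshold rule. Dispersedness is taken here as perimeter squared divided by area. The function names, the pixel-edge perimeter estimate, and the threshold values `min_area` and `disp_threshold` are all assumptions chosen for the example, not values from the cited work.

```python
import numpy as np

def blob_features(mask):
    """Return (area, perimeter, dispersedness) of a binary blob mask."""
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    padded = np.pad(mask, 1, constant_values=False)
    # Perimeter as the number of pixel edges between foreground and background.
    perimeter = int((padded[1:, :] != padded[:-1, :]).sum()
                    + (padded[:, 1:] != padded[:, :-1]).sum())
    dispersedness = perimeter ** 2 / area if area else float("inf")
    return area, perimeter, dispersedness

def classify_blob(mask, min_area=10, disp_threshold=25.0):
    """Toy rule with hypothetical thresholds: tiny blobs are clutter,
    dispersed blobs are humans, compact blobs are vehicles."""
    area, _, dispersedness = blob_features(mask)
    if area < min_area:
        return "clutter"
    return "human" if dispersedness > disp_threshold else "vehicle"

if __name__ == "__main__":
    vehicle_like = np.ones((4, 10), dtype=bool)  # wide, compact blob
    human_like = np.ones((12, 2), dtype=bool)    # tall, thin blob
    noise = np.ones((2, 2), dtype=bool)          # very small blob
    for name, blob in [("vehicle_like", vehicle_like),
                       ("human_like", human_like),
                       ("noise", noise)]:
        print(name, blob_features(blob), classify_blob(blob))
```

In practice the thresholds would have to be tuned per camera viewpoint, and per-frame labels would be smoothed over time in line with the temporal consistency constraints noted above.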
Keywords/Search Tags: Classification