Human Detection And Motion Recovery From Monocular Video

Posted on: 2014-01-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J H Ding
GTID: 1228330395489242
Subject: Computer Science and Technology
Abstract/Summary:
Human detection and motion recovery is one of the most active topics in the artificial intelligence and computer vision communities. This research has had a significant impact on many applications, such as on-board driver assistance systems, smart surveillance systems, user interfaces, and motion analysis in sports training. Effective solutions to human detection can also facilitate the detection of other objects. Capturing human motion from video significantly reduces the cost of motion capture. Moreover, because video sources are so widely available, it also helps users uncover more of the underlying information.

Currently, human detection and motion recovery from monocular video face three main challenges: 1) Human data is heavily contaminated by noise. In addition, a large number of detection windows must be classified in a single frame, yet most of the frame contains no pedestrians; 2) Human data and motion exhibit large intra-class variation, including both pose variation and action variation; 3) Depth information is lost when 3D objects are projected onto a 2D image. Furthermore, detection results are easily affected by complex backgrounds, occlusion, lighting, and changes in appearance.

This thesis follows the main line of human detection, skeleton extraction, and motion recovery in monocular video. Inspired by recent key techniques in feature extraction, machine learning, and motion recovery, we present several improved approaches. The main work is summarized as follows:

(1) We propose a mixed feature descriptor and a fast detector for human detection. In scenes with complex backgrounds, varying lighting, or rapidly changing appearance, a human detection system based on a single feature descriptor cannot meet the requirements of efficient detection and a low false alarm rate.
We improve the robustness of the human detector by concatenating the histogram of oriented gradients (HOG) and the histogram of the census transform (CT) of the image. To address the slow training of the AdaBoost learning method, we speed up training by optimizing fast feature selection and dual-threshold selection. To improve detection speed, we refine the design of the classifier and modify the feature calculation for the target window. We train a cascade classifier, which rejects non-human targets with a coarse-to-fine strategy, to reduce detection time. Moreover, we propose a target scanning approach based on "block" updating to speed up detection. More specifically, as the scanning window slides over the test image, our approach extracts features only from the changed "blocks" and recalculates the histogram distribution. Experimental results indicate that the improved human detector, based on the combined feature descriptor and cascade AdaBoost, can detect human targets in static images of various resolutions. We apply our approach to the INRIA person dataset and compare it with single-feature-descriptor methods. The results show that our approach is clearly better in both detection performance and detection speed.

(2) We propose a human detection method based on multi-part and multi-instance learning. The global sliding-window detection approach neglects the non-rigidity of the human body because the feature descriptor is extracted from a rectangular window. For this reason, the approach is not robust enough in some cases, for example when the pose varies, when the human target is partly occluded, or when the viewing direction changes. We propose an improved algorithm for part-based human detection. Specifically, the training samples are partitioned into several regions, each of which contains multiple instances, according to the real body structure.
The part detectors are then trained using a multi-instance learning method based on the AdaBoost algorithm. The part detectors are applied individually to obtain response scores when predicting on the training bags. The training samples are thereby converted into low-dimensional feature vectors composed of part scores. The final assembled detector is learned using a linear SVM. Experiments on the INRIA dataset show that our approach improves on the detection performance of the single-instance learning method and can successfully detect partly occluded humans. Finally, we evaluate the detection performance of three different multi-part divisions.

(3) We propose a markerless, initialization-free method to recover human motion from monocular video. Traditional markerless approaches based on monocular video require the initial position to be set manually in the first frame. Many approaches based on probabilistic models or learning methods have problems such as high computational complexity and over-dependence on training example databases. The motion recovery method proposed in this thesis has several steps: directly extracting the skeleton from the image, measuring the initial 2D coordinates of the joints, and then recovering the original 3D coordinates. More specifically, using the gradient of the distance transform of the image, the linear skeleton of the human body is automatically extracted from monocular video images. The positions of the joints are determined using prior knowledge of human anatomy. We therefore avoid having to initialize the joint positions in the first frame in order to recover the 3D positions. Under the assumption of the scaled orthographic projection model, we recover the human motion. We calculate the scale factor of the entire body according to the joint-to-joint lengths of a customized human skeleton model. This scale factor is later used to restore the position of each bone. Finally, each bone is tracked with Kalman filters.
The proposed method has no special requirements for camera pre-setting, auxiliary equipment, or data context.
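
As an illustration of the scaled-projection step in (3), the following is a minimal sketch of the standard geometric relation under a scaled orthographic camera, not the thesis's actual implementation. It assumes image coordinates u = s·X and v = s·Y, so that a bone of known length L satisfies L² = (Δu² + Δv²)/s² + ΔZ². The function names, the joint names, and the choice of the smallest geometrically consistent scale are all assumptions made for the example.

```python
import math

def min_consistent_scale(joints_2d, bone_lengths):
    """Smallest scale s for which no bone looks longer in the image than its
    model length permits, i.e. s >= image_length / model_length for each bone.
    (Choosing the minimum is an assumption of this sketch.)"""
    return max(
        math.hypot(joints_2d[a][0] - joints_2d[b][0],
                   joints_2d[a][1] - joints_2d[b][1]) / length
        for (a, b), length in bone_lengths.items()
    )

def relative_depth(p1, p2, length, s):
    """Magnitude |dZ| of the depth difference between two joints, given their
    2D projections p1 and p2, the model bone length, and the scale s."""
    du = p1[0] - p2[0]
    dv = p1[1] - p2[1]
    dz_sq = length ** 2 - (du ** 2 + dv ** 2) / s ** 2
    # Clamp slightly negative values caused by noisy joint positions.
    return math.sqrt(max(dz_sq, 0.0))

# Hypothetical 2D joint positions and model bone lengths (same units).
joints = {"hip": (0.0, 0.0), "knee": (3.0, 0.0), "ankle": (3.0, 4.0)}
bones = {("hip", "knee"): 5.0, ("knee", "ankle"): 4.0}

s = min_consistent_scale(joints, bones)
thigh_dz = relative_depth(joints["hip"], joints["knee"], bones[("hip", "knee")], s)
shin_dz = relative_depth(joints["knee"], joints["ankle"], bones[("knee", "ankle")], s)
```

This relation yields only the magnitude of each depth difference. Which end of a bone is closer to the camera must be resolved by other means, such as anatomical constraints or temporal tracking.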
Keywords/Search Tags: Human Detection, Combined Feature, Cascade AdaBoost, Multiple Instance Learning, Part Detector, Skeleton Extraction, Motion Recovery