
Visual Recovery And Recognition Of Human Motion

Posted on: 2012-07-24  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X Zhao  Full Text: PDF
GTID: 1118330362458300  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Human motion analysis is an important and fundamental research topic in the computer vision community. Its principal goal is to automatically detect, recover, and understand human motion from visual observations. Its wide range of applications includes human-computer interaction, surveillance, virtual reality, markerless motion capture, and sports performance analysis. This thesis focuses on two important problems in human motion analysis: 3D human motion recovery and human action recognition. The main contributions are summarized as follows.
(1) To address the ill-conditioning introduced by an uncalibrated monocular camera and to avoid the curse of dimensionality, we propose a novel generative algorithm, the Hierarchical Annealed Genetic Algorithm (HAGA), within the framework of evolutionary computation. HAGA provides a general optimal search strategy designed specifically for hierarchical state spaces. For 3D human motion recovery, we first extract a hierarchical subspace of human motion from motion capture data. This both greatly reduces the dimensionality of the search space and allows priors on human motion to be extracted efficiently. HAGA combines the strengths of simulated annealing and genetic algorithms: it reduces the computational complexity of the generative approach while significantly improving the accuracy of pose estimation.
(2) Within the discriminative framework, we propose a novel corner-point-based local feature representation for pose estimation. The feature simultaneously encodes the location, appearance, and local structure of body parts. It captures the spatial co-occurrence and context information of local structures, and is therefore partly invariant to changes in illumination and location. In this part of the work, we also treat human pose estimation as a system that integrates the features, the algorithm, and the sensor configuration into a single framework.
We quantitatively evaluate how the quality and quantity of visual information, and the combination of algorithm and sensors, affect the performance of the whole system. The results provide guidance and reference for the design of 3D markerless motion capture systems.
(3) To handle the multi-modality of the state posterior distribution, we propose a novel mixture-of-experts model, the Temporal-Spatial Combined Local Gaussian Process Expert Model (TSC-LGP), which divides the whole state space into local regions, each dominated by a local Gaussian process expert. The model is defined in a unified input-output space and can therefore handle bi-directional multi-modal distributions. We construct temporal and spatial local expert systems simultaneously in order to integrate more discriminative information. In particular, the temporal GP experts not only resolve multi-modality but also exploit the context information contained in the output space.
(4) To clarify the state of the art from an experimental perspective, we comprehensively evaluate diverse approaches within the generative and discriminative frameworks. The evaluation uses the same error criterion and the same databases throughout. To conduct it, we constructed a large database that includes synchronized motion capture and video streams, which is valuable for the study of pose estimation and action recognition. To our knowledge, this is the most comprehensive quantitative evaluation of the state of the art in pose estimation, and it should benefit research on 3D human motion recovery.
(5) We propose a novel approach to human action recognition based on sparse coding and local temporal-spatial features. With sparse coding, a general codebook is constructed from diverse source data, so the resulting feature representation is highly discriminative.
Because semantic information is sometimes much easier to obtain from text and captions than from other visual observations, we perform text and caption detection in images and video clips to extract additional semantic information for action recognition. We design a novel corner-point-based feature representation for text and caption detection. Its performance was demonstrated in the Star Challenge multimedia competition.
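Contribution (1) describes combining simulated annealing with a genetic algorithm. The abstract does not give HAGA's details, so the following is only a generic sketch of that combination, not the thesis's algorithm: it omits the hierarchical state space entirely, and the function names (`annealed_ga`, `sphere`), the toy objective, the geometric cooling schedule, and every parameter value are assumptions chosen for illustration. A temperature controls both the selection pressure (Boltzmann weights) and the mutation scale, so the search is broad early and local late.

```python
import math
import random


def sphere(x):
    """Toy objective (an assumption): squared distance from the origin."""
    return sum(v * v for v in x)


def annealed_ga(cost, dim, pop_size=40, generations=120,
                t_start=1.0, t_end=0.01, seed=0):
    """Minimize `cost` over R^dim with a genetic algorithm whose selection
    pressure and mutation scale follow an annealing temperature schedule."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    # Geometric cooling from t_start down to t_end.
    decay = (t_end / t_start) ** (1.0 / max(generations - 1, 1))
    temp = t_start
    best = min(pop, key=cost)
    for _ in range(generations):
        costs = [cost(ind) for ind in pop]
        c_min = min(costs)
        # Boltzmann selection weights; shifting by c_min keeps exp() <= 1.
        weights = [math.exp(-(c - c_min) / temp) for c in costs]
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            # Uniform crossover, then Gaussian mutation scaled by temperature.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [x + rng.gauss(0.0, temp) for x in child]
            children.append(child)
        pop = children
        best = min(pop, key=cost)
        temp *= decay
    return best, cost(best)
```

Calling `annealed_ga(sphere, dim=3)` returns the best vector found and its cost. In a pose-estimation setting the cost would instead measure the mismatch between a rendered body model and the image, which is outside the scope of this sketch.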
Keywords/Search Tags: Human motion analysis, markerless motion capture, human action recognition, generative model, discriminative model, Gaussian processes, text detection, caption detection, sparse coding, hierarchical annealed genetic algorithm, visual feature