
Two-Dimensional Full-Body Human Pose Estimation From Monocular Video Sequence

Posted on: 2016-09-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Du
Full Text: PDF
GTID: 2308330461989205
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of tools for creating videos and images, the volume of such content has grown explosively, and manual inspection by human viewers can no longer meet practical demand. Demand for intelligent analysis and recognition of videos and images has therefore grown in recent years, and automatically analyzing, computing, or identifying the information they contain has become an active research area. Over the last ten years, human pose estimation, a sub-field of computer vision, has attracted considerable attention from researchers. It supports many applications, such as gesture recognition, security monitoring, and animation generation. Research on human pose estimation focuses mainly on three questions: how to select features from images or videos; how to locate the joints or major parts of the human body; and which kind of estimation method to use, model-based or machine-learning-based. Although human pose estimation has been studied intensively in recent years, most published schemes work on a single still image, and many of them estimate only the upper body. Very few works estimate human poses in video sequences. In particular, to the best of our knowledge, two-dimensional (2D) full-body human pose estimation in monocular video sequences is largely underrepresented in the literature, even though it has vast potential applications. In this work, we estimate 2D full-body human poses in monocular video sequences. First, for each frame in the video, we detect the human region using a support vector machine and estimate the full-body pose within the detected region using multi-dimensional boosting regression. For pose estimation, we design a joints relationship tree that corresponds to the full hierarchical structure of the joints in a human body.
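As a rough illustration of the joints relationship tree mentioned above, the sketch below stores a skeleton as a child-to-parent map and lists the (parent, child) pairs in top-down order. The joint names, the choice of the torso as root, and the helper names `local_problems` and `compose_pose` are assumptions made for this example; the abstract does not specify the exact tree or how local estimates are combined.

```python
from collections import deque

# Hypothetical skeleton: child -> parent. The thesis does not list its exact tree.
PARENT = {
    "torso": None,
    "neck": "torso",
    "head": "neck",
    "l_shoulder": "neck",
    "l_elbow": "l_shoulder",
    "l_wrist": "l_elbow",
    "r_shoulder": "neck",
    "r_elbow": "r_shoulder",
    "r_wrist": "r_elbow",
    "l_hip": "torso",
    "l_knee": "l_hip",
    "l_ankle": "l_knee",
    "r_hip": "torso",
    "r_knee": "r_hip",
    "r_ankle": "r_knee",
}


def children_of(parent_map):
    """Invert the child -> parent map into parent -> list of children."""
    children = {joint: [] for joint in parent_map}
    for joint, parent in parent_map.items():
        if parent is not None:
            children[parent].append(joint)
    return children


def local_problems(parent_map):
    """Return (parent, child) pairs in breadth-first, top-down order."""
    children = children_of(parent_map)
    queue = deque(j for j, p in parent_map.items() if p is None)
    pairs = []
    while queue:
        joint = queue.popleft()
        for child in children[joint]:
            pairs.append((joint, child))
            queue.append(child)
    return pairs


def compose_pose(root_xy, offsets, parent_map):
    """Accumulate per-edge (dx, dy) offsets into absolute joint positions.

    Assumption: each local problem yields the child's offset from its parent.
    """
    root = next(j for j, p in parent_map.items() if p is None)
    pose = {root: root_xy}
    for parent, child in local_problems(parent_map):
        dx, dy = offsets[child]
        px, py = pose[parent]
        pose[child] = (px + dx, py + dy)
    return pose
```

Because the pairs come out top-down, every parent position is known before its children are placed, so each edge can be handled as its own small problem.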
The relationship tree decomposes the complex full-body pose estimation problem into a set of local pose estimation problems, which improves estimation performance. Further, we build a complete set of spatial and temporal feature descriptors for each frame. The target feature is the normalized distance between a parent node and its child node in the joints relationship tree. The spatial feature is the histogram of oriented gradients (HOG). For the motion feature, we first compute the optical flow between two adjacent frames or among several adjacent frames, then warp the previous frame to the current frame using that flow. Finally, we take the absolute difference image between the warped frame and the current frame as the motion patch. Using the joints relationship tree and these feature descriptors, we learn a hierarchy of regressors in the training stage and apply the learned regressors to determine the positions of all joints in the testing stage. Experiments demonstrate that the proposed scheme achieves outstanding estimation performance.
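The motion feature described above (warp the previous frame with optical flow, then take the absolute difference against the current frame) can be sketched roughly as follows. This is a minimal sketch under stated assumptions: frames are 2D grayscale NumPy arrays, the flow field is supplied by some external estimator that the abstract does not name, the flow is taken to be backward flow (from each current-frame pixel to its source in the previous frame), and sampling is nearest-neighbor. The function names are hypothetical.

```python
import numpy as np


def warp_with_flow(prev_frame, flow):
    """Warp prev_frame toward the current frame by backward sampling.

    Assumption: flow[y, x] = (dx, dy) points from pixel (x, y) in the
    current frame to its source location in the previous frame.
    """
    h, w = prev_frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Nearest-neighbor lookup, clamped to the image border.
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_frame[src_y, src_x]


def motion_patch(prev_frame, curr_frame, flow):
    """Absolute difference between the warped previous frame and the current frame."""
    warped = warp_with_flow(prev_frame, flow)
    return np.abs(curr_frame.astype(np.float32) - warped.astype(np.float32))
```

With a flow field that explains the motion exactly, the residual is zero everywhere; where the flow is inaccurate or the motion is non-rigid, the residual is non-zero. A real pipeline would likely use sub-pixel interpolation instead of nearest-neighbor sampling.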
Keywords/Search Tags: monocular video sequences, human pose estimation, joints relationship tree, spatial feature, motion feature, multi-dimensional boosting regression