
Human Pose Estimation And Its Application Based On Monocular Camera

Posted on: 2021-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Guo
Full Text: PDF
GTID: 2518306512987219
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
In recent years, as high-definition video capture devices have become common in daily life, the number of monocular images and videos has grown rapidly. How to process this data and extract valuable information from it has become an important issue in computer vision. Researchers focus on the posture and behavior of the people in this data because of its large potential commercial value. Deep convolutional neural networks have recently brought major breakthroughs to many computer vision tasks, and people-related tasks such as pedestrian detection, human pose estimation, and action recognition have received increasing attention. This thesis focuses on two-dimensional and three-dimensional human pose estimation from monocular RGB image data, and on the application of two-dimensional human pose estimation to pedestrian action recognition. The main research contents are as follows:

(1) A two-dimensional human pose estimation method that incorporates human parsing information is proposed. Human pose estimation in complex scenes remains a challenging task. Human parsing is closely related to human pose estimation and can provide valuable information that helps a pose estimation model improve its performance. This thesis proposes a coarse-to-fine, two-stream deep convolutional neural network. The model first obtains rough predictions of human pose and human parsing through two network branches, and then uses a convolutional neural network to fuse the pose and parsing features into more precise predictions. Experiments show that extracting these auxiliary features improves the model's performance on human pose estimation.

(2) A three-dimensional human pose estimation method that incorporates a human body structure prior is proposed. With powerful deep convolutional networks, research on two-dimensional human pose estimation has improved greatly. Three-dimensional human pose estimation, however, is still a challenging task. Its purpose is to obtain the three-dimensional coordinates of the human body from RGB images. Because the 2D-to-3D mapping is ambiguous, existing methods often fail to predict 3D coordinates accurately. This thesis designs a coarse-to-fine model that predicts 3D coordinates step by step and uses prior information about human body structure to design constraints that guide the model toward more plausible predictions. Experimental results on the Human3.6M dataset show that the method outperforms current benchmark methods.

(3) A pose-guided pedestrian action recognition model is proposed. Pedestrians are important participants on streets, and identifying their movements is important for autonomous driving. This work first addresses the task of pedestrian action recognition and then proposes the PARD dataset for it. To solve this task, a deep convolutional neural network with a multi-region attention mechanism is designed. The model processes video data from driving scenes and uses a pose prior to enrich the feature representation. Experimental results show that the method outperforms previous top-performing action recognition methods such as P-3D, 3D ResNet and LSTA on the PARD dataset.
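As an illustration of contribution (2), a body structure prior can be turned into a constraint by penalizing skeletons whose left and right limbs differ in length. The abstract does not say which constraints the thesis uses, so the sketch below is only one common choice, not the author's method. The five-joint skeleton, the `BONES` and `SYMMETRIC` tables, and the function names are all hypothetical.

```python
import math

# Hypothetical 5-joint skeleton: (parent, child) joint index pairs.
BONES = [(0, 1), (1, 2), (0, 3), (3, 4)]
# Hypothetical left/right bone pairs (indices into BONES) that should match in length.
SYMMETRIC = [(0, 2), (1, 3)]

def bone_lengths(joints):
    """Return the Euclidean length of every bone for a list of (x, y, z) joints."""
    return [math.dist(joints[p], joints[c]) for p, c in BONES]

def symmetry_penalty(joints):
    """Sum of squared length differences between paired left/right bones."""
    lengths = bone_lengths(joints)
    return sum((lengths[a] - lengths[b]) ** 2 for a, b in SYMMETRIC)

# A left/right symmetric pose incurs no penalty; stretching one limb does.
symmetric_pose = [(0, 0, 0), (1, 0, 0), (1, -1, 0), (-1, 0, 0), (-1, -1, 0)]
stretched_pose = [(0, 0, 0), (1, 0, 0), (1, -1, 0), (-1, 0, 0), (-1, -3, 0)]
print(symmetry_penalty(symmetric_pose), symmetry_penalty(stretched_pose))
```

In a training setting, a term of this kind would typically be added to the coordinate regression loss with a small weight, so that implausible skeletons are discouraged without overriding the image evidence.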
Keywords/Search Tags: Human Pose Estimation, Pedestrian Action Recognition, Deep Learning, Computer Vision