Human Pose Estimation Based On Deep Learning

Posted on: 2023-02-05

Degree: Master

Type: Thesis

Country: China

Candidate: H W Pei

GTID: 2558307073991199

Subject: Computer technology

Abstract/Summary:

Human pose estimation is a fundamental technique in computer vision, with applications in tasks such as action and activity recognition, action detection, human tracking, human-computer interaction, video surveillance, film and animation, virtual reality, medical assistance, and sports motion analysis. Over the past decade, deep learning has driven substantial progress in human pose estimation. However, variation in human outlines, overlapping limbs, occluded keypoints, and crowded scenes remain the main causes of incorrect detection or ambiguous classification of joints. Deeper convolutional neural networks (CNNs) and high-resolution representations are regarded as ways to produce discriminative contextual information and suppress these systematic errors, but they require substantial computing resources. Human pose estimation therefore needs methods that are both efficient and accurate. To address these problems, this thesis analyzes the feature extraction and fusion methods of existing pose estimation networks in depth and proposes a lightweight framework. The main contributions are as follows.

First, this thesis revisits cascaded dilated convolution and argues that it is worth using for human pose estimation because it obtains multi-scale features at the same spatial size. After analyzing the local information loss of the cascade framework, the cascaded residual dilated convolution (CRDC) is proposed to strengthen the information flow so that precise location context is retained. As a plug-and-play module, the CRDC uses a group of small dilation rates to capture multi-scale contextual information and mix features for predicting human keypoints at very low computational cost.

Second, to allow network models to run efficiently on limited computing resources, this thesis proposes a lightweight unified framework, the Bilateral Pose Architecture (Bi Pose). One branch of the architecture extracts low-level spatial location information at low computational cost, while the other branch uses a lightweight backbone network to generate high-level semantic information. A dedicated fusion module is designed to combine the features from the two branches. The architecture increases inference speed while preserving the model’s predictive accuracy as far as possible.

Finally, to meet different hardware resource constraints, several lightweight networks are designed based on ordinary convolution, separable convolution, dilated separable convolution, and micro-factorized convolution. In addition, a complex activation function is introduced to limit the performance degradation caused by reducing parameters and computation, and a pose correction loss is proposed to supervise the network in recognizing poses and correcting wrong, implausible ones.
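
The arithmetic behind two ideas named in the abstract can be sketched in a few lines: cascading dilated convolutions enlarges the receptive field while keeping the spatial size, and a separable convolution needs far fewer weights than an ordinary one. This is only an illustration under stated assumptions, not the thesis's implementation. The dilation rates (1, 2, 3), the 3x3 kernel, the 64-channel width, and all function names are hypothetical choices; the abstract gives none of these values.

```python
def receptive_field(dilations, kernel_size=3):
    """Receptive field of a cascade of stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def same_padding(dilation, kernel_size=3):
    """Padding that keeps the spatial size unchanged (odd kernel, stride 1)."""
    return dilation * (kernel_size - 1) // 2


def output_size(size, dilation, kernel_size=3):
    """Output length of one stride-1 dilated convolution with 'same' padding."""
    p = same_padding(dilation, kernel_size)
    return size + 2 * p - dilation * (kernel_size - 1)


def conv_params(c_in, c_out, kernel_size=3):
    """Weight count of an ordinary convolution (bias ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def separable_conv_params(c_in, c_out, kernel_size=3):
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv."""
    return kernel_size * kernel_size * c_in + c_in * c_out


if __name__ == "__main__":
    rates = (1, 2, 3)  # assumed small dilation rates, for illustration only
    print("receptive field:", receptive_field(rates))
    print("spatial size after each layer:",
          [output_size(64, d) for d in rates])
    print("ordinary 3x3 conv weights:", conv_params(64, 64))
    print("separable 3x3 conv weights:", separable_conv_params(64, 64))
```

Under these assumed values, three cascaded 3x3 layers cover a 13-pixel receptive field without changing the feature map size, and the separable variant uses roughly one eighth of the weights of the ordinary convolution. Dilation itself adds no weights, which is consistent with the abstract's claim of multi-scale context at low cost.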

Keywords/Search Tags:

2D human pose estimation, atrous convolution, separable convolution, multi-scale feature extraction, deep learning

Related items

1	Research On Human Pose Estimation Algorithm Based On Deep Learning
2	Real-Time Multi-Persons Pose Estimation In Complex Scenes
3	Research On Human Pose Estimation Method Based On Improved Deep Neural Network
4	Research On Lightweight High-resolution Human Pose Estimation Network Based On Attention Mechanis
5	Single-stage Human Pose Estimation Based On Deep Learning
6	Research On Human Pose Estimation And Recognition Technology
7	Research And Implementation Of Human Pose Estimation Based On Multi-scale Fusion And Graph Convolution Network
8	Research On Human Parsing Method Based On Deep Convolutional Neural Network
9	Research On Human Pose Estimation Algorithm Based On Deep Learning
10	Research On Object Detection Based On Atrous Convolution And Edge Guidance