Research On Human Pose Estimation Method Based On Deep Learning

Posted on: 2020-03-02  Degree: Master  Type: Thesis
Country: China  Candidate: S H Zhang  Full Text: PDF
GTID: 2428330599959764  Subject: Engineering
Abstract/Summary:
Detecting human skeleton key points (human pose estimation) is one of the fundamental tasks of computer vision. It is mainly used in human behavior prediction, autonomous driving, and other scenarios. How to detect human skeleton key points efficiently and accurately remains a key problem. To improve detection accuracy, deep learning has been adopted as the mainstream solution. Although current deep-learning-based approaches have achieved great success in human skeleton key point detection, the following problems remain. First, these models have a large number of parameters, which makes training very costly. Second, there is room for improvement in both detection accuracy and detection speed. Finally, when deeper and more complex network models are trained on large-scale human pose data, training efficiency is low. We therefore focus on these problems and propose a deep-learning-based method for detecting human skeleton key points. The work covers the following three aspects.

First, to address the excessive parameter count and high training cost of existing network models, we design a four-stage model, CPMs-Stage4, based on Convolutional Pose Machines (CPMs), the mainstream method for human skeleton key point detection, and analyze the influence of four-stage design patterns on key point detection. The experimental results show that the four-stage design reduces both the number of model parameters and the training cost.

Second, because there is room for improvement in detection accuracy and detection speed, we combine CPMs with GoogLeNet to design GoogLeNet13-CPMs-Stage6 and GoogLeNet13-CPMs-Stage4, and analyze the influence of different Inception structures on detection accuracy. The experimental results show that, compared with CPMs, GoogLeNet13-CPMs-Stage6 improves detection accuracy and detection speed by factors of 1.023 and 1.441, respectively, and GoogLeNet13-CPMs-Stage4 improves them by factors of 1.009 and 1.559, respectively. In addition, we combine CPMs with SqueezeNet to design a more streamlined model, SqueezeNet15-CPMs-Stage4. The experimental results show that its detection accuracy matches that of CPMs, that the fire module structure greatly compresses the model parameters step by step, and that detection speed improves by a factor of 1.794.

Finally, to address the limited expandability of single-machine GPU capacity and the low training efficiency of network models, we analyze the advantages and disadvantages of data parallelism and model parallelism and design a multi-GPU data-parallel method based on the Caffe framework to improve training efficiency. The experimental results show that on 2, 3, and 4 GPU nodes, training speed increases by factors of 1.496, 2.093, and 2.983, respectively. This addresses the difficulty of training human skeleton key point detection models on a single machine.
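
The multi-GPU figures above can be read in terms of parallel efficiency, that is, the achieved speedup divided by the number of GPUs. The sketch below is illustrative only: it assumes the reported values (1.496, 2.093, 2.983) are speedup ratios relative to single-GPU training, and the efficiency metric and the names `reported_speedup` and `parallel_efficiency` are not taken from the thesis.

```python
# Training speedups reported in the abstract for 2, 3, and 4 GPU nodes.
# Assumption: each value is a ratio relative to single-GPU training speed.
reported_speedup = {2: 1.496, 3: 2.093, 4: 2.983}


def parallel_efficiency(speedup: float, num_gpus: int) -> float:
    """Return the fraction of ideal (linear) speedup that was achieved."""
    return speedup / num_gpus


# Efficiency per GPU count; 1.0 would mean perfectly linear scaling.
efficiency = {n: parallel_efficiency(s, n) for n, s in reported_speedup.items()}

for n, e in efficiency.items():
    print(f"{n} GPUs: speedup {reported_speedup[n]:.3f}, efficiency {e:.3f}")
```

Under that assumption, the efficiency comes out at roughly 0.75, 0.70, and 0.75 for 2, 3, and 4 GPUs, which would suggest that scaling is sub-linear but does not degrade sharply over this range.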
Keywords/Search Tags: Human Pose Estimation, Deep Learning, Convolutional Pose Machines, Parallelization