
Research On 3D Human Joints Estimation And Pose Recognition Based On 2D Images

Posted on: 2024-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: G T Liu
Full Text: PDF
GTID: 2568307181952439
Subject: Master of Engineering (Electronic Information Field) (Professional Degree)
Abstract/Summary:
Human pose estimation is one of the most complex and challenging research directions in the field of computer vision. It has tremendous application value in human-computer interaction, intelligent security, sports rehabilitation, virtual reality and other fields, and has therefore attracted the attention of many researchers. With the development of deep learning and the enormous potential it has shown, human pose estimation based on deep convolutional neural networks that take RGB images as input has advanced rapidly, and a series of research results have been achieved. However, when the target person in an RGB image is affected by factors such as background interference, self-occlusion and environmental occlusion, so that the human torso cannot be fully observed, the neural network cannot locate the human joints accurately, leading to poor accuracy and robustness of human pose estimation. Moreover, training very large neural network models demands more powerful hardware, which increases costs while degrading real-time performance and efficiency.

At present, the accuracy of 3D human pose estimation based on a two-stage strategy depends on accurate estimation of the 2D human joints and an accurate mapping from 2D space to 3D space. Most 2D joint estimation methods rely on heatmaps generated from feature information at different levels (inter-layer features) and ignore the rich local information available within the same level. Although the bottom-up methods adopted in 2D multi-person pose estimation guarantee real-time performance, they still suffer from problems such as incorrect grouping of joints and the loss of occluded joints. Likewise, 3D pose estimation methods that take 2D joint coordinates as input cannot guarantee the accuracy of the estimated 3D joints, because 2D joint information carries only a small amount of feature data. Therefore, based on the two-stage strategy, this thesis proposes a method for 3D human joint estimation and pose recognition that takes 2D images as input. The main research contents and tasks are as follows:

(1) A convolutional network structure based on intra-level feature fusion was proposed. Through a parallel structure of convolutional blocks, the model fully extracts and fuses the local information contained in images to obtain fine local features. Moreover, through successive down-sampling and up-sampling steps, convolutional blocks at different levels extract and fully fuse local information at different image resolutions, which improves the estimation accuracy of 2D joints in complex cases such as occlusion and the robustness of the model. The validity of the network was verified on the LSP dataset and its extended dataset, where the estimation accuracy reached 90.3%.

(2) An end-to-end training method that embeds tags for joint estimation and grouping was proposed. Building on the intra-level feature fusion network, an additional set of heatmaps (equal in number to the joints) is added to represent the tags that assign joints to different human bodies, and the final joint grouping is adjusted through a fully connected layer, which realizes accurate assignment of 2D human joints in multi-person scenes. The validity of the method was verified on the MPII dataset, where the estimation accuracy reached 89.7%.
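To make the heatmap-plus-embedding-tag output described in (1) and (2) more concrete, the following is a minimal PyTorch sketch, not the thesis code: an intra-level fusion block with parallel convolutional branches at the same feature level, followed by a head that predicts one heatmap and one tag map per joint. All module names, channel sizes and kernel choices are illustrative assumptions.

```python
# Minimal sketch (not the thesis implementation): parallel same-level branches
# are fused, then a head predicts K joint heatmaps plus K embedding-tag maps
# (associative-embedding style) used to group joints by person.
import torch
import torch.nn as nn


class IntraLevelFusionBlock(nn.Module):
    """Parallel convolutional branches at the same level, summed (fused)."""

    def __init__(self, channels: int):
        super().__init__()
        self.branch3 = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.branch5 = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))

    def forward(self, x):
        # Fuse the local information extracted by the parallel branches.
        return x + self.branch3(x) + self.branch5(x)


class HeatmapTagHead(nn.Module):
    """Predicts K joint heatmaps and K tag maps from a fused feature map."""

    def __init__(self, channels: int, num_joints: int):
        super().__init__()
        self.fuse = IntraLevelFusionBlock(channels)
        self.heatmaps = nn.Conv2d(channels, num_joints, kernel_size=1)
        self.tags = nn.Conv2d(channels, num_joints, kernel_size=1)

    def forward(self, features):
        fused = self.fuse(features)
        # Heatmaps localize joints; tag values of joints belonging to the
        # same person are trained to be close, enabling grouping.
        return self.heatmaps(fused), self.tags(fused)


if __name__ == "__main__":
    head = HeatmapTagHead(channels=64, num_joints=16)
    feats = torch.randn(1, 64, 64, 64)      # stand-in for a backbone feature map
    heatmaps, tags = head(feats)
    print(heatmaps.shape, tags.shape)       # both (1, 16, 64, 64)
```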
(3) A lightweight 3D human pose estimation method based on the internal connections of the joints was proposed. In this method, 2D joint coordinate information is used as the network input, and a lightweight neural network consisting of several cascaded fully connected layers is combined with deep feature fusion convolutional blocks to fully extract the feature information between joints and accurately predict the 3D joint coordinates, which improves the estimation accuracy while greatly reducing the number of parameters in the network model. The validity of the network model was verified on the Human3.6M dataset: the joint estimation error was reduced to 40.6 mm while the size of the model parameters remained at 4.31 MB. Finally, the proposed 2D joint estimation method was applied to images from the Human3.6M dataset to obtain 2D joint coordinates, and the proposed 3D joint estimation method then predicted the 3D joint coordinates from these 2D joints; the estimation error reached 60.6 mm, and the results were evaluated and analyzed.
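To illustrate the kind of lightweight lifting network described in (3), the following is a minimal PyTorch sketch, not the thesis model: cascaded fully connected blocks with residual connections that map 2D joint coordinates to 3D joint coordinates. The layer widths, dropout rate and joint count are illustrative assumptions, and the deep feature fusion convolutional blocks mentioned in the thesis are omitted here.

```python
# Minimal sketch (not the thesis implementation): a small fully connected
# "lifting" network that maps K 2D joint coordinates to K 3D joint coordinates.
import torch
import torch.nn as nn


class FCBlock(nn.Module):
    """Two fully connected layers with a residual connection."""

    def __init__(self, width: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(width, width), nn.ReLU(inplace=True), nn.Dropout(0.25),
            nn.Linear(width, width), nn.ReLU(inplace=True), nn.Dropout(0.25))

    def forward(self, x):
        return x + self.net(x)


class LiftingNet(nn.Module):
    def __init__(self, num_joints: int = 17, width: int = 256, blocks: int = 2):
        super().__init__()
        self.inp = nn.Linear(num_joints * 2, width)
        self.blocks = nn.Sequential(*[FCBlock(width) for _ in range(blocks)])
        self.out = nn.Linear(width, num_joints * 3)

    def forward(self, joints_2d):
        # joints_2d: (batch, num_joints, 2) pixel or normalized coordinates.
        x = self.inp(joints_2d.flatten(1))
        x = self.blocks(x)
        return self.out(x).view(-1, joints_2d.shape[1], 3)


if __name__ == "__main__":
    net = LiftingNet()
    pred_3d = net(torch.randn(4, 17, 2))
    print(pred_3d.shape)                              # (4, 17, 3)
    print(sum(p.numel() for p in net.parameters()))   # parameter count
```

Because the input is only a short coordinate vector rather than an image, a network of this shape stays in the range of a few million parameters, which is what keeps such two-stage lifting approaches lightweight.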
Keywords/Search Tags:Human pose estimation, Intra-level feature fusion, Embedding tag, Internal connection of the joints, Lightweight neural network