
Research On 3D Human Pose Recognition Algorithm Based On Semantic Features

Posted on: 2021-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: J Lin
Full Text: PDF
GTID: 2428330623968334
Subject: Engineering

Abstract/Summary:
In recent years, three-dimensional (3D) human pose recognition has gradually become a popular research topic in computer vision and is widely used in advanced human-computer interaction, intelligent surveillance, and motion analysis. The central problem is how to accurately estimate the categories of human skeleton keypoints and their positions in 3D space from images or videos. However, because human skeleton keypoints have a high degree of freedom and are frequently occluded, traditional methods based on pictorial structure models are not robust to them. With the major breakthroughs achieved by convolutional neural networks in various visual analysis tasks, researchers have designed a large number of high-performance network architectures, which has further advanced 3D human pose recognition based on convolutional neural networks. Nevertheless, most current methods based on convolutional neural networks suffer from the following problems: (1) 3D datasets lack diversity in human poses; (2) 3D human poses cannot be recognized under occlusion; (3) ambiguity leads to abnormal 3D human skeleton keypoints. Therefore, this thesis proposes a 3D human pose recognition method based on semantic features. The main work is as follows:

(1) A two-stage 3D human pose recognition method is adopted. First, the 2D pose in the image is predicted: the classic heatmap-based Stacked Hourglass network is simplified for 2D human pose recognition, extracting the semantic features of images while reducing the number of trainable model parameters. Coarse 3D human keypoint positions are then regressed from the 2D predictions.

(2) A human limb prediction network and a human limb interaction network are proposed. They extract the kinematic semantics of the keypoints within the same limb and the geometric semantics of the interactions between different limbs, strengthening the geometric constraints of the human body structure in order to handle occlusion.

(3) A two-stage reprojection network is proposed. It extracts the mapping semantics of corresponding keypoint pairs across different spaces and the semantics of the different coordinate axes within the same 3D coordinate system, so that the positions of the 3D human keypoints can be judged reasonably.

(4) To facilitate model debugging, a coarse-to-fine supervision strategy is adopted, from which a fine 3D human pose is predicted.

Experimental results show that: (1) although the recognition accuracy of the simplified 2D human pose recognition network decreases, it can still be applied to the 3D human pose dataset through fine-tuning; (2) the proposed algorithm outperforms the baseline network: on the Human3.6M dataset, the average 3D position error of human skeleton points is reduced from 62.9 mm to 59.8 mm, and the pose category with the largest error, "SittingDown", drops from 94.6 mm to 88.5 mm.
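The following is a minimal PyTorch sketch of the two-stage idea summarized above, not the thesis implementation: a toy CNN stands in for the simplified hourglass-style 2D heatmap network, an MLP lifts 2D coordinates to coarse 3D positions, and an assumed orthographic reprojection term loosely illustrates the reprojection-consistency idea. The class names, the 17-joint setting, and the loss form are assumptions for illustration only.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 17  # assumed joint count (Human3.6M commonly uses 17 keypoints)

class Simple2DHeatmapNet(nn.Module):
    """Toy stand-in for a simplified hourglass-style 2D pose network."""
    def __init__(self, num_joints=NUM_JOINTS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(128, num_joints, 1)  # one heatmap per joint

    def forward(self, img):
        return self.head(self.features(img))  # (B, J, H/2, W/2)

class Lift2Dto3D(nn.Module):
    """Regresses coarse 3D joint positions from 2D joint coordinates."""
    def __init__(self, num_joints=NUM_JOINTS):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_joints * 2, 1024), nn.ReLU(),
            nn.Linear(1024, num_joints * 3),
        )

    def forward(self, pose_2d):
        return self.mlp(pose_2d.flatten(1)).view(-1, NUM_JOINTS, 3)

def soft_argmax_2d(heatmaps):
    """Differentiable heatmaps -> normalized (x, y) coordinates."""
    b, j, h, w = heatmaps.shape
    probs = heatmaps.view(b, j, -1).softmax(dim=-1).view(b, j, h, w)
    ys = torch.linspace(0, 1, h, device=heatmaps.device)
    xs = torch.linspace(0, 1, w, device=heatmaps.device)
    x = (probs.sum(dim=2) * xs).sum(dim=-1)  # expected x per joint
    y = (probs.sum(dim=3) * ys).sum(dim=-1)  # expected y per joint
    return torch.stack([x, y], dim=-1)       # (B, J, 2)

if __name__ == "__main__":
    img = torch.randn(2, 3, 256, 256)
    pose_2d = soft_argmax_2d(Simple2DHeatmapNet()(img))  # stage 1: 2D pose
    pose_3d = Lift2Dto3D()(pose_2d)                      # stage 2: coarse 3D pose
    # Assumed orthographic reprojection check: drop the depth axis and compare
    # with the stage-1 estimate (only a rough analogue of the thesis idea).
    reproj_loss = nn.functional.mse_loss(pose_3d[..., :2], pose_2d)
    print(pose_2d.shape, pose_3d.shape, reproj_loss.item())
```

In a coarse-to-fine setting, intermediate supervision would typically be applied to the stage-1 heatmaps and the coarse 3D output before refining the final pose; the sketch above only shows the forward pipeline.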
Keywords/Search Tags: 3D human pose recognition, convolutional neural network, semantic features, coarse-to-fine