| Human pose estimation, also known as human keypoint detection, aims to locate the joints of the human body in input images. It is widely used in fields such as human tracking, human motion recognition, and human-computer interaction. However, several problems remain, including complex postures, scale variation, occlusion, lighting changes, and the difficulty of deploying complex network models on mobile devices. This article focuses on improving detection accuracy and on making large models lightweight. The work is as follows: 1) For human pose estimation, H-HRNet (High precision HRNet) is designed on the basis of the High-Resolution Network (HRNet) to address the low keypoint detection accuracy caused by occlusion, lighting, and other factors. A feature fusion module incorporating an attention mechanism fuses the information extracted by each branch, making full use of each branch's channel and spatial features and learning the importance of the different branches, which improves keypoint detection accuracy. After the feature fusion module, a posture tuning module is added to correct the preceding predictions. In addition, a multi-level supervision mechanism alleviates the vanishing gradient problem and improves the network's predictive ability. A coordinate encoding and decoding strategy reduces the error introduced when keypoint positions are mapped from the heatmap back to the original image, further improving detection accuracy. H-HRNet was validated on two datasets, COCO and MPII, achieving accuracies of 76.1% and 90.6%, respectively. Compared with the baseline HRNet, accuracy improved by 1.7% and 0.4%, respectively, verifying the effectiveness of H-HRNet.
2) A lightweight human pose network, s-MSPN (student MSPN), is designed on the basis of the Multi-Stage Pose Network (MSPN) to address the high computational complexity and large parameter counts of existing human pose estimation models. To counter the accuracy loss caused by making the model lightweight, a dual-teacher knowledge distillation training method is used. It takes the H-HRNet proposed in Chapter 3 and the original MSPN as teacher models, and the simple, lightweight L-MSPN (Lightweight MSPN) as the student model. A pose distillation loss combining MSE and KL losses is designed to transfer as much of the teachers' learned knowledge as possible to the student, improving the student's learning ability. The proposed model was validated on COCO and MPII, achieving accuracies of 69.6% and 88.6%, respectively. The number of model parameters decreased from 56.85 M to 4.31 M, verifying the effectiveness of the proposed model. |
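
The abstract names the ingredients of the pose distillation loss (MSE and KL terms, two teachers, one student) but not its exact form. The following is a minimal sketch of one way such a loss could be assembled, not the formulation used in this work. The function names, the weights `alpha` and `beta`, the `temperature` parameter, and the choice to average over teachers are all assumptions made for illustration. Heatmaps are represented as flat lists of floats for a single keypoint.

```python
import math


def softmax(values, temperature=1.0):
    """Turn a flattened heatmap into a probability distribution."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mse(pred, target):
    """Mean squared error between two flattened heatmaps."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def pose_distillation_loss(student, teachers, ground_truth,
                           alpha=0.5, beta=0.5, temperature=1.0):
    """Hypothetical dual-teacher loss combining MSE and KL terms.

    student, ground_truth: flattened heatmaps for one keypoint.
    teachers: list of flattened teacher heatmaps (two in the dual-teacher case).
    alpha: assumed weight of the distillation term versus the supervised term.
    beta: assumed weight of the KL term inside the distillation term.
    """
    # Supervised term: the student is compared with the ground-truth heatmap.
    hard = mse(student, ground_truth)

    # Distillation term: the student is compared with each teacher, both
    # value-wise (MSE) and as a spatial distribution (KL).
    student_dist = softmax(student, temperature)
    soft = 0.0
    for teacher in teachers:
        teacher_dist = softmax(teacher, temperature)
        soft += mse(student, teacher) + beta * kl_divergence(teacher_dist, student_dist)
    soft /= len(teachers)  # assumption: both teachers contribute equally

    return (1 - alpha) * hard + alpha * soft


# Toy 4-pixel heatmaps for a single keypoint (made-up values).
ground_truth = [0.0, 0.0, 1.0, 0.0]
teacher_a = [0.05, 0.10, 0.80, 0.05]
teacher_b = [0.00, 0.20, 0.70, 0.10]
student = [0.10, 0.30, 0.40, 0.20]

loss = pose_distillation_loss(student, [teacher_a, teacher_b], ground_truth)
print(f"pose distillation loss: {loss:.4f}")
```

In this sketch the MSE terms match heatmap values directly, while the KL term compares the normalized spatial distributions, so the student is also pushed toward where each teacher concentrates its confidence. A real training setup would compute the same quantities over batched tensors for all keypoints.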