| Two-dimensional human pose estimation is a fundamental computer vision task. Its goal is to locate the keypoints of individuals (such as the eyes, elbows, hands, and feet) in images or videos and to describe the human skeleton through those keypoints. It has a wide range of applications, including human action recognition, human-computer interaction, intelligent photo editing, and pedestrian tracking, and has therefore become an active research topic in computer vision. Human pose estimation faces many challenges: people in an image vary in scale, appearance, and number, and the results are affected by factors such as illumination, object occlusion, and self-occlusion. In addition, existing methods suffer from complex network models, high computational requirements, and difficult deployment on edge devices. To address these problems, this paper carries out the following work. (1) A lightweight human pose estimation method with explicit anatomical keypoint structure constraints is proposed. The method adds a topological constraint term to the loss function, built from the distances and directions between keypoints and their deviation from the corresponding ground-truth values, which improves the accuracy of the original method. More importantly, the proposed module can be inserted into existing bottom-up or top-down human pose estimation methods to improve their performance. Extensive experiments on the COCO dataset show that the approach is superior to most existing bottom-up and top-down human pose estimation methods. In particular, inserting the module into Lite-HRNet improves its AP score by 2.9% on COCO val and 3.3% on test-dev. (2) A lightweight human pose estimation model based on dynamic multi-scale context fusion is proposed, which uses rich context information to capture feature relationships both within the same resolution and across different resolutions while reducing the amount of computation. The effectiveness of the proposed method was verified on the COCO val dataset, where it achieves an accuracy improvement of about 0.3% while reducing the number of parameters by about 0.5M. To test the proposed approach more fully, a multi-person human pose estimation system was designed and developed, providing three main functions: image processing, video processing, and manual annotation. Users do not need to learn the underlying principles of human pose estimation and can complete pose estimation on images and videos through a clear visual interface. |
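
The topological constraint term in contribution (1) can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the edge list `EDGES`, the weights `w_dist` and `w_dir`, the squared length difference, and the use of cosine similarity for direction are hypothetical choices, not the paper's definition.

```python
import numpy as np

# Hypothetical skeleton edges as (start, end) keypoint indices.
# The real edge set used in the paper is not stated in the abstract.
EDGES = [(0, 1), (1, 2), (2, 3)]


def structure_loss(pred, gt, edges=EDGES, w_dist=1.0, w_dir=1.0, eps=1e-8):
    """Illustrative structure-constraint term (not the paper's exact loss).

    pred, gt: arrays of shape (K, 2) holding 2D keypoint coordinates.
    Penalizes differences in limb length and limb direction between
    the predicted and ground-truth skeletons.
    """
    start, end = np.array(edges).T

    # Limb vectors between connected keypoints.
    v_pred = pred[end] - pred[start]
    v_gt = gt[end] - gt[start]

    # Distance term: squared difference of limb lengths.
    len_pred = np.linalg.norm(v_pred, axis=1)
    len_gt = np.linalg.norm(v_gt, axis=1)
    dist_term = np.mean((len_pred - len_gt) ** 2)

    # Direction term: 1 - cosine similarity of limb vectors.
    cos = np.sum(v_pred * v_gt, axis=1) / (len_pred * len_gt + eps)
    dir_term = np.mean(1.0 - cos)

    return w_dist * dist_term + w_dir * dir_term
```

In a setup like this, the term would be added to the usual keypoint regression or heatmap loss. Because it depends only on relative positions between keypoints, it is unchanged when the whole skeleton is translated, which is why it complements rather than replaces a per-keypoint loss.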