| In recent years, with the development and spread of computer vision technology, human pose estimation has been widely applied in fields such as gaming and entertainment, film and television production, and motion analysis. Many classic models exist for the human pose estimation task; they have achieved increasingly high accuracy in predicting human keypoints and remain stable in complex scenes. However, these models have complex structures and large numbers of parameters, consume a lot of runtime memory, and compute slowly, which makes them difficult to deploy in practice. In addition, current human pose estimation models assume high-resolution input, whereas images captured in real environments are often low-resolution, which greatly reduces the accuracy of these models. This paper therefore focuses on lightweight models and low-resolution human images. The main research results are as follows: (1) To address lightweight human pose estimation, this paper proposes a tree network model. The model uses a multi-tree structure to reuse local feature maps, which significantly reduces the number of convolution kernels in the convolution modules; it also uses depthwise separable convolution and channel pruning to further reduce the model's parameter count and computational cost. Experiments on a public human pose dataset show that, compared with existing models, the number of parameters is reduced by a factor of 20 to 30 and the computational cost by a factor of 10, while the model maintains comparable performance. (2) To address the accuracy of lightweight models, this paper introduces knowledge distillation and proposes a distillation-based training method with two variants: static training and dynamic training. The prediction results of the teacher model are used to form a new loss function that assists the training of the student network, which greatly improves the accuracy of the model. Experimental results on a public human pose dataset demonstrate the effectiveness of the training method. (3) To address low-resolution human pose estimation, this paper improves the proposed tree network model to form a human pose estimation model that accepts multi-resolution input. Through parameter sharing, combined with a multi-resolution loss function and a high-resolution auxiliary loss function, the model can estimate poses for human images of various resolutions without increasing the number of parameters or the amount of computation, and it effectively improves accuracy on low-resolution human pose estimation. |
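
The abstract names depthwise separable convolution as one of the techniques used to cut parameters. As a rough illustration of why it helps, the sketch below counts the weights of a standard convolution and of its depthwise separable replacement. The function names and the layer sizes (256 channels, 3 x 3 kernel) are hypothetical and are not taken from the paper; the savings reported in the abstract also come from feature reuse and channel pruning, which this sketch does not model.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 256, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For this assumed layer the separable version needs roughly one ninth of the weights, which shows that depthwise separable convolution alone would not account for the full reduction the abstract reports.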