A human pose estimation network detects and localizes the key points of the human body in an image, and it is a major focus of computer vision research. In recent years, benefiting from the rapid development of deep learning hardware and algorithms, human pose estimation networks have been widely used in pedestrian detection, human-computer interaction, and pedestrian re-identification. However, as key point detection accuracy has continued to improve, traditional networks have come to suffer from high computational complexity and large parameter counts. Lightweight network techniques offer a new way to address these problems. Based on lightweight techniques and the attention mechanism, this paper optimizes the network structure and reduces the number of parameters in order to build a lightweight, high-precision human pose estimation network that maintains detection accuracy. The main research contents of this paper are as follows:

(1) A human pose estimation network that combines the Sandglass module with an attention mechanism is proposed. First, in a network model based on HRNet, lightweight modules replace the standard convolution operation. While preserving the performance of the model, this reduces the computational complexity and the number of parameters during training and improves the feature learning ability of the convolutional neural network. Second, the attention mechanism is used to highlight the features of small-scale and occluded human key points in the input image, which improves the model's ability to detect all key points in the image. Finally, the experimental results are compared with those of currently popular network models on the same dataset. The results show that the proposed network maintains high recognition accuracy for occluded and small-scale key points while using less computation and fewer parameters.

(2) A human pose estimation network based on the inverted residual module and an attention mechanism is proposed. First, a multi-scale preprocessing module is introduced in the early stage of the network to strengthen early feature extraction. Second, the basic module of the network is improved by introducing the inverted residual module, and the scaling factor is adjusted to reduce the computational complexity and the number of parameters of the model. Third, an attention mechanism is introduced so that the model captures long-range dependencies and accurate positional information along the spatial directions while still acquiring information across feature channels, which alleviates the loss of spatial localization ability that cannot be recovered during upsampling. Finally, repeated training and ablation experiments are carried out on the training set to further optimize the network, and visualization experiments on the validation set verify that the improved model is robust in constructing correct human poses.

(3) Visualization experiments in real-world scenes are carried out. The two improved lightweight network models and HRNet are each applied to real scenes, and frames captured at the same moment from different videos serve as the visual results for the three network models. The effectiveness of the improved lightweight models in real scenes is verified in terms of key point detection accuracy and application feasibility.
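The parameter savings claimed in (1) and (2) can be illustrated with a small calculation. The sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, which is the kind of operation that Sandglass and inverted residual blocks are generally built from. The abstract does not give the exact layer layout, so the function names, the 64-channel width, and the 3 x 3 kernel are hypothetical values chosen only for illustration, not the configuration used in this paper.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1 x 1
    pointwise convolution (bias terms omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, assumed for illustration only.
c_in, c_out, k = 64, 64, 3

standard = conv_params(c_in, c_out, k)                  # 9 * 64 * 64 = 36864
separable = depthwise_separable_params(c_in, c_out, k)  # 576 + 4096 = 4672
ratio = separable / standard                            # equals 1/c_out + 1/k**2

print(f"standard: {standard}, separable: {separable}, ratio: {ratio:.3f}")
```

Under these assumed sizes the separable layer needs roughly one eighth of the weights of the standard layer. The actual reduction in the proposed networks depends on their channel widths and scaling factors, which the abstract does not state.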