
Human Motion Detection And Pose Estimation Based On Attention Mechanism

Posted on: 2022-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Cheng
GTID: 2518306563978029
Subject: Control Science and Engineering
Abstract/Summary:
Human motion detection and pose estimation is one of the most challenging research topics in deep learning and computer vision. Its goal is to detect the human body in an input image or video frame and to locate the key points of the human body. It is widely used in intelligent monitoring, human-computer interaction, action recognition, and other fields. This thesis proposes an attention mechanism based on content description features, a human motion detection algorithm based on the attention mechanism, and a human pose estimation algorithm based on the attention mechanism. The main contributions are as follows.

Firstly, the shallow content description features and the deep high-level semantic features in a convolutional neural network complement each other, but existing methods do not make full use of the shallow content description features. This thesis proposes an attention mechanism based on content description features. The attention mechanism models the content description features of the input image, extracts multi-scale content description features, and adaptively selects and fuses the multi-scale content description features from different branches. The fused multi-scale content description features then recalibrate the high-level semantic features output by the backbone network through a gating mechanism, selectively emphasizing the effective information in the high-level semantic features and suppressing useless information. The proposed attention mechanism achieves a Top-5 error rate of 5.47% on the ImageNet classification dataset and an average precision of 41.8% on the object detection task of the MS-COCO dataset.

Secondly, the backbone network of human motion detection algorithms usually uses a high-resolution feature representation generated by upsampling or deconvolution, which makes the predicted heatmap spatially inaccurate. We propose a weighted feature fusion high-resolution network. It assigns an extra learnable weight to the feature representation at each scale, so that the network learns the importance of the feature representations at different resolutions. In addition, non-maximum suppression uses the intersection over union (IoU) as the similarity measure between prediction boxes, which can easily delete correct prediction boxes and cause missed detections. This thesis therefore proposes a non-maximum suppression mechanism based on Manhattan distance, which uses the Manhattan distance between prediction boxes as the similarity measure. With these improvements, the proposed algorithm achieves 60.8% AP for human motion detection on the MS-COCO dataset.

Finally, to address the large number of parameters and the heavy computation of existing human pose estimation algorithms, a complementary feature pyramid network is proposed. The proposed feature fix bottleneck block is used in the network: through hierarchical connections, features with different receptive fields are fused within a single bottleneck block, which expands the receptive field range of each bottleneck block. To reduce redundant gradient information during network optimization and to build a lightweight network, cross stage partial connections are introduced into the complementary feature pyramid network. The proposed complementary feature fusion attention mechanism adaptively selects complementary information from different levels for fusion, so as to maximize the output of effective features in the network. With these improvements, the proposed algorithm achieves an AP of 72.7% on the human pose estimation task of the MS-COCO dataset.
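
The abstract does not say how the Manhattan distance between prediction boxes is computed or thresholded, so the following is only a minimal sketch of the general idea, not the thesis's actual method. It assumes boxes are given as (x1, y1, x2, y2) corners, takes the distance to be the sum of absolute differences of the four coordinates, and suppresses a box when it lies closer than a hypothetical `dist_threshold` to a higher-scoring box that was already kept. The names `manhattan_distance`, `manhattan_nms`, and `dist_threshold` are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]


def manhattan_distance(a: Box, b: Box) -> float:
    """L1 distance between two boxes, summed over the four corner coordinates.

    This particular definition is an assumption; the abstract only states
    that a Manhattan distance between prediction boxes is used.
    """
    return sum(abs(p - q) for p, q in zip(a, b))


def manhattan_nms(
    boxes: List[Box], scores: List[float], dist_threshold: float
) -> List[int]:
    """Greedy suppression using Manhattan distance instead of IoU.

    Boxes are visited from highest to lowest score. A box is kept only if
    its distance to every already-kept box is at least dist_threshold.
    Returns the indices of the kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(
            manhattan_distance(boxes[i], boxes[k]) >= dist_threshold
            for k in keep
        ):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidate_boxes = [
        (0.0, 0.0, 10.0, 10.0),    # highest score
        (1.0, 1.0, 11.0, 11.0),    # near-duplicate of the first box
        (50.0, 50.0, 60.0, 60.0),  # a separate detection
    ]
    candidate_scores = [0.9, 0.8, 0.7]
    print(manhattan_nms(candidate_boxes, candidate_scores, dist_threshold=10.0))
```

In this sketch the second box is 4 units from the first and is suppressed, while the third is 200 units away and is kept. A fixed pixel threshold ignores box size, so a practical variant would probably normalize the distance by box dimensions; that choice is also not specified in the abstract.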
Keywords/Search Tags:Deep Learning, Convolutional Neural Network, Attention Mechanism, Human Motion Detection, Human Pose Estimation