
Research And Implementation Of Human Pose Estimation Based On Multi-scale Fusion And Graph Convolution Network

Posted on: 2023-04-05    Degree: Master    Type: Thesis
Country: China    Candidate: S K Xiong    Full Text: PDF
GTID: 2568306914977479    Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology and mobile devices, 2D images captured by mobile devices have become an important source of information. Human pose estimation is one of the fundamental tasks of computer vision and a basic building block of vision applications such as motion estimation, action detection, human tracking, film and animation, and virtual reality. With the development of deep learning, the accuracy of human pose estimation has improved considerably. However, for 2D images of diverse scenes, how to estimate the pose of an occluded human body, how to estimate poses of people at different scales within an image, and how to further improve estimation accuracy remain the main challenges of 2D human pose estimation. In response to these problems, this thesis carries out extensive research and makes the following contributions:

(1) Existing multi-scale feature fusion models simply sum or concatenate features of different scales and cannot fuse them adaptively. To address this, this thesis proposes an attention mechanism based on multi-scale fusion that makes more effective use of both the high-level semantic information contained in deep features and the precise location information contained in high-resolution layers, and further lets the model use high-level semantics to guide the generation of low-level semantics. Estimating the pose of an occluded human body requires a large receptive field; high-level semantic features have a large receptive field but lack positional precision, so using high-level semantics to guide the low-level feature extractor exploits both levels more efficiently (a sketch of such a guided fusion block is given below). In addition, intra-layer fusion is added so that multi-scale features can also be extracted within finer layers. Experiments on two standard datasets show that the algorithm fully integrates multi-scale features and further improves the accuracy of human pose estimation.

(2) Although existing CNN-based human pose estimation models effectively exploit the large receptive field of high-level semantics and the high resolution of low-level semantics, the network does not explicitly learn global relationships between features, which limits its ability to model the relations between keypoints. To use the inherent relationships between human joints to improve estimation accuracy and to estimate occluded parts, this thesis proposes a human pose estimation algorithm based on GCN. The algorithm makes two main contributions: first, one-dimensional heatmaps are used as the training and inference labels, so the model can process one-dimensional sequences directly without losing position information; second, the one-dimensional sequence representation makes it possible to use structures such as GCN or Transformer to explicitly learn the global relationships between features (a sketch of this encoding and a graph-convolution layer is also given below). Experiments on the COCO dataset show that the proposed algorithm significantly improves accuracy over the backbone network.
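The abstract does not give the exact structure of the fusion block, so the following is only a minimal sketch of the idea in contribution (1): high-level semantic features gate the high-resolution, low-level features through channel attention before the two branches are fused. It is written in PyTorch; the layer sizes, the gating design, and the module name GuidedFusion are illustrative assumptions, not the thesis' actual architecture.

    # Minimal sketch of an attention-guided multi-scale fusion block
    # (all details below are assumptions for illustration only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GuidedFusion(nn.Module):
        """Fuse a low-resolution, high-level feature map with a high-resolution,
        low-level one, letting high-level semantics gate the low-level features
        instead of a plain sum or concatenation."""
        def __init__(self, high_ch: int, low_ch: int, out_ch: int):
            super().__init__()
            # Channel attention computed from the high-level (semantic) branch.
            self.attn = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(high_ch, low_ch, kernel_size=1),
                nn.Sigmoid(),
            )
            self.proj_high = nn.Conv2d(high_ch, out_ch, kernel_size=1)
            self.proj_low = nn.Conv2d(low_ch, out_ch, kernel_size=1)

        def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
            # Gate the precise, high-resolution features with semantic attention.
            gated_low = low * self.attn(high)
            # Upsample the semantic branch to the high-resolution grid and fuse.
            up_high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                                    align_corners=False)
            return self.proj_high(up_high) + self.proj_low(gated_low)

    # Example: 1/8-resolution semantic features guide 1/4-resolution features.
    fusion = GuidedFusion(high_ch=256, low_ch=64, out_ch=64)
    out = fusion(torch.randn(1, 256, 32, 24), torch.randn(1, 64, 64, 48))
    print(out.shape)  # torch.Size([1, 64, 64, 48])

The key difference from a plain sum or concatenation is that the low-level branch is re-weighted channel-wise by the semantic branch before fusion, which is one simple way to realise "high-level semantics guiding low-level feature extraction".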
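Likewise, the following sketch illustrates the two ideas in contribution (2): encoding a joint coordinate as a one-dimensional Gaussian heatmap, and a single graph-convolution layer that propagates per-joint features along the skeleton adjacency. The adjacency matrix, shapes, and layer sizes are illustrative assumptions, not the thesis' exact configuration.

    # Sketch of 1-D heatmap encoding plus one graph-convolution layer over the
    # joint graph (shapes and the toy adjacency are assumptions for illustration).
    import torch
    import torch.nn as nn

    def encode_1d_heatmap(coord: torch.Tensor, length: int, sigma: float = 2.0) -> torch.Tensor:
        """Encode scalar joint coordinates (B, J) as 1-D Gaussian heatmaps (B, J, length)."""
        bins = torch.arange(length, dtype=torch.float32)
        return torch.exp(-((bins - coord.unsqueeze(-1)) ** 2) / (2 * sigma ** 2))

    class JointGCNLayer(nn.Module):
        """One graph-convolution layer: propagate per-joint features along the
        normalized skeleton adjacency so every joint sees its neighbours."""
        def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
            super().__init__()
            a_hat = adj + torch.eye(adj.size(0))          # add self-loops
            deg = a_hat.sum(dim=1)
            d_inv_sqrt = torch.diag(deg.pow(-0.5))
            self.register_buffer("norm_adj", d_inv_sqrt @ a_hat @ d_inv_sqrt)
            self.linear = nn.Linear(in_dim, out_dim)

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, J, in_dim)
            return torch.relu(self.norm_adj @ self.linear(x))

    # Toy example: 3 joints connected in a chain, x-coordinate heatmaps of width 192.
    adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
    coords_x = torch.tensor([[40.0, 96.0, 150.0]])         # (B=1, J=3)
    heatmaps = encode_1d_heatmap(coords_x, length=192)     # (1, 3, 192)
    gcn = JointGCNLayer(in_dim=192, out_dim=192, adj=adj)
    refined = gcn(heatmaps)                                # (1, 3, 192)
    print(refined.shape)

Because each joint is represented by a one-dimensional sequence rather than a two-dimensional heatmap, such graph (or Transformer) layers can operate on the joints directly while the position information is kept inside the sequence.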
(3) A human pose estimation system based on multi-scale fusion and graph convolutional networks is designed and implemented. The system consists of three parts: the algorithm module, the back-end framework, and the front-end framework. It provides an efficient and concise user interface with fast response times, and can efficiently and accurately provide users with annotations of human body joints (a minimal sketch of such a back-end endpoint is given below).
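The abstract does not name the back-end technology, so the sketch below only illustrates the kind of interface implied in contribution (3), assuming a Flask service and a hypothetical estimate_pose wrapper around the trained model.

    # Minimal sketch of a back-end inference endpoint (Flask is an assumption;
    # estimate_pose is a hypothetical wrapper around the pose-estimation model).
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def estimate_pose(image_bytes: bytes) -> list:
        """Hypothetical model wrapper: returns [[x, y, score], ...] per joint."""
        # Placeholder: a real implementation would run the trained network here.
        return []

    @app.route("/keypoints", methods=["POST"])
    def keypoints():
        # The front end uploads an image; the back end returns joint annotations.
        image = request.files["image"].read()
        return jsonify({"joints": estimate_pose(image)})

    if __name__ == "__main__":
        app.run(port=8000)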
Keywords/Search Tags: human pose estimation, multi-scale fusion, attention mechanism, GCN, heatmap encoding