Research On Top-down Multi-person Pose Estimation Algorithm

Posted on: 2021-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: D X Chen
GTID: 2428330611499746
Subject: Computer technology
Abstract/Summary:
Human pose estimation is a fundamental research topic in computer vision and underpins related tasks such as action recognition and human tracking. It can be divided into single-person and multi-person tasks. Because images and videos captured in real application scenes often contain more than one person, multi-person pose estimation better meets the needs of these applications. In the multi-person setting, overlap and interference between human bodies in an image make the task complex, and correctly estimating the pose of every person in an image remains very challenging. Current multi-person algorithms fall mainly into two categories: bottom-up and top-down. Bottom-up methods first detect all human keypoints in the image and then cluster them into individual poses. Top-down methods first detect each human body, then crop each person from the image according to the human bounding box, and finally perform single-person pose estimation on the cropped image. This thesis studies top-down multi-person pose estimation based on deep learning and focuses on the single-person pose estimation stage. An image cropped from the full image according to a human bounding box often contains the same local regions (body parts) of other people, which are likely to cause confusion. Because the local information around a keypoint is relatively weak, global information is needed to help distinguish these local regions. This thesis uses a non-local module to give the neural network a global receptive field in its shallow layers so that it captures global information better. To enable the network to extract features efficiently, this thesis uses a dual attention module that makes the model focus on features highly relevant to multi-person pose estimation, obtain more detail about the targets that require attention, and suppress information unrelated to the current task. The human body has a definite structure, and the relationships between different keypoints differ: strongly correlated keypoints can provide useful information for detecting one another, whereas weakly correlated keypoints may interfere with one another. This thesis uses specific feature learning based on human body structure to model the interrelationships between keypoints effectively. Experiments on the COCO and MPII datasets verify the effectiveness and correctness of these improvements. On the COCO test set, the improved method reaches an average precision of 74.0, an improvement of 2.5 over the baseline model.
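The top-down flow described in the abstract (detect each person, crop by bounding box, run single-person estimation on the crop) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the thesis: the grayscale list-of-lists image, the `(x0, y0, x1, y1)` box format, the function names, and in particular `brightest_pixel`, a toy stand-in for a real single-person pose network. The bounding boxes are assumed to come from a separate person detector.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]      # (x0, y0, x1, y1); x1 and y1 are exclusive
Keypoint = Tuple[int, int]           # (x, y)


def crop(image: Image, box: Box) -> Image:
    """Cut the region inside `box` out of the full image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def brightest_pixel(patch: Image) -> List[Keypoint]:
    """Hypothetical stand-in for a single-person pose network.

    Returns a single 'keypoint': the (x, y) location of the largest
    value in the crop. A real estimator would return many keypoints.
    """
    best = max(
        ((value, x, y) for y, row in enumerate(patch) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return [(best[1], best[2])]


def top_down_poses(
    image: Image,
    boxes: List[Box],
    estimator: Callable[[Image], List[Keypoint]],
) -> List[List[Keypoint]]:
    """Run the single-person estimator on each crop, one pose per box."""
    poses = []
    for box in boxes:
        x0, y0, _, _ = box
        local_keypoints = estimator(crop(image, box))
        # Shift crop-local coordinates back into full-image coordinates.
        poses.append([(x + x0, y + y0) for x, y in local_keypoints])
    return poses
```

Because each crop is processed independently, any body parts of neighbouring people that fall inside a bounding box reach the estimator with no indication that they belong to someone else, which is the confusion the abstract attributes to cropped images.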
Keywords/Search Tags: top-down, non-local module, dual attention module, specific feature learning