
Multi-Person Pose Estimation Based On Deep Learning

Posted on: 2019-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: J R Fan
GTID: 2428330548977452
Subject: Computer Science and Technology
Abstract/Summary:
Multi-person pose estimation, which aims to locate the keypoints of each person in an image or video, is of great importance in computer vision. Human pose estimation serves as the basis of action recognition, person re-identification, clothing analysis, human-computer interaction, and related tasks. It also has wide applications in fields including video surveillance, self-service stores, autonomous driving, motion-sensing games, and virtual reality, and it can inform other keypoint localization tasks as well.

In recent years, deep learning methods, especially convolutional neural networks (CNNs), have achieved great success in various areas of computer vision, such as image classification, semantic segmentation, and object detection. More and more attention is being paid to deep learning models with powerful learning capacity. In this thesis, we propose two deep learning methods for multi-person pose estimation, one for images and one for videos.

1. For multi-person pose estimation in images, other people inside the bounding box are a source of confusion and may be mixed up with the main person in front, which is a crucial problem for top-down methods. To address this problem, we propose an attention and instance segmentation network that extracts better features for multi-person pose estimation, making the model concentrate on the important area and ignore interfering information. Competitive experimental results on the COCO benchmark demonstrate the effectiveness of our model.

2. For multi-person pose estimation in videos, temporal and spatial associations can provide helpful cues for predicting the positions of human pose keypoints. We therefore propose a spatio-temporal ConvLSTM (Convolutional Long Short-Term Memory) to refine the keypoint heatmaps in unconstrained videos. Moreover, optical flow and part affinity fields are incorporated separately to provide additional temporal and spatial relationships between keypoints. State-of-the-art results on the PoseTrack dataset show the effectiveness of our method.
Keywords/Search Tags:multi-person pose estimation, unconstrained video, deep learning, CNNs, attention mechanism, instance segmentation, spatio-temporal ConvLSTM, optical flow, part affinity fields
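
The interference problem described in item 1 of the abstract can be illustrated with a toy example. The following is only a minimal sketch, not the network proposed in the thesis: it assumes a keypoint is read off as the peak of a 2D heatmap, and it uses a hand-written soft mask in place of the learned attention and instance segmentation output. The names `apply_mask` and `decode_keypoint`, and all numeric values, are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def apply_mask(heatmap: Heatmap, mask: Heatmap) -> Heatmap:
    """Element-wise product of a heatmap and a soft mask with values in [0, 1]."""
    return [[h * m for h, m in zip(h_row, m_row)]
            for h_row, m_row in zip(heatmap, mask)]


def decode_keypoint(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (row, col, score) of the highest-scoring heatmap cell."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


# Toy 3x4 heatmap for one keypoint type inside a person bounding box.
# Assumption: the strongest peak (0.9, top right) comes from a background
# person, and the weaker peak (0.7, centre left) from the main person.
heatmap = [
    [0.0, 0.1, 0.0, 0.9],
    [0.1, 0.7, 0.1, 0.2],
    [0.0, 0.1, 0.0, 0.0],
]

# Hypothetical soft mask: close to 1 on the main person, close to 0 elsewhere.
# In a learned model this would be predicted, not written by hand.
mask = [
    [1.0, 1.0, 0.5, 0.1],
    [1.0, 1.0, 0.5, 0.1],
    [1.0, 1.0, 0.5, 0.1],
]

raw = decode_keypoint(heatmap)                      # peak on the wrong person
masked = decode_keypoint(apply_mask(heatmap, mask)) # peak on the main person
print("without mask:", raw[:2], "with mask:", masked[:2])
```

Without the mask, the decoded location falls on the background person; with it, the peak moves to the main person. This is the effect the abstract attributes to concentrating on the important area, shown here under the simplifying assumptions stated above.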