Multi-Person Pose Estimation Based On Deep Learning

Posted on:2019-04-11

Degree:Master

Type:Thesis

Country:China

Candidate:J R Fan

GTID:2428330548977452

Subject:Computer Science and Technology

Abstract/Summary:

Multi-person pose estimation, which aims to locate the keypoints of each person in an image or video, is an important problem in computer vision. Human pose estimation serves as the basis of action recognition, person re-identification, costume analysis, human-computer interaction, etc. It also has wide applications in fields including video surveillance, self-service stores, autonomous driving, motion-sensing games, and virtual reality, and it can inform other keypoint localization tasks. In recent years, deep learning methods, especially convolutional neural networks (CNNs), have achieved great success in various areas of computer vision, for example image classification, semantic segmentation, and object detection, and deep learning models with powerful learning capacity are attracting more and more attention. In this thesis, we propose two deep learning methods for multi-person pose estimation, one for images and one for videos.

1. For multi-person pose estimation in images, other people inside the bounding box are often confused with the main person in the foreground, which is a crucial problem for top-down methods. To address this problem, we propose an attention and instance segmentation network that extracts better features for multi-person pose estimation, making the model concentrate on the important area and ignore interfering information. Competitive experimental results on the COCO benchmark demonstrate the effectiveness of our model.

2. For multi-person pose estimation in videos, temporal and spatial associations can provide helpful cues for predicting the positions of human pose keypoints. We therefore propose a spatio-temporal ConvLSTM (Convolutional Long Short-Term Memory) to refine the heatmaps of keypoints in unconstrained videos. Moreover, optical flow and part affinity fields are combined separately to provide additional temporal and spatial relationships between keypoints. State-of-the-art results on the PoseTrack dataset show the effectiveness of our method.
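As a rough illustration of the interference problem in top-down methods, the sketch below decodes a keypoint from a toy heatmap before and after weighting it with a soft mask that favors the main person. This is not the thesis's network: the abstract describes attention and instance segmentation as a way to extract better features, while the element-wise masking of a heatmap, the function names `apply_mask` and `decode_keypoint`, and all numbers here are assumptions chosen only to make the idea concrete.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def apply_mask(heatmap: Heatmap, mask: Heatmap) -> Heatmap:
    """Element-wise product of a heatmap and a soft mask with values in [0, 1]."""
    return [
        [h * m for h, m in zip(h_row, m_row)]
        for h_row, m_row in zip(heatmap, mask)
    ]


def decode_keypoint(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (row, col, score) of the strongest response in the heatmap."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


# Toy 4x4 heatmap for one keypoint type inside a person bounding box.
# The peak at (1, 1) belongs to the main person; the stronger peak at
# (2, 3) belongs to another person who overlaps the box (made-up values).
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.9],
    [0.0, 0.0, 0.1, 0.2],
]

# Hypothetical soft mask: close to 1 over the main person (left side),
# close to 0 over the other person (right side).
mask = [
    [1.0, 1.0, 0.5, 0.1],
    [1.0, 1.0, 0.5, 0.1],
    [1.0, 1.0, 0.5, 0.1],
    [1.0, 1.0, 0.5, 0.1],
]

raw = decode_keypoint(heatmap)
masked = decode_keypoint(apply_mask(heatmap, mask))
print("without mask:", raw[:2])
print("with mask:   ", masked[:2])
```

Without the mask, the decoded location follows the other person's stronger response; with the mask, it falls back on the main person. In a learned model the mask would be predicted by the network rather than written by hand.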

Keywords/Search Tags:

multi-person pose estimation, unconstrained video, deep learning, CNNs, attention mechanism, instance segmentation, spatio-temporal ConvLSTM, optical flow, part affinity fields

Related items

1	Human Action Recognition Via Dual Spatio-temporal Network Flow And Attention Mechanism Fusion
2	Video Action Recognition Based On 2D Convolution Network Under Spatio-Temporal Feature Enhancement Mechanism
3	Spatio-temporal Attention Model For Video Captioning
4	Research On Video Multi-object Segmentation Algorithm Based On Multi-temporal And Multi-level Attention Network
5	Study On Motion Estimation And Moving Object Segmentation In Object Based Video Applications
6	Research On Human Pose Estimation Algorithm Based On Deep Learning
7	Research On Person Re-Identification Based On Deep Learning
8	Real-time Human Posture Estimation System Based On Deep Learning Study
9	Research On Algorithm Of Human Action Recognition Based On Video
10	Research On Video Behavior Classification Technology Based On Spatio-Temporal Features