
Research On Action Recognition In Video-Skeleton Sequences Based On Deep Learning

Posted on: 2020-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: J Wu
Full Text: PDF
GTID: 2428330590474645
Subject: Mechanical and electrical engineering
Abstract/Summary:
Human action is a direct and effective modality in vision-based human-robot interaction. However, human action is a complex three-dimensional signal, and efficient, stable recognition in complex scenes remains difficult. To address the action recognition problem, this thesis extracts spatiotemporal action features from videos, from human skeleton sequences, and from their fusion, and uses convolutional neural networks for classification. The main research content of this thesis includes the following aspects:

(1) A video-based two-stream CNN algorithm for action recognition. To address the slow computation of dense optical flow in existing two-stream CNNs, an end-to-end model is proposed for both training and recognition. It contains two streams, a spatial stream and a global temporal stream, to characterize and recognize actions. Based on MobileNetV2, the spatial stream learns features from action images and the global temporal stream learns features from Energy Motion History Images (EMHI); the two streams are then fused. Finally, a multi-frame fusion method is used to improve accuracy.

(2) A skeleton-based action recognition algorithm with convolutional neural networks. Video-based CNNs are less robust to scene changes and cannot recognize actions at night, so a real-time action recognition system based on skeleton sequences is proposed. A view-invariant transformation is first applied to the human skeleton sequence to eliminate the influence of viewpoint. The sequence is then encoded into RGB space, preserving the original spatial structure and temporal dynamics. Finally, a lightweight CNN is designed to classify the encoded RGB image.

(3) A multi-data temporal action detection algorithm. The temporal action detection (TAD) problem is innovatively transformed into a one-dimensional object detection problem, and a two-stream network based on YOLO is proposed. The input of the network combines the video and skeleton sequences from a Kinect sensor. In the video stream, a C3D feature extractor extracts high-dimensional features from short-term video clips. In the skeleton stream, the view-invariant transformation is applied to the skeleton sequence. The high-dimensional features of the two streams are encoded as input to the two-stream object detection network, and two methods are designed to fuse them.
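The abstract does not give the exact encoding used in aspect (2). As an illustration only, the following is a minimal sketch of one common way a skeleton sequence can be mapped into an RGB image for a CNN: joints become image rows (spatial structure), frames become columns (temporal dynamics), and the three coordinate axes map to the R, G, B channels. The function name `encode_skeleton_to_rgb` and the per-sequence min-max normalization are assumptions, not the thesis's actual method.

```python
import numpy as np

def encode_skeleton_to_rgb(skeleton, out_dtype=np.uint8):
    """Encode a skeleton sequence as an RGB image.

    skeleton: array of shape (T, J, 3) -- T frames, J joints,
    (x, y, z) coordinates, assumed already view-normalized.
    Returns an image of shape (J, T, 3): rows are joints, columns
    are frames, and x/y/z map to the R/G/B channels.
    """
    skeleton = np.asarray(skeleton, dtype=np.float64)
    # Scale each coordinate axis to [0, 255] over the whole sequence
    # so relative motion between frames is preserved.
    lo = skeleton.min(axis=(0, 1), keepdims=True)
    hi = skeleton.max(axis=(0, 1), keepdims=True)
    scaled = (skeleton - lo) / np.maximum(hi - lo, 1e-8) * 255.0
    # Transpose (T, J, 3) -> (J, T, 3): joints as rows, frames as columns.
    return scaled.transpose(1, 0, 2).astype(out_dtype)
```

The resulting fixed-layout image can then be fed to a lightweight image-classification CNN, which is the general idea behind encoding both spatial structure and temporal dynamics into RGB space.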
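Casting temporal action detection as one-dimensional object detection, as in aspect (3), means that detections are (start, end) segments on the time axis instead of 2-D boxes, and the box IoU used for matching and non-maximum suppression reduces to a 1-D overlap measure. A minimal sketch of that measure (the function name `temporal_iou` is an assumption for illustration, not from the thesis):

```python
def temporal_iou(a, b):
    """Intersection-over-union of two temporal segments.

    Each segment is a (start, end) pair, in frames or seconds.
    This is the 1-D analogue of the 2-D box IoU used by detectors
    such as YOLO for matching predictions to ground truth.
    """
    start = max(a[0], b[0])          # latest start
    end = min(a[1], b[1])            # earliest end
    inter = max(0.0, end - start)    # overlap length (0 if disjoint)
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0
```

For example, segments (0, 10) and (5, 15) overlap for 5 units over a union of 15, giving an IoU of 1/3.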
Keywords/Search Tags: Action Recognition, Convolutional Neural Networks, Temporal Action Detection, Object Detection