Research On Spatio-Temporal Feature Based Human Action Recognition

Posted on:2021-10-12

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Miao

Full Text:PDF

GTID:2518306308968769

Subject:Information and Communication Engineering

Abstract/Summary:

With the rapid development of artificial intelligence, computer vision has drawn increasing attention because of its wide range of applications. Human action recognition, a research hotspot in computer vision, uses computers in place of human eyes to recognize human actions in videos. It is applied in many scenarios, including video understanding, human-computer interaction, and intelligent surveillance. Human action recognition mostly relies on RGB videos collected by 2D cameras and skeleton data captured by 3D cameras, both of which contain rich spatio-temporal features. Designing a model that extracts these spatio-temporal features sufficiently and accurately is therefore key to improving recognition accuracy. The specific contents of the thesis are as follows:

First, to address the inaccurate spatial feature extraction caused by complex backgrounds and moving cameras in RGB video, the thesis proposes a Pose Mask Spatio-temporal Network (PM-STN). For spatial feature extraction, PM-STN fuses a Pose Mask with the original image to focus on the key spatial features of the human body, which improves the accuracy of the network's feature extraction. For temporal feature extraction, the effects of different temporal network structures on the Pose Mask are studied, and an architecture combining a Convolutional Neural Network and Long Short-Term Memory is designed to fully exploit its spatio-temporal feature extraction ability. Experimental results on multiple benchmarks show that PM-STN achieves state-of-the-art performance in human action recognition.

Second, existing 3D skeleton spatio-temporal feature extraction methods are limited to local feature extraction, which leaves them without sufficient high-level feature representation ability. To solve this problem, the thesis proposes a temporal-aware graph convolution network. For spatial feature extraction, the network's ability to extract high-level spatial features is enhanced through an improved global human topology representation. For temporal feature extraction, the thesis introduces a global memory unit that expands the receptive field and selectively extracts temporal features from skeleton sequences, compensating for the deficiency in high-level feature extraction. Experiments conducted on an open dataset show that the method achieves higher accuracy than state-of-the-art methods.
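
As a rough illustration of the spatial graph convolution that skeleton-based methods of this kind build on, the sketch below applies one standard normalized-adjacency graph convolution to each frame of a skeleton sequence. It is not the thesis's temporal-aware network: the improved global topology and the global memory unit are not reproduced, and the 5-joint skeleton, the function names, and the weights are all hypothetical, chosen only to keep the example self-contained.

```python
import math

# Hypothetical 5-joint toy skeleton (an assumption, not the thesis's layout):
# 0 = torso, 1 = head, 2 = left hand, 3 = right hand, 4 = feet
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} as a nested list."""
    # Start from the identity so every joint keeps its own features.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def graph_conv(features, adj, weight):
    """One spatial graph convolution: ReLU(adj @ features @ weight)."""
    out = matmul(matmul(adj, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def spatial_features(sequence, adj, weight):
    """Apply the same graph convolution to every frame of a sequence."""
    return [graph_conv(frame, adj, weight) for frame in sequence]
```

Each frame here is a list of per-joint feature vectors (for example 3D coordinates), and `weight` maps the input features to output channels. Because the adjacency only links directly connected joints, a single layer mixes information locally, which is the limitation the abstract says the thesis's global topology representation is meant to address.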

Keywords/Search Tags:

human action recognition, spatio-temporal feature, graph convolutional network, long short-term memory

Related items

1	Research On Human Action Recognition Based On Spatio-temporal Graph Convolutional Neural Network
2	Research On Human Behavior Recognition Method Based On Graph Convolutional Networks
3	Research On Video Action Recognition Algorithm Based On Spatio-Temporal Features With 2D Convolutional Neural Networks Framework
4	Research On Human Action Recognition Based On Skeleton Features
5	Human Skeletal Action Recognition Based On Deep Learning
6	Research On Human Skeleton Action Recognition Method Based On Graph Convolutional Network
7	Research On Video Action Recognition Based On Improved Long Short-term Memory Network
8	Video Action Recognition Based On Multi-Stream Network Architecture
9	Research On Human Action Recognition Algorithm Based On Spatio-temporal Graph Convolutional Network
10	Research On Human Action Recognition Method Based On Improved Three-stream Network