Research On Human Action Recognition Algorithms Based On Graph Neural Network

Posted on: 2021-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: G W Zhang
Full Text: PDF
GTID: 2428330611465347
Subject: Signal and information processing
Abstract/Summary:
Human action recognition is an important and challenging topic in computer vision. Its application scenarios include human-computer interaction, security monitoring, autonomous driving, and intelligent assistance. Traditional human action recognition is mainly based on RGB video, which is computationally expensive, easily affected by lighting conditions, and sensitive to background noise. Thanks to the maturity of human detection algorithms, more and more research has focused on skeleton-based human action recognition. However, the non-grid structure of the joints and their high degrees of freedom create two difficulties: first, it is hard to construct a suitable feature extraction structure over the space of skeleton joints; second, it is hard to capture the complex spatiotemporal interaction patterns between joints. This thesis therefore explores the task of skeleton-based human action recognition.

To address the many joints, high degrees of freedom, and difficulty of learning in skeleton-based action recognition, this thesis proposes a multi-head attention mechanism in the spatial domain of the skeleton joints, together with two optimization constraints that guide multi-head attention learning: a spatial differentiation constraint and a local continuity constraint. Experimental analysis and visualization of the multi-head attention show that the proposed method effectively captures the joints most relevant to an action. In addition, we exploit second-order features of the skeleton sequence so that the model better captures the multi-angle features of an action.

For skeleton-based action sequences with complex action categories and large differences in sequence length, existing methods based on temporal convolution have difficulty capturing the long-term dependencies of an action. This thesis proposes a long short-term memory (LSTM) neural network with embedded graph convolution. The LSTM is a commonly used sequence model; compared with a plain RNN, it better mitigates the problem of vanishing or exploding gradients when learning long sequences. Embedding graph convolution in the LSTM lets the model learn the long-term and short-term spatiotemporal interaction features of action sequences simultaneously. We also propose two bidirectional graph-convolution-embedded LSTM structures that allow the model to learn more complex spatiotemporal features. Comparative experiments show that the proposed model achieves better results than models based on temporal convolution.

Finally, we propose a multi-task learning framework based on motion prediction. In human motion analysis, action recognition and motion prediction are often studied as two independent tasks. In this thesis, the two strongly correlated tasks are learned together, and the action prediction branch guides the action recognition model to focus on more action details. The experimental results show that introducing the action prediction branch further improves the model's results.
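
The idea of embedding graph convolution in an LSTM, as described above, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract does not give the gate equations, so the sketch assumes the common formulation in which each dense gate transform is replaced by a graph convolution of the form `A_hat @ X @ W`, with `A_hat` a symmetrically normalized adjacency matrix. The class `GraphConvLSTMCell`, the helper names, the 5-joint chain skeleton, and all dimensions are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GraphConvLSTMCell:
    """LSTM cell whose gate transforms are graph convolutions (assumed form)."""

    def __init__(self, a_hat, in_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.a_hat = a_hat
        self.hidden_dim = hidden_dim
        # One weight matrix per source; each produces all four gates at once.
        self.w_x = rng.normal(0.0, 0.1, (in_dim, 4 * hidden_dim))
        self.w_h = rng.normal(0.0, 0.1, (hidden_dim, 4 * hidden_dim))
        self.b = np.zeros(4 * hidden_dim)

    def step(self, x, h, c):
        # Graph convolution: aggregate over neighbouring joints, then project.
        gates = self.a_hat @ x @ self.w_x + self.a_hat @ h @ self.w_h + self.b
        i, f, o, g = np.split(gates, 4, axis=-1)
        c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_new = sigmoid(o) * np.tanh(c_new)
        return h_new, c_new


def run_sequence(cell, frames):
    """Run the cell over frames of shape (time, joints, in_dim)."""
    num_joints = frames.shape[1]
    h = np.zeros((num_joints, cell.hidden_dim))
    c = np.zeros_like(h)
    for x in frames:
        h, c = cell.step(x, h, c)
    return h  # per-joint features after the last frame


# Toy example: a 5-joint chain skeleton with 3D coordinates over 10 frames.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(edges, num_joints=5)
cell = GraphConvLSTMCell(a_hat, in_dim=3, hidden_dim=8)
frames = np.random.default_rng(1).normal(size=(10, 5, 3))
features = run_sequence(cell, frames)
print(features.shape)  # (5, 8)
```

In this sketch the spatial mixing happens inside every gate, so each joint's memory update depends on its neighbours at every time step. A bidirectional variant could run a second cell over `frames[::-1]` and concatenate the two outputs, though the two bidirectional structures proposed in the thesis are not specified in the abstract.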
Keywords/Search Tags: human action recognition, graph convolutional network, multi-head attention, embedded graph convolutional LSTM, multi-task learning