Font Size: a A A

Behavior Recognition Based On Deep Neural Network

Posted on:2022-02-09Degree:MasterType:Thesis
Country:ChinaCandidate:X Q DingFull Text:PDF
GTID:2518306533494464Subject:Electronic information
Abstract/Summary:PDF Full Text Request
In recent years,with the improvement of social informatization,video action recognition,as one of the representative tasks of computer vision,has been paid more and more attention by researchers because of its wide application prospects in the fields of intelligent surveillance,automatic driving,media analysis and robotics.At the same time,in the context of the vigorous development of deep learning and other technologies,a large number of researches on video action recognition based on deep neural networks have emerged.Although the predecessors have completed a lot of research work,there are still a lot of challenges.First of all,due to the connection between human action in video temporal and scene changes,how to make full use of the temporal context information and efficient temporal modeling of video is very important.Secondly,the training and inference calculation costs of deep action recognition models are very high at this stage,which hinders their further development and application.For example,a large number of depth models based on optical flow modeling are limited in their application in real scenes due to the high computational cost and poor real-time performance.In response to the above problems,this paper mainly studies action recognition based on deep neural networks,and improves and improves the existing two-stream convolutional network in terms of temporal relationship modeling and processing speed.The main contributions are as follows:(1)A spatiotemporal heterogeneous two-stream network structure is proposed,and Res Net and BN-Inception are used as spatiotemporal feature extraction networks to extract richer motion and appearance features.At the same time,spatial attention is introduced in the spatial network to enhance the salient features of the target and improve the efficiency of data expansion.In addition,the sparse temporal sampling strategy of TSN is used to perform longterm video modeling,so as to extract more spatio-temporal features and further improve the recognition accuracy.This paper verifies the effectiveness of the model on two general humanbased scene action recognition databases.(2)Propose a temporal distillation strategy and constraint temporal graph fusion reasoning algorithm.The first step is to use temporal distillation to capture short-term dynamics without the need to extract expensive optical flow.It runs fast in the test phase and can well meet the requirements of real-time applications.The second step is to design a constraint temporal graph fusion reasoning module to capture long-term temporal information.It introduces sparse constraints to the learned diagram,has a certain ability to highlight the discriminant moment,can calculate the temporal interaction between frames,and provide a good foundation for accurate action recognition.At the same time,a two-level action recognition framework is constructed to gradually capture short-term information and long-term dependence.This paper verifies the effectiveness of the model on temporal-related datasets.
Keywords/Search Tags:Deep learning, Action recognition, Knowledge distillation, Graph convolution network, Spatio-temporal modeling
PDF Full Text Request
Related items