The Research On Video Action Recognition Based On Lightweight 3D Convolutional Neural Network

Posted on:2021-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y Chen

GTID:2518306104988379

Subject:Computer application technology

Abstract/Summary:

With the spread of the Internet and the development of smart cities, video resources are becoming increasingly abundant. Video action recognition has attracted wide attention, with application scenarios including video surveillance, video auditing and intelligent security. The latest research trend is to use 3D convolutional neural networks for video action recognition. However, the extra temporal dimension greatly increases the computational load of the model, which makes it difficult to deploy on terminal devices. In addition, video action recognition is more complicated than image recognition: the extracted action features must be coherent over the whole clip and salient, matching the intuitive character of video actions. To enable video action recognition in resource-constrained scenarios, the following innovative work was completed:

(1) A 3D Xwise Separable Convolution is designed, and a lightweight 3D convolutional neural network, Xwise Net, is constructed from it. The main innovation is that the 3D Xwise Separable Convolution builds on the idea of separable convolution: it is a lightweight 3D convolution that extracts features independently along the channel, temporal and spatial dimensions of the video. Combining the 3D Xwise Separable Convolution with an efficient backbone network framework yields the lightweight 3D convolutional neural network Xwise Net.

(2) To meet the need for temporal global information, Xwise Net is optimized using temporal global context. Specifically, a temporal global information module, the TGC Block, is built and combined with Xwise Net to obtain TGC-Xwise Net, which can establish global dependencies and capture the overall action state and key action points.

Extensive experiments on three classic datasets (Kinetics-part A, Kinetics-part B, KTH) validate the effectiveness of the proposed algorithm in terms of both light weight and high accuracy. On the three datasets, compared with most mainstream models at equivalent accuracy, the number of parameters is reduced by more than 54.42% and the amount of computation by more than 36.29%. Xwise Net optimized with temporal global context improves accuracy on Kinetics-part A by 4.8%.
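
To give a rough sense of why separating the channel, temporal and spatial dimensions makes a 3D convolution lighter, the following is a minimal sketch that counts weights for a single layer. The factorization shown (a 1x1x1 pointwise convolution, then a depthwise temporal convolution, then a depthwise spatial convolution) is an assumption made for illustration; the abstract does not specify the exact structure of the 3D Xwise Separable Convolution. The function names and layer sizes are hypothetical.

```python
# Illustrative weight counts for one layer (biases omitted).
# ASSUMPTION: this three-step factorization is a generic separable design,
# not necessarily the layer used in Xwise Net.

def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weights in a standard 3D convolution with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw


def separable3d_params(c_in, c_out, kt, kh, kw):
    """Weights when channel, temporal and spatial mixing are done separately."""
    pointwise = c_in * c_out      # 1x1x1 convolution: mixes channels only
    temporal = c_out * kt         # depthwise kt x 1 x 1: one temporal filter per channel
    spatial = c_out * kh * kw     # depthwise 1 x kh x kw: one spatial filter per channel
    return pointwise + temporal + spatial


# Hypothetical layer: 64 input channels, 128 output channels, 3x3x3 kernel.
standard = conv3d_params(64, 128, 3, 3, 3)
factorized = separable3d_params(64, 128, 3, 3, 3)
reduction = 1 - factorized / standard

print(f"standard:   {standard} weights")
print(f"factorized: {factorized} weights")
print(f"reduction:  {reduction:.1%}")
```

This per-layer figure is not comparable to the whole-model reductions reported in the abstract, which also depend on the backbone, the classifier and the baseline models used for comparison.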

Keywords/Search Tags:

Lightweight, Deep learning, 3D convolutional neural network, Temporal global context, Action recognition

Related items

1	Research On Temporal Action Detection Based On Relation Aware And Global Context
2	Temporal Action Localization And Action Recognition Based On Deep Learning
3	Human Skeletal Action Recognition Based On Deep Learning
4	Research On Temporal Action Detection And Action Recognition Based On Deep Learning
5	Action Recognition Method Based On Sparse Auto-Combination Spatio-Temporal Convolutional Neural Network And Its MapReduce Implementation
6	Deep Feature Modeling For Human Action Recognition And Detection
7	The Research Of Video Action Recognition Algorithm Based On Deep Learning
8	Research On Behavior Recognition Method Based On Lightweight And Global Frequency Domain Poolin
9	Action Recognition Method Based On Multi-frequency Spatio-temporal Feature Learning
10	A Research Of Human Action Recognition Based On Deep Learning