Human Action Recognition Method Based On DenseNet And Multi-Scale Temporal Information

Posted on:2021-10-10

Degree:Master

Type:Thesis

Country:China

Candidate:L Jin

Full Text:PDF

GTID:2568306632966899

Subject:Control engineering

Abstract/Summary:

In recent years, with the rapid development of computer vision, technologies such as object detection, position estimation, face recognition, and action recognition have made great progress and have gradually been applied in daily life, making it more convenient. Among them, action recognition has especially broad application value: it plays an important role in intelligent monitoring, human-computer interaction, video retrieval, automatic identification and alarm, public safety, and many other fields. Because human behavior in video is complex, and because of problems such as background interference and camera shake, improving the accuracy of human action recognition in video remains a great challenge. This thesis studies human action recognition algorithms in depth. The main work is as follows.

First, we explore the application of 2D and 3D convolution to human action recognition, using the two-stream convolutional network and the C3D convolutional network respectively, and verify the performance of both networks on the UCF101 dataset.

Second, because computing and storing optical flow features consumes too many resources, and because the C3D network has too few layers, we combine 3D convolution with 2D convolution to form a hybrid convolutional network that improves on the performance of C3D.

Third, to obtain deep features of the video, we follow the DenseNet structure and build dense blocks that establish dense connections between layers. This achieves feature reuse and improves the efficiency of feature extraction while also deepening the network. The nonlinear transformation inside each dense block uses the hybrid convolution method, which improves the 3D convolutional layers' ability to extract temporal information.

Finally, considering that the motion of the people in a video is not evenly distributed over its duration, we follow the Inception structure and add 3D convolution kernels with different temporal depths to the transition layer, so that their convolutions run in parallel. This design simulates a 3D convolutional layer with a variable temporal depth. It can model sequences of video frames over short, medium, and long time spans, so the network can capture important temporal information that a fixed temporal depth would miss. We call the modified layer the multi-scale temporal transition layer. When it replaces the original transition layers in the DenseNet-based deep hybrid convolutional network, the depth of the network does not increase, its width increases, and the recognition accuracy improves significantly. In a comparison with current human action recognition methods, the method proposed in this thesis performs best among the methods compared.
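
The two structural ideas in the abstract, dense connectivity and parallel temporal branches of different depths, can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. The function names, the temporal depths (1, 3, 5), and the uniform averaging kernel that stands in for learned 3D convolution weights are all assumptions made for illustration, and the spatial dimensions are collapsed so that each frame is a plain feature vector.

```python
from typing import List, Sequence

Frame = List[float]  # one feature vector per frame (spatial dims collapsed)


def dense_block_input_channels(in_channels: int, growth_rate: int,
                               num_layers: int) -> List[int]:
    """Channels seen by each layer of a dense block.

    With dense connectivity, layer i receives the block input plus the
    outputs of all i earlier layers, each adding `growth_rate` channels.
    """
    return [in_channels + i * growth_rate for i in range(num_layers)]


def temporal_conv(frames: Sequence[Frame], depth: int) -> List[Frame]:
    """Average each channel over a centered window of `depth` frames.

    Zero padding keeps the temporal length unchanged. The uniform kernel
    is a stand-in for learned weights (illustrative assumption).
    """
    if depth % 2 == 0:
        raise ValueError("depth must be odd so the window can be centered")
    half = depth // 2
    num_frames = len(frames)
    num_channels = len(frames[0])
    out: List[Frame] = []
    for t in range(num_frames):
        row: Frame = []
        for c in range(num_channels):
            total = 0.0
            for dt in range(-half, half + 1):
                if 0 <= t + dt < num_frames:  # frames outside are zero padding
                    total += frames[t + dt][c]
            row.append(total / depth)
        out.append(row)
    return out


def multi_scale_temporal(frames: Sequence[Frame],
                         depths: Sequence[int] = (1, 3, 5)) -> List[Frame]:
    """Apply branches of different temporal depth and concatenate channels.

    The depths are hypothetical; the abstract only says short, medium,
    and long time spans are covered.
    """
    branches = [temporal_conv(frames, d) for d in depths]
    merged: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for branch in branches:
            row.extend(branch[t])  # concatenation widens, does not deepen
        merged.append(row)
    return merged


if __name__ == "__main__":
    # A single burst of motion at frame 2 in a 5-frame, 1-channel clip.
    clip = [[0.0], [0.0], [15.0], [0.0], [0.0]]
    for t, row in enumerate(multi_scale_temporal(clip)):
        print(t, row)
    print(dense_block_input_channels(64, 32, 4))
```

In this sketch the short branch responds only at the frame where the burst occurs, while the longer branches spread the response to neighbouring frames. Because the branches are concatenated rather than stacked, the layer becomes wider without adding depth, which matches the abstract's description.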

Keywords/Search Tags:

Human action recognition, 3D convolutional neural network, DenseNet network, Multi-Scale temporal transition layer

Related items

1	Temporal Action Detection Using Dense Dilated Convolutional Network
2	Human Action Recognition Based On Spatial-temporal DenseNet
3	Research On Human Action Recognition Based On Convolutional Neural Network
4	Research On Human Skeleton Action Recognition Based On Graph Convolutional Networks
5	Human Action Recognition Based On Convolutional Neural Networks
6	Research On Human Action Recognition Algorithm Based On Two Stream Convolutional Neural Network
7	Research On Video Action Recognition Algorithm Based On Multi-scale Spatiotemporal Feature Extraction
8	Action Recognition Method Based On Sparse Auto-Combination Spatio-Temporal Convolutional Neural Network And Its MapReduce Implementation
9	Research Of Video Action Recognition Based On Two-stream Convolutional Neural Network
10	Action Recognition Of Human Skeleton Based On Spatio-temporal Graph Convolutional Neural Network