Human Action Detection Research Based On Convolutional Neural Network

Posted on:2019-08-20

Degree:Master

Type:Thesis

Country:China

Candidate:D Y Zhou

GTID:2428330545498917

Subject:Control Science and Engineering

Abstract/Summary:

In recent years, with the introduction of high-definition video surveillance, intelligent monitoring systems based on human action detection have developed rapidly in smart cities, military security, and smart homes. At the same time, the spread of intelligent terminals and the development of mobile communication networks have produced a large and rapidly growing number of short videos. Retrieving, classifying, and reviewing these videos requires understanding their content, and the main subject of video is human action. This broad application prospect and economic value have quickly made human action detection a research hotspot in computer vision. Traditional human action detection algorithms require hand-designed features for specific actions, which involves a heavy workload and yields limited robustness. This thesis uses convolutional neural networks (CNNs) to design separate network structures for short videos and for untrimmed long videos, with the aim of improving the robustness, accuracy, and practicality of the algorithm.

For short videos, drawing on object detection algorithms, this thesis proposes a human action detection method that combines object detection with a dynamic linking algorithm. To improve detection accuracy, sequential frames are used as input to capture the temporal information of the video, and a spatiotemporal fusion algorithm is used to obtain more robust features. An effective dynamic linking algorithm is then designed to obtain human action sequences from the object detection results. Finally, the network is trained, validated, and compared with previous work on multiple human action datasets. The experiments verify the effectiveness of object detection plus dynamic linking, and show that sequential-frame input and spatiotemporal fusion further improve accuracy.

For untrimmed long videos, a three-dimensional convolutional neural network combined with a recurrent neural network (RNN) is proposed. First, three-dimensional convolution encodes low-level features of the video; a recurrent memory module is then designed to further extract temporal features; finally, action detection is performed by the detection part. In the recurrent memory part, two parallel semantic constraint modules, P (Proposal) and C (Classification), are designed. Through a refined loss function design, they realize video segment proposal and classification, respectively. During training, the weights of the semantic constraint terms in the loss function are dynamically adjusted to speed up training and improve accuracy. Experiments show that accuracy is significantly improved compared with previous studies. This demonstrates the effectiveness of the proposed approach and moves human action detection a step closer to practical use.
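
The abstract does not specify how the dynamic linking step for short videos turns per-frame detections into action sequences. As a rough illustration only, the sketch below assumes a simple greedy scheme: each existing sequence (here called a tube) is extended with the detection in the next frame that maximizes action score plus box overlap (IoU), provided the overlap exceeds a threshold. The names `iou`, `link_detections`, and `min_iou`, the linking score, and the threshold value are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float]             # (box, action score)
TubeEntry = Tuple[int, Box, float]        # (frame index, box, action score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]],
                    min_iou: float = 0.3) -> List[List[TubeEntry]]:
    """Greedily link per-frame detections into tubes (assumed scheme)."""
    finished: List[List[TubeEntry]] = []
    active: List[List[TubeEntry]] = []
    for t, dets in enumerate(frames):
        unused = set(range(len(dets)))
        still_active: List[List[TubeEntry]] = []
        # Tubes whose last detection scored highest get first pick.
        for tube in sorted(active, key=lambda tb: tb[-1][2], reverse=True):
            last_box = tube[-1][1]
            best, best_link = None, float("-inf")
            for i in unused:
                box, score = dets[i]
                overlap = iou(last_box, box)
                # Assumed linking score: action score plus spatial overlap.
                if overlap >= min_iou and score + overlap > best_link:
                    best, best_link = i, score + overlap
            if best is None:
                finished.append(tube)      # no match: the tube ends here
            else:
                unused.remove(best)
                tube.append((t, dets[best][0], dets[best][1]))
                still_active.append(tube)
        # Every unmatched detection starts a new tube.
        for i in sorted(unused):
            still_active.append([(t, dets[i][0], dets[i][1])])
        active = still_active
    return finished + active
```

Calling `link_detections(frames)` with one list of `(box, score)` pairs per frame returns a list of tubes, each a list of `(frame index, box, score)` entries. A greedy pass is only one possible design; a globally optimal linking (for example, by dynamic programming over all frames) would be an equally plausible reading of the abstract.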

Keywords/Search Tags:

human action detection, object detection, convolutional neural network, spatiotemporal feature, recurrent memory module

Related items

1	Deep Feature Modeling For Human Action Recognition And Detection
2	Spatiotemporal Deep Neural Network For Video Salient Object Detection
3	Vision Analysis Of Human Motion Based On Convolutional Neural Network
4	Research On Human Action Recognition
5	Human Action Interpretation Using A Compact DTA Framework
6	Research On Human Detection And Action Recognition Based On Convolution Feature Deformable Part Model
7	Research On Abnormal Behavior Of Human Body In Video Surveillance Based On Deep Learning
8	People Detection In Indoor Scene Based On Deep Learning Method
9	Research On Some Problems Of Object Detection Based On Convolutional Neural Networks
10	Human Skeleton Action Recognition Based On Deep Learning