Research On The Method Of Video Abnormal Behavior Event Detection Based On Deep Learning

Posted on: 2020-12-16 | Degree: Master | Type: Thesis
Country: China | Candidate: X Wang
GTID: 2438330590478615 | Subject: Electronic and communication engineering
Abstract/Summary:
The task of video anomaly detection is to use a computer to identify video frames that are rare or that differ from normal behavior. It is a key technical challenge for the new generation of intelligent video analysis systems. Video anomaly detection currently faces two problems: videos of abnormal behavior are scarce, and the definition of abnormal behavior is ambiguous across scenes. Most existing methods first model normal behavior and then define video that deviates from the normal model as an abnormal event. Among these methods, learning spatiotemporal video features with deep learning shows promising prospects. To detect abnormal behavior events in different video surveillance scenes, three abnormal behavior detection algorithms are proposed, one each for high, medium and low crowd density.

(1) A crowd abnormal behavior event detection algorithm based on crowd density is proposed to solve the appearance feature modeling problem in severely occluded scenes. The algorithm consists mainly of crowd density estimation and crowd behavior modeling. First, a deep Convolutional Neural Network (CNN) model with a Multi-scale Atrous Convolution Spatial Pooling (MsACSP) block is designed for crowd density estimation; the MsACSP block enhances the multi-scale feature extraction ability of the model. For crowd behavior modeling, a two-stream network based on the spatial feature and the motion feature of the crowd density map is designed to detect crowd panic behavior. In addition, the local spatial-temporal dynamics of crowd density are used to detect crowd gathering behavior. Experiments validate the effectiveness of the proposed algorithm, and its crowd density module achieves state-of-the-art results on the Shanghai-Tech dataset. In the application to crowd anomaly event detection, the proposed crowd gathering behavior detection algorithm achieves an accuracy of 97.06% on the PETS2009 S3 dataset.

(2) Because behavior modeling based on hand-crafted features is not robust enough in medium crowd density scenes, we propose an autoencoder-based deep model that combines a 3D Convolutional Neural Network (3DCNN) with a Convolutional Gated Recurrent Unit (ConvGRU) to learn appearance and motion features along the space-time dimensions for video anomaly detection. First, shallow 3DCNN layers encode local spatial information and short-term temporal information. Long-term temporal and global spatial information is then learned by the ConvGRU networks. A video reconstruction branch and a prediction branch compute the reconstruction and prediction errors, and the prediction branch helps the encoder learn spatiotemporal information better. In addition, a regularization term over adjacent frames is added to the loss function to reduce the temporal error introduced by temporal sampling of the video in the training phase. Experiments on real anomaly datasets show the effectiveness of the proposed method.

(3) In low-density crowd scenes the number of pedestrians is small and occlusion is not serious, but behavior modeling methods based on low-level features have low accuracy and poor robustness. A human violence detection algorithm based on the Graph Convolution Network (GCN) is therefore proposed to detect individual violence in sparse scenes. First, a human Pose Sequence Generation (PSG) module is designed to extract the skeleton sequence of each individual in the video. The PSG module consists of a Multi-Object Tracking (MOT) framework and a Single Person Pose Estimation (SPPE) model. Second, a human Pose Sequence Adaptive Sampling (PSAS) method based on joint confidence is proposed to eliminate low-confidence pose sequences produced by the PSG module. Third, the human skeleton sequence is fed to an action recognition model composed of multi-layer Spatial-Temporal GCNs (ST-GCNs), which automatically learns skeleton representation features of human behavior and classifies the human action. In addition, a new Abnormal Action Detection (AAD) dataset collected in real monitoring scenes is introduced. Experiments show that the proposed method can identify violent individuals and achieves 90% accuracy on the AAD dataset.
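
As an illustration of the confidence-based elimination step described for PSAS in part (3), the following is a minimal sketch, not the thesis implementation. The abstract does not say how joint confidences are aggregated or what cutoff is used, so the mean-over-all-joints rule, the `min_confidence` threshold of 0.5, and the names `sequence_confidence` and `filter_pose_sequences` are all assumptions made for this example.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# Hypothetical data layout: a joint is (x, y, confidence), a pose is a list
# of joints for one frame, and a pose sequence is the list of poses for one
# tracked person across frames.
Joint = Tuple[float, float, float]
Pose = Sequence[Joint]
PoseSequence = Sequence[Pose]


def sequence_confidence(seq: PoseSequence) -> float:
    """Mean joint confidence over every frame of one tracked person.

    Averaging is an assumption; the source only says the method is
    "based on the confidence of joints".
    """
    confidences = [joint[2] for pose in seq for joint in pose]
    # A sequence without any joints carries no evidence, so score it 0.
    return mean(confidences) if confidences else 0.0


def filter_pose_sequences(
    sequences: Sequence[PoseSequence], min_confidence: float = 0.5
) -> List[PoseSequence]:
    """Keep only sequences whose mean joint confidence reaches the cutoff.

    The 0.5 default is an illustrative value, not one taken from the thesis.
    """
    return [s for s in sequences if sequence_confidence(s) >= min_confidence]


if __name__ == "__main__":
    reliable = [[(0.0, 0.0, 0.9), (1.0, 1.0, 0.8)], [(0.0, 0.0, 0.7), (1.0, 1.0, 0.8)]]
    noisy = [[(0.0, 0.0, 0.2), (1.0, 1.0, 0.3)]]
    kept = filter_pose_sequences([reliable, noisy])
    print(len(kept))  # number of sequences that survive the filter
```

Only the surviving sequences would then be passed on to the skeleton-based action recognition model; an adaptive scheme could instead pick the cutoff per video, which this sketch does not attempt.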
Keywords/Search Tags: Video Abnormal Behavior Detection, Spatiotemporal Feature, Crowd Density, Violent Individual Recognition, Graph Convolution