
Research On Human Abnormal Behavior Detection Based On Deep Learning

Posted on: 2019-11-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L You
GTID: 1368330566498502
Subject: Computer application technology
Abstract/Summary:
With the development of intelligent monitoring technology, video surveillance is widely deployed in densely populated areas such as schools, subways, roads, factories and communities. It brings security as well as challenges. Efficiently detecting abnormal human behavior in video is one of these challenges, and it is also a hotspot and a difficult problem in computer vision. Depending on the camera, video can be divided into ordinary video and video with distance information. The algorithms proposed in this dissertation detect abnormal behaviors in ordinary video. According to the shooting distance, videos are divided into close-range and long-range ones. In close-range videos, the person is close to the camera and occupies a large proportion of each frame. The focus of close-range videos lies in the movement of the upper limbs, especially the hands. Scene information is largely absent from these videos because the person fills most of the frame. In contrast, long-range videos focus on whole-body behavior. They contain rich scene information, which improves the detection and recognition of human behaviors. The work of this dissertation is described in the following aspects.

First, a skin segmentation algorithm based on stacked autoencoders is proposed for close-range abnormal behavior detection. In close-range abnormal behavior detection, scene information increases the computational cost of subsequent processing without providing effective help. Therefore, a skin segmentation algorithm is used to remove the background of the moving target. Traditional skin color segmentation algorithms build a color statistics model mainly from per-pixel characteristics such as the values in each color space and texture. These characteristics, collected from individual pixels, cannot fully represent changes in skin tone
resulting from changes in illumination, age and so on. Therefore, we propose a skin segmentation algorithm based on stacked autoencoders. The algorithm uses skin color blocks as the basic processing unit for training and testing. Experiments show that it achieves good segmentation results on several skin segmentation datasets. We also found that, to reduce the difficulty of annotation, typical datasets label only the skin of foreground people. Our algorithm successfully detects not only the skin of foreground people but also the skin of some background people. This property is a significant help in close-range abnormal behavior detection.

Secondly, a close-range abnormal behavior detection algorithm is proposed, consisting of background subtraction, palm location and tracking, energy detection and trajectory recognition. The movement of a person's upper limbs causes illumination changes. An illumination-indexed skin segmentation algorithm is therefore proposed to remove the background information. Skin pixels with high confidence are collected by the stacked autoencoder model and grouped according to their illumination values. For each illumination level, a dynamic skin segmentation model is built on the pixels of that level. The model not only removes interfering information in the background but also helps the subsequent steps quickly locate the skin color area in the current image. Geometric features are used in the skin-segmented images to locate the palm, and CamShift is used to track the located palm or arm. The original optical flow energy model calculates the individual energies of, and the interaction energies between, the moving regions over the whole dense optical flow image. In contrast, we calculate only the energies of the tracked palm or arm. After energy detection, an improved dynamic time warping algorithm is used to recognize the trajectories.

Thirdly, we improve
a convolutional neural network (CNN) model for long-range behavior detection. A CNN consists of convolutional layers, pooling layers, fully connected layers and classification layers. The discriminative features learned by CNNs outperform hand-crafted features in many classification tasks, but in the object detection task the advantage of CNNs is less obvious. In this dissertation, the following improvements are made to the CNN model for the object detection task. First, skip connections and context learning are used to fuse local and global information. In the skip connections, the low-level features are connected to the high-level features at a ratio. This not only preserves the spatial information of the object but also keeps the high-level features dominant in the object detection task. Secondly, based on the fused features, a context pooling branch is added in parallel with the RoI pooling. The features produced by these two pooling methods are connected to fuse local and global information again. Finally, candidate region optimization is applied to locate objects and abnormal behaviors more precisely. The algorithm achieves good results on datasets such as VOC and UCF.

Finally, a multi-stream CNN is used to detect and recognize abnormal behaviors in long-range videos. In long-range videos, scene information has a significant mapping relationship with some behaviors. To improve long-range abnormal behavior detection and recognition with scene information, we combine a scene recognition CNN with the two-stream CNNs to form our multi-stream CNN. Some scholars have applied scene recognition based on traditional hand-crafted features to behavior detection and recognition. However, these hand-crafted features can only be applied to small datasets. Other researchers added a scene classification task as a branch of the behavior detection network in order to share the early convolutional features and reduce network training time. However, the object features and the scene features do not attend to exactly the same things, and in some scenes they differ markedly. Therefore, we use uniform sampling instead of the original random sampling in the scene recognition network to avoid oversampling a particular area of the training data. The scene and object features are extracted by different networks and fused into a new scene feature. A scene-behavior mapping table, generated by the scene recognition CNN, is used to improve the detection results of the two-stream CNN. The algorithm was evaluated on the UCF101 dataset.

The two-stream CNN learns an action in both the spatial and temporal domains. The spatial CNN learns the action from single RGB images, and the temporal CNN uses stacked optical flow images. Besides this, background information and objects closely related to the action also play an important role in human behavior recognition. Some studies added a scene branch at the end of the network structure to utilize scene information. Although sharing the early feature extraction process reduces network training time and computing resources, features learned mainly for action recognition are not well suited to scene classification. Therefore, we design and implement a multi-stream CNN comprising a spatial CNN, an optical flow CNN and a scene CNN. Experimental results on the UCF101 dataset show the effectiveness of the proposed method.
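
The trajectory recognition step of the close-range pipeline relies on dynamic time warping (DTW). The abstract does not specify how the dissertation's improved variant works, so the following is only a minimal sketch of the classic, unmodified DTW applied to 2-D palm trajectories. The function names `dtw_distance` and `classify_trajectory`, the template labels and the sample coordinates are all illustrative assumptions, not taken from the dissertation.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of (x, y) points.

    This is the textbook algorithm, not the improved variant
    mentioned in the abstract (its details are not given there).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify_trajectory(query, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` maps a hypothetical behavior label to a reference trajectory.
    """
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical reference trajectories for two made-up gesture labels.
templates = {
    "wave": [(0, 0), (1, 1), (2, 0), (3, 1)],
    "push": [(0, 0), (1, 0), (2, 0), (3, 0)],
}
query = [(0, 0), (1, 0.1), (2, 0), (3, 0.1)]
label = classify_trajectory(query, templates)
```

Because DTW allows one point to align with several points of the other sequence, two trajectories that trace the same path at different speeds receive a low cost, which is why it suits tracked palm paths of varying duration.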
Keywords/Search Tags:convolutional neural network, skin segmentation, action detection and recognition, scene recognition