
Research on Action Detection Methods Based on Video Analysis

Posted on: 2019-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: M W Zhang
Full Text: PDF
GTID: 2428330590465606
Subject: Information and Communication Engineering
Abstract/Summary:
Action detection based on video analysis aims to locate and recognize significant actions in both the spatial and temporal domains of a video by using computer vision techniques and machine learning methods. Action detection has many applications. In public places such as airports, stations, and schools, for example, there is a demand for action detection techniques that help security staff reduce various risks. This thesis therefore investigates the difficulties of action localization and recognition, covering a tracklet-based method and a method based on a 3D object detection model. The specific work is as follows.

Because actions are contained in pedestrian trajectories, the first part of the thesis proposes an action detection method based on tracklets. The method first fine-tunes an off-the-shelf Faster R-CNN model to detect pedestrians in each frame, which determines the spatial location. A simple tracking-by-detection algorithm is then used to obtain tracklets and maintain temporal consistency. Next, a temporal multi-scale sliding window is applied to each tracklet to generate action proposals. Finally, the action proposals are fed into a fully connected neural network for classification; the features of each action proposal are extracted by a two-stream CNN. Experimental results indicate that this method achieves more accurate action detection results than the other action detection methods it is compared with.

Treating actions as 3D objects, the second part of the thesis proposes a 3D object detection model consisting of a Tubelet Proposal Network (TPN) and a Tubelet Convolution Neural Network (TCNN), and builds the corresponding action detection method on it. First, a video clip and anchor cuboids are fed into the TPN to obtain deep representations of each frame. The TPN then refines the anchors by regression, judges whether the anchor cuboids contain actions, and outputs the final action tubelet proposals. Second, a pair of original and optical flow video clips, together with their corresponding action tubelet proposals, are fed into the TCNN simultaneously. The spatial and motion features of the action tubelet proposals are extracted and then fused. The TCNN corrects the spatial positions of the action tubelets by regression and estimates their action categories. Next, the method links action tubelets by dynamic programming to localize actions in the temporal domain. Finally, action recognition is performed according to the classification results of the action tubelets of each action. Compared with the tracklet-based method, this method generates action tubelets directly, without tracking, and achieves reliable action detection results.
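
The temporal multi-scale sliding window used in the tracklet-based method can be illustrated with a minimal sketch. The abstract does not give window lengths or strides, so the function name `temporal_proposals`, the default lengths (16, 32, 64 frames), and the 50% overlap are all assumptions made for illustration only, not the thesis's actual settings.

```python
from typing import List, Tuple


def temporal_proposals(
    tracklet_len: int,
    window_lengths: Tuple[int, ...] = (16, 32, 64),  # assumed scales, in frames
    stride_ratio: float = 0.5,                       # assumed 50% overlap
) -> List[Tuple[int, int]]:
    """Slide windows of several lengths along one tracklet.

    Returns (start, end) frame offsets within the tracklet, with `end`
    exclusive. Windows longer than the tracklet are skipped.
    """
    proposals: List[Tuple[int, int]] = []
    for w in window_lengths:
        if w > tracklet_len:
            continue  # this scale does not fit inside the tracklet
        stride = max(1, int(w * stride_ratio))
        for start in range(0, tracklet_len - w + 1, stride):
            proposals.append((start, start + w))
    return proposals


if __name__ == "__main__":
    # A 40-frame tracklet with two hypothetical scales.
    for start, end in temporal_proposals(40, (16, 32), 0.5):
        print(start, end)
```

Each returned interval, combined with the tracklet's per-frame boxes, would correspond to one action proposal to be classified; how the thesis handles tracklets shorter than the smallest window is not stated, so this sketch simply returns no proposals in that case.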
Keywords/Search Tags: action detection, action recognition, object detection, CNN