
Research And Application Of Target Behavior Recognition Based On Deep Learning

Posted on: 2022-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Gong
Full Text: PDF
GTID: 2518306557468854
Subject: Computer technology
Abstract/Summary:
Video understanding, especially human behavior recognition, is widely used in urban security, video content retrieval and other fields. In recent years, with improvements in hardware performance and the development of deep learning, embedded image acquisition and video surveillance equipment has spread rapidly across many industries, and behavior recognition based on deep learning has become an active direction in video research. Traditional methods need hand-designed features to describe behavior, which is slow and inefficient. Methods based on deep learning do not need manual features, but they have their own problems in the video domain: methods based on 2D convolution cannot capture the temporal information between consecutive frames during convolution; methods based on 3D convolution can model temporal information directly, but are computationally intensive and difficult to deploy; and methods based on skeleton or pose depend on extracting human body key points and are strongly affected by the shooting angle and clarity of the video.

This thesis studies efficient modeling of temporal feature information in video. Starting from real scenes, it proposes a single-group target behavior recognition method based on the global scene and a multi-group target behavior recognition method based on local information. For the global scene, the thesis encodes motion characteristics across multiple frames within a 2D convolutional network: temporal shift along the channel and temporal dimensions merges the temporal features of different frames. The behavior recognition network is then further optimized and trained to eliminate background interference in the scene, and it achieves good performance on single-group target behavior recognition.

To locate human group activities accurately and to recognize the behaviors of multiple groups of targets, the thesis proposes a multi-group target behavior recognition method based on local information. To describe the motion of the actors in a given scene, this method first uses a pedestrian detection network to determine the position and category of the actors in the video sequence, and uses the Mahalanobis distance and the cosine distance to associate pedestrian targets of the same type between consecutive frames. The bounding boxes of each target are then linked along the time axis into multiple independent local target streams, which are fed to multiple threads for simultaneous behavior prediction, yielding spatio-temporal behavior detection results for multiple moving targets.

The two behavior recognition methods proposed in this thesis suit different scenarios. The former recognizes quickly and suits scenes in which a single type of behavior is performed. Experiments show that its average inference time on a dangerous-behavior dataset collected in real scenes is as low as 0.12 seconds, and that Top-1 accuracy increases by 6.1%. The latter can identify the behaviors of multiple targets in a scene once the category of each moving subject has been determined. Experiments show that pedestrian detection and moving-subject tracking reach an average speed of 14 FPS, and that video-mAP on the JHMDB and UCF101-24 datasets reaches 81.2% and 62.7% respectively.
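As an illustration of the temporal shift idea mentioned in the abstract, the following minimal sketch shifts part of the channels of per-frame features along the time axis so that each frame's features mix in information from its neighbours. It is not the thesis's implementation: the function name `temporal_shift`, the `fold_div` fraction, the zero padding at the sequence ends, and the omission of spatial dimensions are all assumptions made for brevity.

```python
def temporal_shift(frames, fold_div=4):
    """Shift a fraction of channels along the time axis.

    frames: list of T per-frame feature vectors, each a list of C numbers
            (spatial dimensions are omitted in this sketch).
    fold_div: hypothetical parameter; C // fold_div channels are taken from
              the next frame and another C // fold_div from the previous one.
    """
    t_len = len(frames)
    c_len = len(frames[0])
    fold = c_len // fold_div
    # Positions with no source frame (sequence ends) stay zero-padded.
    out = [[0.0] * c_len for _ in range(t_len)]
    for t in range(t_len):
        for c in range(c_len):
            if c < fold:
                src = t + 1      # first fold: copy from the next frame
            elif c < 2 * fold:
                src = t - 1      # second fold: copy from the previous frame
            else:
                src = t          # remaining channels are left unshifted
            if 0 <= src < t_len:
                out[t][c] = frames[src][c]
    return out


if __name__ == "__main__":
    feats = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    for row in temporal_shift(feats, fold_div=4):
        print(row)
```

Because the shift only moves existing values, it adds no multiplications, which is consistent with the abstract's goal of capturing temporal information in a 2D convolutional network without the cost of 3D convolution.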
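The association step described in the abstract, which links detections of the same type across consecutive frames using the Mahalanobis distance and the cosine distance, could look roughly like the sketch below. Everything beyond those two distances is an assumption: the diagonal covariance, the dictionary layout of tracks and detections, the `weight` and `max_cost` parameters, and the greedy matching are hypothetical choices, not details taken from the thesis.

```python
import math


def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under an assumed diagonal covariance."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both feature vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def associate(tracks, detections, weight=0.5, max_cost=5.0):
    """Greedily match tracks to detections of the same category.

    tracks: dicts with "mean", "var" (predicted position) plus "feat", "label".
    detections: dicts with "pos", "feat", "label".
    Returns {track_index: detection_index}.
    """
    pairs = []
    for ti, trk in enumerate(tracks):
        for di, det in enumerate(detections):
            if trk["label"] != det["label"]:
                continue  # only associate targets of the same type
            motion = mahalanobis_diag(det["pos"], trk["mean"], trk["var"])
            appearance = cosine_distance(det["feat"], trk["feat"])
            cost = weight * motion + (1.0 - weight) * appearance
            if cost <= max_cost:
                pairs.append((cost, ti, di))
    matches, used_dets = {}, set()
    for cost, ti, di in sorted(pairs):  # cheapest pairs first
        if ti not in matches and di not in used_dets:
            matches[ti] = di
            used_dets.add(di)
    return matches


if __name__ == "__main__":
    tracks = [
        {"mean": [0, 0], "var": [1, 1], "feat": [1, 0], "label": "person"},
        {"mean": [10, 10], "var": [1, 1], "feat": [0, 1], "label": "person"},
    ]
    detections = [
        {"pos": [10, 11], "feat": [0, 1], "label": "person"},
        {"pos": [1, 0], "feat": [1, 0], "label": "person"},
    ]
    print(associate(tracks, detections))
```

Each matched track would then contribute its bounding box to its own local target stream along the time axis. A global assignment solver could replace the greedy loop; the greedy version is used here only to keep the sketch short.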
Keywords/Search Tags: Behavior Recognition, Temporal Model, Background Removal, Spatio-Temporal Location