Action Recognition Based On Spatiotemporal Two-Stream Depth Network

Posted on: 2020-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zha
GTID: 2428330605450719
Subject: Electronics and Communications Engineering
Abstract/Summary:
As an important part of intelligent video surveillance, action recognition is of great significance for maintaining social order, preventing crime, and safeguarding national security. Although researchers have proposed many methods to improve the accuracy of human action recognition, the complexity and diversity of human motion, together with the various kinds of interfering information in video, leave many open problems in feature learning and feature recognition. Because of the advantages of deep learning in computer vision, this dissertation uses deep learning to recognize human actions with a spatiotemporal two-stream network. The spatial network extracts the appearance information of an action, and the temporal network extracts its motion information. The following research work has been performed:

Firstly, the SE-ResNet network can enhance the features that are effective for the current action and suppress the less important ones according to the importance of each feature channel, and it adaptively recalibrates the feature channels to improve the model's ability to represent the input data. We therefore propose a two-stream model based on SE-ResNet that classifies actions with a softmax classifier and recognizes the action from the fused output of the two networks. Experimental results show accuracies of 91.9% on the UCF-101 dataset and 63.8% on the HMDB-51 dataset.

Secondly, considering the strong feature representation ability of ResNet, and the fact that BN-Inception extracts sequential motion information well and converges quickly, we propose a model based on ResNet+BN-Inception, in which ResNet extracts the appearance information of an action and BN-Inception extracts its optical flow information. Experimental results show accuracies of 94.5% on UCF-101 and 70.1% on HMDB-51.
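
The two mechanisms named in the abstract, channel recalibration in SE-ResNet and fusion of the spatial and temporal outputs, can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes the standard squeeze-and-excitation form (global average pooling, two fully connected layers, sigmoid gate) and a weighted average of softmax scores as the fusion rule, which the abstract does not specify. The function names, the weight matrices `w1` and `w2`, and the `spatial_weight` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def se_recalibrate(feature_maps, w1, w2):
    """Squeeze-and-excitation style channel reweighting (biases omitted).

    feature_maps: list of C channels, each a flat list of activations.
    w1: hypothetical reduction weights, shape (C/r) x C.
    w2: hypothetical expansion weights, shape C x (C/r).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: channels with a gate near 1 are kept, near 0 are suppressed.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]


def fuse_two_stream(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Late fusion: weighted average of per-stream softmax scores.

    The averaging rule and spatial_weight are assumptions for illustration.
    Returns the fused class probabilities and the predicted class index.
    """
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    fused = [
        spatial_weight * a + (1.0 - spatial_weight) * b
        for a, b in zip(p_spatial, p_temporal)
    ]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label
```

With equal weights, a confident temporal stream can override a less confident spatial stream, which is the motivation for combining appearance and motion cues at the score level.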
Keywords/Search Tags: deep learning, SE-ResNet, action recognition, ResNet, BN-Inception