Deep Learning Based Human Action Recognition

Posted on: 2019-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Fan
Full Text: PDF
GTID: 2428330590467326
Subject: Control Science and Engineering
Abstract/Summary:
Human action recognition is an important research topic in computer vision, with strong demand in real-world applications such as intelligent surveillance, video understanding, human-machine interaction, and driver assistance. This thesis summarizes current work on human action recognition and then studies two problems in more depth: video-based action recognition and pose-based action recognition.

For video-based action recognition, a real-time online action recognition system is designed and implemented with real-world application in mind. To cope with real-world scenarios, the integrated system consists of modules for target detection, target tracking, optical flow improvement, action recognition, and post-processing. Target detection and tracking are key steps for real-world use: they let the system concentrate on the target area, which eliminates interference from complex environments and keeps the system applicable across varying scenarios. For action recognition, an optical-flow-based CNN classifies actions, taking optical flow images of the target as input and returning the action class. Because optical flow is easily contaminated, an optical flow improvement method is applied to eliminate interference caused by camera motion, which strengthens the representation of target motion and aids recognition. The whole system is integrated and streamlined so that it runs in real time and is suitable for real-world application.

For pose-based action recognition, an attention-based multi-view re-observation fusion model is proposed. The attention mechanism improves performance by weighting features more heavily when contextual information indicates they are more important. Within the model, a multi-layer attention method based on an LSTM network is proposed, which boosts network performance by stacking several layers of feature attention operations in a multi-layer LSTM network. Because the observation view matters in action recognition, the model also includes a multi-view re-observation fusion method, which re-observes the action from several possible views and fuses the multi-view observation results to assist recognition. In view fusion, an attention mechanism selects suitable views for recognition based on action sequence information, which further improves fusion performance. The whole model is integrated into an end-to-end network and achieves state-of-the-art performance on two popular 3D action recognition datasets.
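
The multi-view re-observation and attention-based fusion idea can be illustrated with a minimal sketch. It is not the thesis's network: the abstract does not give the architecture, so the function names (`rotate_about_y`, `fuse_views`), the choice of rotation about the vertical axis as the "re-observation", and the hand-supplied relevance scores are all assumptions made for illustration. In a trained model the per-view class scores and relevance scores would presumably be produced by the network from the action sequence.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rotate_about_y(joints, angle):
    """Re-observe a 3D skeleton from another view (assumed here to be a
    rotation about the vertical y axis). `joints` is a list of (x, y, z)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in joints]


def fuse_views(view_class_scores, view_relevance):
    """Attention-weighted fusion of per-view class scores.

    view_class_scores: V lists, each holding C class scores for one view.
    view_relevance:    V unnormalized relevance scores (hypothetical inputs;
                       a real model would learn these).
    Returns (fused class scores, attention weights over views).
    """
    weights = softmax(view_relevance)
    num_classes = len(view_class_scores[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, view_class_scores))
        for c in range(num_classes)
    ]
    return fused, weights


if __name__ == "__main__":
    skeleton = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    # Re-observe the same pose from two candidate views.
    views = [rotate_about_y(skeleton, a) for a in (0.0, math.pi / 2)]
    # Placeholder per-view class scores standing in for classifier outputs.
    scores = [[0.2, 0.8], [0.6, 0.4]]
    fused, weights = fuse_views(scores, view_relevance=[2.0, 0.0])
    print(weights, fused)
```

A softmax over view relevance keeps the fusion differentiable, so a model built this way could be trained end to end, which is consistent with the end-to-end design the abstract describes.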
Keywords/Search Tags: Action Recognition, Deep Learning, Real-World Application, Video, 3D Pose