Human Action Recognition Based On Spatio-temporal Information

Posted on: 2021-04-26   Degree: Master   Type: Thesis
Country: China   Candidate: H X Xiao   Full Text: PDF
GTID: 2428330614953808   Subject: Control Science and Engineering
Abstract/Summary:
Human action recognition integrates knowledge from several disciplines, such as computer vision, digital image processing, and human kinematics, and research on it can promote the joint development of these disciplines. It also has a wide range of applications and high economic value in human-computer interaction, unmanned driving, intelligent robots, and intelligent monitoring. Because human action recognition has both theoretical and practical significance, it has become one of the most active research topics in computer vision.

Accurate acquisition of spatial and temporal information is the key to human action recognition. Current mainstream methods have limitations, such as the inability to extract the most significant spatial information and sensitivity to background changes. In addition, when extracting temporal information, existing methods focus on only 'inter-frame motion' or only 'global video motion'. To solve these problems, this thesis focuses on how to extract spatio-temporal information accurately and proposes two methods of human action recognition based on spatio-temporal information.

First, to address background interference and the inability of current action recognition models to extract the most significant spatial information, a new method named the Moving Human Focus Inference Model is proposed. The model is inspired by the human primary visual cortex, which transmits information along two primary pathways, the ventral stream and the dorsal stream. The temporal pathway combines the advantages of the Focus Block, CNN, and LSTM to obtain long-term temporal information. The spatial pathway uses Global Max Pooling to obtain the most significant spatial information. Finally, the Bayes Inference Block at the end of the model can correct the classification results. The proposed model is evaluated on the challenging UCF101 and HMDB51 datasets, and the results demonstrate its accuracy and generalization ability.

Second, to address the problem that current action recognition methods focus only on 'inter-frame motion' or 'global motion' when extracting temporal information, and at the same time to simplify the model parameters and reduce the hardware requirements of the experiments, a model named Human Action Recognition based on Multi-scale Motion Information is proposed. The model obtains the spatio-temporal information in the video through a temporal pathway and a spatial pathway, respectively. To automatically extract multi-scale motion feature maps from video, a multi-scale motion feature extractor in the temporal pathway accumulates frame differences over the video images. The fusion area merges the temporal and spatial information and sends it to the prediction area for classification. The proposed method is validated experimentally on two commonly used, challenging datasets, UCF101 and HMDB51. The experimental results show that the human action recognition model based on multi-scale motion information can reduce the storage space and time complexity of the model while improving the accuracy of action recognition.
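Two operations named in the abstract, accumulating frame differences at several temporal scales and Global Max Pooling, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the choice of temporal strides (1, 2, 4), the use of absolute differences, and the representation of grayscale frames as nested lists are all assumptions made for illustration. A small stride roughly corresponds to 'inter-frame motion', while a larger stride captures slower, more global motion.

```python
# Illustrative sketch only; names, strides, and data layout are assumptions,
# not details taken from the thesis.

def frame_difference(a, b):
    """Per-pixel absolute difference between two equally sized grayscale frames."""
    return [[abs(pb - pa) for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def accumulate_motion(frames, stride):
    """Sum the absolute differences between frames that are `stride` steps apart."""
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0.0] * width for _ in range(height)]
    for t in range(len(frames) - stride):
        diff = frame_difference(frames[t], frames[t + stride])
        for y in range(height):
            for x in range(width):
                acc[y][x] += diff[y][x]
    return acc


def multi_scale_motion(frames, strides=(1, 2, 4)):
    """Return one accumulated motion map per temporal stride.

    Strides that are not shorter than the clip are skipped, because no
    frame pair exists at that distance.
    """
    return {s: accumulate_motion(frames, s) for s in strides if s < len(frames)}


def global_max_pool(feature_maps):
    """Reduce each 2-D feature map to its single largest activation."""
    return [max(max(row) for row in fmap) for fmap in feature_maps]


if __name__ == "__main__":
    # A single bright pixel moving left to right across a 1x4 frame.
    clip = [
        [[1, 0, 0, 0]],
        [[0, 1, 0, 0]],
        [[0, 0, 1, 0]],
        [[0, 0, 0, 1]],
    ]
    for stride, motion_map in multi_scale_motion(clip).items():
        print(stride, motion_map)
    print(global_max_pool([[[1, 5], [2, 3]], [[0, -1], [-2, -3]]]))
```

In this toy clip the stride-1 map is largest at the pixels the object both enters and leaves, whereas the stride-2 map spreads the response evenly. Feeding maps at several strides to the temporal pathway is one plausible way to expose motion at multiple scales. Global Max Pooling keeps only the strongest response per feature map, which is how it selects the most significant spatial activation.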
Keywords/Search Tags:action recognition, moving human focus, maximum posterior probabilistic inference, multi-scale motion information, spatio-temporal information