Research And Implementation Of Video Action Recognition Algorithm Based On Spatiotemporal Joint Description

Posted on: 2021-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: J He
Full Text: PDF
GTID: 2428330614971119
Subject: Electronic and communication engineering
Abstract/Summary:
Human action recognition can effectively assist the understanding of body language, and related research has become a hot topic in computer vision, with important research significance and wide application value. At present, mainstream action recognition algorithms focus on efficiently extracting and exploiting spatiotemporal information from video data in order to improve recognition performance. This thesis takes video action recognition as its research object and proposes and implements three human action recognition algorithms based on rich context and global relationship constraints in the spatial and temporal dimensions. The main work includes:

(1) A Motion Enhancement Spatial Temporal Aggregation Net (ME-STANet) is proposed. By introducing a differential TV-L1 convolution layer, ME-STANet captures optical flow over the channels of the feature map to obtain a rich context representation, and then enhances the important elements of the appearance feature map with a motion attention guidance module. Experimental results on the UCF101 dataset show that the proposed network improves accuracy by 3.25% and effectively improves the performance of the action recognition algorithm.

(2) A Spatiotemporal Collaborative Action Recognition Network (StARNet) is proposed. Based on the two-stream (dual flow) network mechanism, StARNet takes both the representation of spatiotemporal relationships and decision fusion into account. First, the network uses the ME-STANet model to obtain an enhanced representation of spatiotemporal relationships. The ME-STANet module is also introduced into the optical flow branch to further strengthen the representation of temporal relationships. On the UCF101 dataset, the accuracy of StARNet is 5.32% higher than that of ME-STANet, which shows that introducing the optical flow branch further improves action recognition accuracy.

(3) A Graph Based Global Spatial Temporal Relation Reasoning Networks (GStRNet) model is proposed. GStRNet projects the features of the region of interest into an interaction space through the global reasoning (GloRe) unit, uses graph convolution to obtain their global relationships, and then maps the result back to the original space to capture long-term dependencies. GStRNet can also infer global spatiotemporal information in multiple steps to obtain a deeper global relationship. Experimental results on the UCF101 dataset show that GStRNet achieves a recognition accuracy of 95.8%, which is better than ME-STANet, StARNet, and other state-of-the-art (SOTA) action recognition algorithms.
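The projection, graph convolution, and reverse projection described in (3) can be illustrated with a small sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function name `global_reasoning_step`, the matrix shapes, the `(I + A)` form of the graph convolution, the residual connection, and the random placeholder weights (which a real network would learn). It is not the thesis's implementation.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def global_reasoning_step(features, projection, adjacency, node_weights):
    """One hypothetical projection -> graph convolution -> reverse projection step.

    features:     positions x channels (flattened spatiotemporal feature map)
    projection:   nodes x positions    (maps positions to graph nodes)
    adjacency:    nodes x nodes        (relations between nodes)
    node_weights: channels x channels  (state update applied to every node)
    """
    nodes = matmul(projection, features)               # interaction space
    propagated = add(nodes, matmul(adjacency, nodes))  # (I + A) @ nodes
    updated = matmul(propagated, node_weights)         # per-node state update
    back = matmul(transpose(projection), updated)      # back to original space
    return add(features, back)                         # residual connection

def random_matrix(rows, cols, rng, scale=0.1):
    """Placeholder weights; a trained model would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
num_positions, channels, num_nodes = 8, 4, 3
x = random_matrix(num_positions, channels, rng, scale=1.0)

# Two stacked steps stand in for the multi-step inference mentioned above.
params = [
    (
        random_matrix(num_nodes, num_positions, rng),
        random_matrix(num_nodes, num_nodes, rng),
        random_matrix(channels, channels, rng),
    )
    for _ in range(2)
]
out = x
for projection, adjacency, node_weights in params:
    out = global_reasoning_step(out, projection, adjacency, node_weights)

print(len(out), len(out[0]))  # 8 4
```

Because each step returns features with the same shape as its input, steps can be stacked freely, which is one plausible reading of inferring the global relationship "in multiple steps".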
Keywords/Search Tags: Residual Network, Total Variation, Optical Flow, Dual Flow Network, Global Reasoning, Action Recognition