Research On Hierarchical Action Recognition Based On Attention Guidance

Posted on: 2020-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: W W Liu
Full Text: PDF
GTID: 2428330620460028
Subject: Information and Communication Engineering
Abstract/Summary:
Video content analysis is an important task for human visual understanding. As the foundation for video action detection, relational reasoning, and content understanding, human action recognition in video has long been a research hotspot in computer vision. Early methods can be divided into global-feature and local-feature methods, according to the spatial extent over which features are extracted. More recently, with the widespread adoption of deep learning in computer vision, methods are divided into hand-crafted feature methods and deep learning feature methods. Although action recognition methods have changed greatly, how to strengthen the learning of temporal and spatial features in video has remained the focus of action recognition research.

By summarizing and analyzing the influence of different human body parts on action recognition, this thesis proposes a two-stage action recognition framework that enhances global feature expression for coarse-grained action classification and strengthens the local features of the actors for fine-grained action classification. Compared with previous single-stage action recognition algorithms, our specifically trained fine-grained classifiers extract motion features from body-part regions to separate action types that are easily confused, while the coarse-grained classifier is retained for actions that differ greatly from one another, such as shooting and pull-ups. Each stage's classifier uses the aggregated features that work best for it to characterize the video action. Finally, we achieve more accurate action recognition by combining the results of the coarse-grained and fine-grained classifiers.

Building on this work, we find that action recognition depends not only on the features of the actors' body parts: the spatially discriminative regions of the scenes with which the actors interact also play an important role, such as the horizontal bar in a pull-up or the stairs in stair climbing. We therefore propose to extract context information from dynamic scenes and discriminative scene patches, yielding spatial features that discriminate between actions. At the same time, we use 3D deep convolutional neural networks to strengthen the temporal feature representation, rather than max/min pooling over frame-level features. Experiments show that discriminative scene information helps improve the performance of the action recognition algorithm.
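The coarse-to-fine combination described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two classifiers' results are fused, so the product rule below is an assumption, as are the function names, the grouping of actions, and the example logits; "drinking" and "eating" are hypothetical stand-ins for easily confused actions.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    x = np.asarray(logits, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_predict(coarse_logits, fine_logits_by_group, groups):
    """Fuse coarse and fine classifier scores into one score per action.

    groups[g] lists the actions in coarse class g. A group with one member
    keeps its coarse probability; a group of confusable actions splits its
    coarse probability according to that group's fine-grained classifier.
    This product rule is an assumed fusion scheme, not the thesis's method.
    """
    coarse_p = softmax(coarse_logits)
    scores = {}
    for g, members in enumerate(groups):
        if len(members) == 1:
            scores[members[0]] = coarse_p[g]
        else:
            fine_p = softmax(fine_logits_by_group[g])
            for name, p in zip(members, fine_p):
                scores[name] = coarse_p[g] * p
    return scores

# Hypothetical grouping: two distinctive actions, one confusable pair.
groups = [["shooting"], ["pull-up"], ["drinking", "eating"]]
coarse_logits = [0.0, 0.0, 2.0]            # made-up coarse-stage outputs
fine_logits_by_group = {2: [0.0, 1.0]}     # made-up fine-stage outputs
scores = hierarchical_predict(coarse_logits, fine_logits_by_group, groups)
best_action = max(scores, key=scores.get)
```

Because each fine-grained distribution sums to one, the fused scores still form a probability distribution over all actions, so the fine stage only redistributes probability within a confusable group.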
Keywords/Search Tags: Action Recognition, Discriminative Region, Deep Learning, 3D Convolutional Neural Network, Hierarchical Framework