A Deep Action Recognition Framework With Discriminability And Temporal Characteristics

Posted on: 2021-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: H Bai
Full Text: PDF
GTID: 2518306047485944
Subject: Master of Engineering
Abstract/Summary:
Human action recognition draws on many fields, including image processing, machine learning, and pattern recognition, and is both an active and a difficult topic in computer vision. In recent years, a growing number of scholars and institutions have taken up research in this area. As that research has deepened, the focus has gradually shifted from recognizing simple actions to analyzing, understanding, and recognizing complex ones. Complex action recognition remains challenging, however, because it is affected by factors such as unconstrained environments, background clutter, and viewpoint changes. Extracting effective and discriminative feature representations is therefore essential to improving recognition performance. Based on an analysis and summary of current action recognition methods, this thesis makes the following contributions.

Firstly, an adaptive sub-action video segmentation method is proposed. It divides an action video into several atomic action segments by calculating the similarity between the deep features of each pair of adjacent frames. Compared with equal-length segmentation, this method does not break up the coherent, intention-specific motion pattern of each action, and it preserves the temporal continuity of motion. A temporal pooling method is then introduced to aggregate the features within each atomic action, capturing its temporal dynamics. Subsequently, a multi-scale temporal evolution descriptor is constructed: by designing a multi-time-scale fusion objective function, the optimized video feature representation can describe the action more comprehensively and precisely at different temporal resolutions.

Secondly, a cross-enhanced attention network is constructed. By introducing a hierarchical complementary attention sub-module and a local enhanced attention sub-module, a discriminative deep static descriptor is obtained. A divergence constraint and a hierarchical complementary constraint are introduced into the hierarchical complementary attention sub-module so that it automatically focuses on the salient, object-related regions in each frame. In the local enhanced attention sub-module, a semantic discriminative constraint models the semantic dependency between different channels, further emphasizing the most representative local detail in the feature while suppressing irrelevant noise. As a result, the expressive power of the feature is improved.

Thirdly, the multi-scale temporal evolution descriptor and the discriminative deep static descriptor are combined to establish a novel framework for action recognition. It integrates dynamic and static information in a unified framework, which gives a more comprehensive and accurate description of actions and improves recognition performance. Experiments on the UCF101 and HMDB51 datasets verify the effectiveness of the proposed methods. Finally, this thesis summarizes the research and discusses future work on action recognition.
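The adaptive segmentation and temporal pooling steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify the similarity measure, the cut rule, or the pooling operator, so the code assumes cosine similarity, a fixed threshold, and mean pooling as stand-ins. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def adjacent_cosine_similarity(features):
    """Cosine similarity between each pair of adjacent frame features.

    features: array of shape (num_frames, dim), one deep feature per frame.
    Returns an array of shape (num_frames - 1,).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return np.sum(unit[:-1] * unit[1:], axis=1)


def segment_video(features, threshold=0.5):
    """Split frames into segments where adjacent similarity drops.

    Assumption: a cut is placed wherever similarity falls below a fixed
    threshold. The thesis may use a different, adaptive rule.
    Returns a list of (start, end) index pairs, end exclusive.
    """
    sims = adjacent_cosine_similarity(features)
    cuts = np.flatnonzero(sims < threshold) + 1
    bounds = [0, *cuts.tolist(), len(features)]
    return list(zip(bounds[:-1], bounds[1:]))


def pool_segments(features, segments):
    """Aggregate each segment into one vector.

    Assumption: mean pooling stands in for the thesis's temporal pooling.
    Returns an array of shape (num_segments, dim).
    """
    return np.stack([features[s:e].mean(axis=0) for s, e in segments])
```

Unlike equal-length splitting, the segment boundaries here depend on the frame features themselves, so a run of similar frames stays in one segment regardless of its length.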
Keywords/Search Tags: Action recognition, Adaptive action segmentation, Multi-temporal scale, Attention mechanism