Geometry-and-Texture Based Human Action Recognition

Posted on: 2020-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Li
GTID: 2428330572969970
Subject: Control Engineering
Abstract/Summary:
Human action recognition is an indispensable task for video understanding in real-world scenarios. It is widely applied in intelligent security, human-computer interaction, unmanned supermarkets, and similar settings. A thorough understanding of human action in videos is difficult. Most research focuses on network architecture design in order to better model spatiotemporal features in video sequences. Such work overlooks the possibility that RGB representations contain redundant elements that are unnecessary for depicting human action and make recognition even harder. To address these issues, this thesis uses geometry-and-texture information to assist human action recognition. We start with a systematic analysis of the essential elements for human action recognition. Extensive experiments then verify the effectiveness of a 3D geometry-and-texture based representation, and we design a novel cross-modality model to guide an efficient training procedure. Moreover, in light of the view-invariance of 3D geometric texture representations, we propose a novel cross-modality aggregation transfer approach. The main contributions can be summarized as follows:

1. This thesis systematically analyzes the human action recognition task. To address the shortcomings of pure RGB inputs, we analyze the essential elements using 4 different mask-form human representations. The experiments are carried out on 3 human activity datasets, and our research investigates many potential elements, including background context, actor appearance, joint movement, and human shape.

2. This thesis studies efficient 3D geometry-and-texture based human representations and proposes the Cross-modality Online Distillation approach. It presents a new way to depict human actions with the 3D geometric texture representation DensePose, which is obtained through a 3D human reconstruction method. The superior performance of DensePose demonstrates its ability to capture essential features. This thesis therefore proposes a novel Cross-modality Online Distillation approach, in which the stream with DensePose inputs guides the RGB stream with geometry-and-texture information. This approach enables the network to capture geometry-and-texture information from pure RGB inputs through cross-modality online distillation. Experiments show significant improvement on the single-view human action recognition task.

3. This thesis proposes the Cross-modality Aggregation Transfer approach. Building on the view-invariance of 3D geometry-and-texture based representations, it introduces a new Cross-modality Aggregation Transfer approach for multi-view human action recognition. The approach aggregates 3D geometry-and-texture information from multi-view DensePose inputs and uses it to guide the learning of a network with single-view RGB inputs. We achieve accurate and efficient human action recognition through cross-modality multi-view feature transfer and further online distillation.
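
The abstract does not give the loss used for cross-modality online distillation, so the following is only a minimal sketch of one common formulation, not the thesis's actual method. It assumes that both streams are trained at the same time with their own cross-entropy terms, and that the RGB stream is additionally pulled toward the temperature-softened predictions of the DensePose stream through a KL-divergence term. The function names and the `temperature` and `alpha` parameters are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def online_distillation_loss(rgb_logits, densepose_logits, label,
                             temperature=2.0, alpha=0.5):
    """Joint loss for one sample (assumed form, not taken from the thesis).

    Both streams get a supervised term; the RGB stream also matches the
    softened DensePose predictions. In a real framework the DensePose
    output would typically be detached inside the distillation term so
    that only the RGB stream receives that gradient.
    """
    supervised = (cross_entropy(rgb_logits, label)
                  + cross_entropy(densepose_logits, label))
    teacher = softmax(densepose_logits, temperature)
    student = softmax(rgb_logits, temperature)
    distill = (temperature ** 2) * kl_divergence(teacher, student)
    return supervised + alpha * distill


if __name__ == "__main__":
    # Hypothetical logits for a 3-class problem with true class 0.
    print(online_distillation_loss([0.0, 2.0, 0.0], [2.0, 0.5, -1.0], 0))
```

Under these assumptions, only the RGB stream would be needed at inference time, which is consistent with the abstract's claim that the network learns geometry-and-texture information from pure RGB inputs.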
Keywords/Search Tags: deep learning, human action recognition, 3D geometry and texture, cross modality, knowledge distillation, feature transfer