Manipulation Action Recognition Based On Gesture And Object

Posted on: 2021-01-28  Degree: Master  Type: Thesis
Country: China  Candidate: X J Zhou
GTID: 2428330611467561  Subject: Computer technology
Abstract/Summary:
The development of modern society has pushed Internet and multimedia technologies to new heights and brought large volumes of audio and video data into daily life. People increasingly expect computers to recognize and understand human actions in these images and videos, which has given rise to action recognition technology. As one of the cutting-edge technologies in computer vision, action recognition has been studied extensively and has broad prospects in applications such as smart homes, video surveillance, and robot learning. To address manipulation action recognition in dynamic and complex scenes, this thesis proposes an action recognition framework based on gestures and objects. The framework consists mainly of four modules: an RGB video feature extraction module, a gesture feature extraction module, an object feature extraction module, and an action classification module. The RGB video feature extraction module uses the I3D network to extract spatiotemporal features from RGB videos; the gesture feature extraction module uses the Mask R-CNN network to extract the operator's gesture features (grasp type); the object feature extraction module uses the Mask R-CNN network to extract the features (object attributes) of the manipulated object; and the action classification module fuses these features and feeds them into a classifier. We conducted experiments on EPIC-Kitchens, a large public action dataset. The experimental results show that the proposed framework can better recognize and classify hand manipulation actions in videos, and they verify that gesture and object features are effective for manipulation action recognition.
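The fusion step in the action classification module can be illustrated with a minimal sketch. The abstract does not say how the features are fused or which classifier is used, so everything below is an assumption made for illustration: fusion is done by simple concatenation, the classifier is a linear layer followed by softmax, and the feature sizes, class count, and random values are hypothetical stand-ins for real I3D and Mask R-CNN outputs.

```python
import math
import random


def fuse_features(video_feat, gesture_feat, object_feat):
    """Fuse the three feature vectors by concatenation.

    Concatenation is an assumption; the abstract does not name the fusion method.
    """
    return list(video_feat) + list(gesture_feat) + list(object_feat)


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Linear classifier: one weight row and one bias per action class."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, probs


# Hypothetical sizes; real I3D / Mask R-CNN features would be much larger.
VIDEO_DIM, GESTURE_DIM, OBJECT_DIM, NUM_ACTIONS = 8, 4, 4, 3

rng = random.Random(0)
video_feat = [rng.random() for _ in range(VIDEO_DIM)]      # stand-in for I3D features
gesture_feat = [rng.random() for _ in range(GESTURE_DIM)]  # stand-in for grasp-type features
object_feat = [rng.random() for _ in range(OBJECT_DIM)]    # stand-in for object-attribute features

fused = fuse_features(video_feat, gesture_feat, object_feat)

# Untrained random parameters, used only to show the data flow.
weights = [[rng.uniform(-1, 1) for _ in range(len(fused))] for _ in range(NUM_ACTIONS)]
biases = [0.0] * NUM_ACTIONS

label, probs = classify(fused, weights, biases)
print("fused length:", len(fused))
print("predicted action index:", label)
```

In this sketch the classifier weights are random, so the predicted index carries no meaning; in practice the classifier would be trained on labeled action clips.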
Keywords/Search Tags: action recognition, video feature extraction, grasp type, object attribute, I3D, Mask R-CNN