
Robot Learning of Manipulation Plans from Human Demonstration Videos

Posted on: 2021-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Zhang
Full Text: PDF
GTID: 2428330611467577
Subject: Computer technology
Abstract/Summary:
Nowadays, robots mainly perform repetitive operations in specific industrial environments. When we need to deploy robots in a new environment, or want them to perform other tasks, we have to reprogram them. Some researchers have proposed the idea of learning by demonstration: a computer or robot learns new skills by watching demonstrations, rather than by being reprogrammed or sent machine instructions.

This thesis presents an algorithm for learning a robot's manipulation plan from demonstration videos. The aim is to allow robots to fulfil more tasks simply by watching videos, without having to be reprogrammed by professionals. This can greatly reduce the cost and time of robot redeployment or task replacement, and better serve human beings in daily life. Moreover, the algorithm places no constraints on the input demonstration video, which means the robot can find existing demonstration videos on the Internet and automatically learn new skills by itself.

More specifically, this thesis uses a triple manipulation plan expression to give a high-level overview of the input human demonstration video. The triple manipulation plan expression is obtained by analyzing the actions of the demonstrator and the manipulated objects in the video with a two-stream CNN and a Mask R-CNN. After that, a depth camera scans the real environment to verify that the required conditions are met. The robot then interprets the triple manipulation plan expressions learned from the demonstration video and performs the concrete actions needed to complete the task.

Experiments on MPII, a public dataset containing 273 unconstrained demonstration videos, show that the proposed algorithm can accurately extract triple manipulation plan expressions from videos, with an overall accuracy of up to 73.36%. In addition, this thesis integrates the proposed algorithm with the humanoid robot Baxter, so that Baxter can learn a manipulation plan from a demonstration video and execute the learned plan in the real environment, which verifies the effectiveness and feasibility of the method.
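To make the pipeline concrete, the following is a minimal sketch of what a triple manipulation plan expression might look like as a data structure. The field names (`action`, `subject`, `target`) and the helper `plan_from_detections` are assumptions for illustration only; the thesis does not publish its code, and in the actual system the action label would come from the two-stream CNN and the object labels from Mask R-CNN.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ManipulationTriple:
    # Field names are hypothetical; the thesis only describes a
    # "triple manipulation plan expression" (action + involved objects).
    action: str   # action label, e.g. from the two-stream CNN
    subject: str  # acting entity, e.g. the demonstrator's hand
    target: str   # manipulated object, e.g. from Mask R-CNN masks

    def to_command(self) -> str:
        # Render the triple as a human-readable plan step the robot
        # controller could interpret and execute.
        return f"{self.action}({self.subject}, {self.target})"

def plan_from_detections(action_label: str, detected_objects: list) -> ManipulationTriple:
    """Toy stand-in for the video-analysis stage: pair the recognized
    action with the first two detected objects."""
    subject, target = detected_objects[0], detected_objects[1]
    return ManipulationTriple(action_label, subject, target)

plan = plan_from_detections("pour", ["right_hand", "cup"])
print(plan.to_command())  # pour(right_hand, cup)
```

In a full system, a sequence of such triples would form the manipulation plan that Baxter interprets step by step after the depth-camera scan confirms the required objects are present.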
Keywords/Search Tags: Learning by demonstration, Manipulation, Two-stream convolutional neural networks, Mask R-CNN