
Robot Learning of Manipulation Plans from Human Demonstration Videos

Posted on: 2021-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Zhang
Full Text: PDF
GTID: 2428330611467577
Subject: Computer technology
Abstract/Summary:
Nowadays, robots mainly perform repetitive operations in specific industrial environments. When we need to deploy robots in a new environment, or want them to perform other tasks, we have to reprogram them. Some researchers have proposed the idea of learning by demonstration: a computer or robot learns new skills by watching demonstrations, rather than by being reprogrammed or sent machine instructions.

This thesis presents an algorithm for learning a robot's manipulation plan from demonstration videos. The aim is to allow robots to fulfil more tasks simply by watching videos, without having to be reprogrammed by professionals. This can greatly reduce the cost and time of robot redeployment or task replacement, and better serve human beings in daily life. Moreover, the algorithm places no constraints on the input demonstration video, which means the robot can find existing demonstration videos on the Internet and automatically learn new skills by itself.

More specifically, this thesis uses a triple manipulation plan expression to give a high-level overview of the input human demonstration video. The triple manipulation plan expression is obtained by analyzing the actions of the demonstrator and the manipulated objects in the video with a two-stream CNN and a Mask R-CNN. After that, a depth camera scans the real environment to verify that the required conditions are met. The robot then interprets the triple manipulation plan expressions learned from the demonstration video and performs the concrete actions needed to complete the task.

Experiments on MPII, a public dataset containing 273 unconstrained demonstration videos, show that the proposed algorithm can accurately extract triple manipulation plan expressions from videos, with an overall accuracy of up to 73.36%. In addition, this thesis integrates the proposed algorithm with the humanoid robot Baxter, so that Baxter can learn a manipulation plan from a demonstration video and execute the learned plan in the real environment, which verifies the effectiveness and feasibility of the method.
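To make the pipeline concrete, the following is a minimal sketch of what a triple manipulation plan expression might look like as a data structure. The field names (`action`, `subject`, `target`) and the helper `plan_from_detections` are assumptions for illustration only; the thesis does not publish its code, and in the actual system the action label would come from the two-stream CNN and the object labels from Mask R-CNN.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ManipulationTriple:
    # Field names are hypothetical; the thesis only describes a
    # "triple manipulation plan expression" (action + involved objects).
    action: str   # action label, e.g. from the two-stream CNN
    subject: str  # acting entity, e.g. the demonstrator's hand
    target: str   # manipulated object, e.g. from Mask R-CNN masks

    def to_command(self) -> str:
        # Render the triple as a human-readable plan step the robot
        # controller could interpret and execute.
        return f"{self.action}({self.subject}, {self.target})"

def plan_from_detections(action_label: str, detected_objects: list) -> ManipulationTriple:
    """Toy stand-in for the video-analysis stage: pair the recognized
    action with the first two detected objects."""
    subject, target = detected_objects[0], detected_objects[1]
    return ManipulationTriple(action_label, subject, target)

plan = plan_from_detections("pour", ["right_hand", "cup"])
print(plan.to_command())  # pour(right_hand, cup)
```

In a full system, a sequence of such triples would form the manipulation plan that Baxter interprets step by step after the depth-camera scan confirms the required objects are present.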
Keywords/Search Tags: Learning by demonstration, Manipulation, Two-stream convolutional neural networks, Mask R-CNN