
Learning Robot Manipulation Commands From Long Demonstration Videos

Posted on: 2022-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhu
Full Text: PDF
GTID: 2518306539969339
Subject: Computer Science and Technology
Abstract/Summary:
Most existing robots complete specific tasks according to preset programs or instructions, which cannot meet people's needs for personalization and customization and thus limits the application and development of robots. Video command learning is an important way to empower robots: through it, a robot can understand human behavioral intentions and learn skills independently, without cumbersome pre-programming steps. This thesis studies command learning from untrimmed videos and proposes a robot manipulation command learning framework based on an action segmentation network.

A demonstration video usually contains a series of manipulation actions whose start and end times are unknown. To address this problem, this thesis proposes a video action segmentation framework based on a multi-stage atrous pyramid network. The network uses an atrous convolution pyramid module to capture multi-scale action features and a multi-stage architecture to refine the segmentation results, thereby predicting the action class of each frame and segmenting the untrimmed video into a series of clips that can be perceived and analyzed.

Building on the action segmentation framework, this thesis proposes a demonstration-video-oriented robot command learning framework that learns robot command sequences from untrimmed demonstration videos. The framework contains three main modules: an action segmentation module, an object recognition module, and a command generation module. The action segmentation module segments the video into a series of clips. In the object recognition module, an object detection model extracts object features, these are merged with the action features, and a classifier identifies the participating objects. In the command generation module, actions and objects are combined to generate commands that the robot can understand and execute.

Experiments on the MPII Cooking 2 dataset show that the multi-stage atrous pyramid network improves on various metrics of the action segmentation task, and that the proposed command learning method generates robot command sequences from untrimmed videos with high accuracy. Finally, we deployed our system on a Baxter robot to further verify the effectiveness of the framework.
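As a minimal sketch of the command generation step described above, the fragment below assumes the segmentation network outputs one action label per frame (including a hypothetical "background" label for idle frames); consecutive identical labels are collapsed into clips, and each clip is paired with the object recognized for it to form an (action, object) command. The function names and label strings here are illustrative assumptions, not the thesis's actual interface.

```python
from itertools import groupby

def frames_to_clips(frame_labels):
    """Collapse per-frame action labels into (action, start, end) clips."""
    clips, t = [], 0
    for action, run in groupby(frame_labels):
        n = len(list(run))
        if action != "background":          # skip idle frames
            clips.append((action, t, t + n - 1))
        t += n
    return clips

def clips_to_commands(clips, objects_per_clip):
    """Pair each action clip with its recognized object to form a command."""
    return [(action, obj) for (action, _, _), obj in zip(clips, objects_per_clip)]

# Toy per-frame output for a 7-frame demonstration video.
frame_labels = ["background", "grasp", "grasp", "pour", "pour", "pour", "background"]
clips = frames_to_clips(frame_labels)                 # [("grasp", 1, 2), ("pour", 3, 5)]
commands = clips_to_commands(clips, ["cup", "bottle"])
print(commands)                                       # [("grasp", "cup"), ("pour", "bottle")]
```

In the real framework the per-frame labels come from the multi-stage atrous pyramid network and the objects from the detection-based recognition module; this sketch only shows how the two streams combine into an executable command sequence.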
Keywords/Search Tags: video commands learning, robot commands generation, action segmentation, atrous convolution