
Research On Prediction Of Manipulator Interaction Based On Deep Learning

Posted on: 2021-01-05    Degree: Master    Type: Thesis
Country: China    Candidate: J W Shi    Full Text: PDF
GTID: 2428330602486045    Subject: Control Engineering
Abstract/Summary:
Predictive reasoning is an important ability for robots that imitate human intelligence. In recent years, video prediction technology in computer vision has provided methods for robots to implement predictive coding: given a few consecutive video frames and other information, a robot can generate images that predict a plausible future scene. The robot can then, as a human does, use this prediction capability to plan and complete manipulation tasks in its environment on its own, which greatly improves its intelligence. Achieving these goals currently faces several technical challenges: how to construct a video prediction model that produces accurate and realistic predictions, how to evaluate the generated predictions and select a model good enough for the robot to use, and how to implement self-supervised motion planning on top of a video prediction model. This thesis focuses on these three points and carries out the following research work:

1) Video prediction for manipulator interaction scenes in a physical environment. An unsupervised video prediction training framework is constructed that combines a variational autoencoder with a generative adversarial network; the image prediction generator uses an LSTM-CNN structure. An experimental platform based on a UR5 manipulator is built, and an interaction-trajectory dataset is collected for training video prediction models. The experimental results show that the model obtains clearer and more accurate predictions by compositing dynamic foreground pixel transformations with a static background.

2) An image quality assessment metric based on human visual perception is proposed. Different video prediction models introduce different types and degrees of distortion, which reduces the prediction-oriented operability of robots. Image quality assessment for the video prediction task must judge plausibility in a way consistent with human visual perception, rather than rely only on shallow signal statistics such as PSNR or SSIM. Based on two-alternative forced choice (2AFC) experiments on human visual perception, this thesis proposes a perceptual assessment metric for video prediction built on convolutional feature extraction networks. The assessment results are consistent with the fact that video prediction quality declines over time, and they correspond more closely to human perception. In addition, using VggNet as the feature extraction network is more sensitive to image quality, making it easier to discriminate between similar images.

3) A self-supervised planning algorithm based on video prediction is proposed and implemented. With the video prediction module at its core, the algorithm predicts the distribution of a manually designated task pixel and optimizes a target-distance loss function to select the best sampled action for planning and execution. The experimental results demonstrate the effectiveness of the self-supervised planning algorithm, which realizes pushing and rotation tasks with the manipulator, and at the same time verify the effectiveness of the video prediction model.
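The compositing of dynamic foreground pixel transformations with a static background described in 1) can be sketched as a mask-weighted blend. This is a minimal numpy illustration, not the thesis's actual network: the function name, argument layout, and the assumption that the network outputs softmax masks over K warped foreground candidates plus one background layer are all hypothetical.

```python
import numpy as np

def composite_prediction(transformed, masks, background):
    """Blend motion-transformed foreground candidates with a static background.

    transformed: list of K arrays of shape (H, W, C), each a foreground image
                 warped by one predicted motion transform (assumed given here)
    masks:       array of shape (K+1, H, W); softmax weights over the K
                 foregrounds plus the background, summing to 1 per pixel
    background:  (H, W, C) static background estimate
    """
    layers = list(transformed) + [background]
    out = np.zeros_like(background, dtype=np.float64)
    for m, layer in zip(masks, layers):
        out += m[..., None] * layer  # broadcast each mask over channels
    return out

# Toy usage: one foreground layer blended 50/50 with a black background.
fg = np.full((2, 2, 3), 0.8)
bg = np.zeros((2, 2, 3))
masks = np.stack([np.full((2, 2), 0.5), np.full((2, 2), 0.5)])
pred = composite_prediction([fg], masks, bg)  # every pixel becomes 0.4
```

Keeping the background as its own layer is what lets static pixels stay sharp while only moving regions are re-synthesized, which matches the abstract's claim of clearer predictions.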
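The perceptual metric described in 2) compares deep features rather than raw pixels. A minimal sketch of the distance computation, assuming per-layer feature maps have already been extracted from a pretrained CNN such as VGG (the function names and the uniform layer weights are illustrative, not the thesis's exact formulation):

```python
import numpy as np

def unit_normalize(feat, eps=1e-10):
    # Normalize each spatial position's channel vector to unit length.
    norm = np.sqrt((feat ** 2).sum(axis=0, keepdims=True)) + eps
    return feat / norm

def perceptual_distance(feats_a, feats_b, layer_weights=None):
    """Perceptual distance between two images, given lists of feature maps
    (each of shape (C, H, W)) from several layers of a pretrained network.
    Sums, over layers, the spatially averaged squared difference of
    channel-normalized features; larger means less perceptually similar."""
    if layer_weights is None:
        layer_weights = [1.0] * len(feats_a)
    d = 0.0
    for w, fa, fb in zip(layer_weights, feats_a, feats_b):
        diff = unit_normalize(fa) - unit_normalize(fb)
        d += w * (diff ** 2).sum(axis=0).mean()  # average over H, W
    return d
```

Because the comparison happens in feature space, blur and texture distortions that PSNR and SSIM score similarly can receive very different distances, which is what makes the metric track 2AFC human judgments.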
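The self-supervised planner in 3) can be sketched as sampling-based action selection: sample candidate actions, forecast the designated task pixel under each, and keep the action minimizing the target-distance loss. The `predict_pixel` callable below stands in for the video prediction model's designated-pixel forecast; its interface, the planar-push action space, and the uniform sampling range are all assumptions for illustration.

```python
import numpy as np

def plan_action(predict_pixel, current_pixel, goal_pixel,
                n_samples=100, rng=None):
    """Random-shooting planner over sampled actions.

    predict_pixel: callable (pixel, action) -> predicted pixel position,
                   standing in for the learned video prediction model
    current_pixel: current position of the designated task pixel
    goal_pixel:    target position for that pixel
    Returns the sampled action with the lowest target-distance loss.
    """
    rng = np.random.default_rng(rng)
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, 2))  # planar pushes
    best_action, best_cost = None, np.inf
    for a in actions:
        pred = predict_pixel(current_pixel, a)
        cost = np.linalg.norm(pred - goal_pixel)  # target-distance loss
        if cost < best_cost:
            best_action, best_cost = a, cost
    return best_action, best_cost

# Toy usage with linear dynamics: the pixel moves by exactly the action.
goal = np.array([0.5, 0.3])
action, cost = plan_action(lambda p, a: p + a, np.zeros(2), goal,
                           n_samples=500, rng=0)
```

In practice such planners often refine the sample distribution over a few iterations (e.g. the cross-entropy method) and replan after every executed action, so prediction errors do not accumulate.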
Keywords/Search Tags:Video prediction, Image quality assessment, Unsupervised deep learning, Self-supervised visual planning, Manipulator interaction