
Research on Vision-Based Indoor 3D Environmental Perception Methods for Service Robots

Posted on: 2021-03-01 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: L Wang | Full Text: PDF
GTID: 1368330614950850 | Subject: Mechanical and electrical engineering
Abstract/Summary:
With the aging of society, service robots that accompany the elderly at home have attracted increasing attention. Such robots can help the elderly with daily needs such as object retrieval, security monitoring, and regular medicine delivery. For these tasks, the robot must operate indoors over long periods, interact with people naturally, understand human semantic commands, and execute them autonomously. This requires multiple levels of environmental perception covering both metric and semantic information. Because vision provides abundant information at low cost, this thesis studies vision-based 3D environmental perception for service robots in the following four aspects.

First, visual relocalization. When the robot operates indoors for a long time, it must recover its global pose in the map through relocalization. To address the accuracy degradation of traditional visual feature matching under viewpoint or illumination changes, a visual relocalization algorithm based on convolutional neural network image similarity is proposed. The input image is cropped multiple times, and the crop most similar to the images in the training dataset is selected for pose regression; similarity is measured with the feature vector of the network's fully connected layer, and the crop with the highest similarity is chosen. To further improve precision, a relocalization algorithm that fuses the feature-based method with the convolutional neural network method is proposed: a visual bag-of-words model retrieves the most similar training images, the pose is computed from feature correspondences, and the final result is selected according to the number of inliers. This improves both the accuracy and the robustness of robot relocalization.

Second, 3D object detection. People often use objects to describe the spatial environment and to express semantic tasks, so this mode of environmental perception is adopted to improve the robot's cognitive ability. A detection algorithm based on deep learning is first explored: a 3D object detection algorithm using a multi-channel convolutional neural network, in which color, depth, and bird's-eye-view images are combined in three channels to improve the network's perception ability. A 3D object detection algorithm based on multi-view fusion is then studied to address continuous observation and data fusion on the robot: real-time visual SLAM provides keyframes and their poses, and multiple viewpoints are integrated for incremental 3D object detection to improve accuracy. An object fusion criterion automatically maintains the constructed object database, and objects are filtered by prior size and volume ratio to remove entries with abnormal size or excessive overlap. The algorithm runs continuously on the service robot, performing object detection and data fusion across large indoor scenes.
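The crop-selection step of the relocalization algorithm above can be illustrated with a short sketch. Since the thesis's network is not reproduced here, the sketch assumes a pretrained torchvision ResNet-18 as a stand-in feature extractor and cosine similarity between its pooled features; every name in it (embed, select_best_crop, the preprocessing values) is an illustrative assumption rather than the author's implementation.

```python
# Sketch: select the crop most similar to the training images by comparing
# CNN feature vectors (cosine similarity). A pretrained ResNet-18 stands in
# for the thesis's network; names here are illustrative, not the original code.
import torch
import torchvision.transforms as T
from torchvision.models import resnet18

# Feature extractor: everything up to (and including) global average pooling.
backbone = resnet18(weights="DEFAULT")
backbone.fc = torch.nn.Identity()        # expose the 512-d pooled feature
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(images):
    """Map a list of PIL images to L2-normalized feature vectors."""
    batch = torch.stack([preprocess(im) for im in images])
    feats = backbone(batch)
    return torch.nn.functional.normalize(feats, dim=1)

@torch.no_grad()
def select_best_crop(crops, train_feats):
    """Return the index of the crop closest (cosine) to any training-set
    feature; train_feats is precomputed with embed() over the training images.
    The selected crop would then be passed to pose regression."""
    crop_feats = embed(crops)                       # (C, 512)
    sim = crop_feats @ train_feats.T                # cosine similarity matrix
    best_per_crop, _ = sim.max(dim=1)               # best match for each crop
    return int(best_per_crop.argmax())
```

The object-database maintenance used in multi-view fusion can be sketched in a similarly hedged way: the class size priors, the axis-aligned box representation, and the overlap threshold below are invented for illustration and are not the thesis's actual fusion criterion.

```python
# Sketch: maintain a fused 3D object database and drop implausible entries.
# Size priors, thresholds, and the AABB representation are assumptions made
# for illustration; the thesis's actual fusion criterion may differ.
from dataclasses import dataclass

@dataclass
class ObjectBox:
    label: str
    min_xyz: tuple   # axis-aligned bounding box corner (x, y, z) in metres
    max_xyz: tuple

# Rough prior volume ranges per class, in cubic metres (illustrative values).
PRIOR_VOLUME = {"chair": (0.05, 1.0), "table": (0.2, 3.0), "cup": (1e-4, 5e-3)}

def volume(box):
    dx, dy, dz = (hi - lo for lo, hi in zip(box.min_xyz, box.max_xyz))
    return max(dx, 0.0) * max(dy, 0.0) * max(dz, 0.0)

def overlap_ratio(a, b):
    """Intersection volume divided by the smaller box's volume."""
    lo = [max(x, y) for x, y in zip(a.min_xyz, b.min_xyz)]
    hi = [min(x, y) for x, y in zip(a.max_xyz, b.max_xyz)]
    inter = ObjectBox("", tuple(lo), tuple(hi))
    denom = min(volume(a), volume(b))
    return volume(inter) / denom if denom > 0 else 0.0

def filter_objects(db, max_overlap=0.5):
    """Keep objects whose volume matches the class prior and which do not
    heavily intersect an already accepted object in the database."""
    kept = []
    for obj in db:
        lo, hi = PRIOR_VOLUME.get(obj.label, (0.0, float("inf")))
        if not (lo <= volume(obj) <= hi):
            continue                      # abnormal size for this class
        if any(overlap_ratio(obj, k) > max_overlap for k in kept):
            continue                      # duplicate / intersecting detection
        kept.append(obj)
    return kept
```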
Third, hierarchical map construction. The robot needs to understand human semantic commands when interacting with people and working autonomously, but current maps are usually purely geometric metric maps and lack the semantics needed for interaction. To solve this problem, a hierarchical map that fuses semantic and metric information around 3D objects is proposed for the visual semantic navigation of the robot. The map consists of an object semantic map and a 2D grid map: spoken commands are mapped into the grid map through object semantics, enabling autonomous navigation. The 3D point cloud map is rasterized and projected onto the ground plane to obtain the grid map, which is then fused with the 3D object map to generate an object-based 3D hierarchical map containing both object semantics and geometric metric information. The hierarchical map can serve as a human-robot interface for the semantic tasks of the service robot, enabling direct interaction among humans, the robot, and the environment.

Fourth, visual semantic perception. Navigation with a metric map accumulates error, whereas people obtain navigation cues visually as they move through the environment. Imitating this cognitive model, a visual semantic perception model based on transfer learning is established, which uses only visual information to accomplish the robot's semantic navigation task in indoor environments with multiple rooms and corridors. Transfer learning is used to build three models, for semantic region, turning region, and robot pose perception, which respectively determine the semantic region in which the robot is located, identify the position of turning regions, and estimate the robot's relative pose. The approach provides the key semantic information needed for navigation, reduces dependence on map accuracy, and is verified by experiments.

In summary, against the background of a service robot operating in an indoor environment, and to improve the robot's ability to sense the environment and interact with people naturally, this thesis studies robust visual relocalization, 3D object detection, hierarchical map construction, and visual semantic perception, which together establish a connection among the service robot, the environment, and humans through vision. This provides a task-oriented human-robot interaction mode and enhances the service robot's ability to perform semantic tasks.
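The grid-map construction in the third aspect and the transfer-learning perception models in the fourth aspect can also be illustrated with small sketches. The first projects a 3D point cloud onto the ground plane and attaches object semantics as a separate layer; the grid resolution, height band, and object format are assumptions for illustration only.

```python
# Sketch: project a 3D point cloud onto the ground plane to build a 2D
# occupancy grid, then register detected objects as a semantic layer on top.
# Resolution, height band, and the object format are illustrative assumptions,
# not values taken from the thesis.
import numpy as np

def rasterize(points, resolution=0.05, z_min=0.05, z_max=1.8):
    """points: (N, 3) array in the map frame. Returns (grid, origin_xy) where
    grid[i, j] is 1 for occupied cells and 0 for free/unknown cells."""
    band = points[(points[:, 2] > z_min) & (points[:, 2] < z_max)]
    origin = band[:, :2].min(axis=0)                  # map corner in metres
    cells = np.floor((band[:, :2] - origin) / resolution).astype(int)
    grid = np.zeros(cells.max(axis=0) + 1, dtype=np.uint8)
    grid[cells[:, 0], cells[:, 1]] = 1
    return grid, origin

def to_cell(xy, origin, resolution=0.05):
    """Convert a metric (x, y) position, e.g. an object centroid, to a cell."""
    return tuple(np.floor((np.asarray(xy) - origin) / resolution).astype(int))

def build_semantic_layer(objects, origin, resolution=0.05):
    """objects: iterable of (label, centroid_xy). Returns {label: [cells]},
    so a command like "go to the table" can be resolved to a grid goal."""
    layer = {}
    for label, centroid in objects:
        layer.setdefault(label, []).append(to_cell(centroid, origin, resolution))
    return layer
```

The second sketch shows one way to realize the semantic-region perception model by transfer learning: a pretrained ResNet-18 backbone is frozen and only a new classification head is trained on region-labelled images. The class list and the frozen-backbone choice are assumptions; the turning-region and relative-pose models described above would be built analogously.

```python
# Sketch: transfer learning for the semantic-region perception model.
# A pretrained ResNet-18 is reused and only its final layer is retrained to
# classify the region the robot currently sees. Class names are illustrative.
import torch
from torchvision.models import resnet18

REGIONS = ["kitchen", "living_room", "bedroom", "corridor"]

model = resnet18(weights="DEFAULT")
for p in model.parameters():
    p.requires_grad = False                          # keep pretrained features
model.fc = torch.nn.Linear(model.fc.in_features, len(REGIONS))  # new head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()

def train_step(images, labels):
    """One supervised step on a batch of region-labelled images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```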
Keywords/Search Tags:Indoor service robot, Visual relocalization, 3D object detection, Hierarchical map, Visual semantic perception