Font Size: a A A

Real Time Multi-Object Recognition And 6D Pose Estimation Method Based On Deep Learning And Information Fusion Under Complex Environment

Posted on:2023-07-22Degree:MasterType:Thesis
Country:ChinaCandidate:Y LiangFull Text:PDF
GTID:2568306773971379Subject:Computer technology
Abstract/Summary:PDF Full Text Request
In recent years,with the rapid development of artificial intelligence and the intensification of population aging,more and more intelligent service robots have gradually appeared in our daily life.Accurate and real-time multi-target object recognition and pose estimation under complex indoor environment is an important premise for intelligent service robots to make decisions.In addition,object recognition and pose estimation also play an irreplaceable role in other important topics in the field of computer vision,such as virtual reality,augmented reality,automatic driving and so on.However,object recognition and pose estimation based on traditional methods are difficult to overcome the challenges of occlusion,illumination change and so on.With the vigorous development of deep learning technology,convolutional neural network has gradually become the mainstream method in the field of object recognition and pose estimation.Therefore,taking "real-time multi-target recognition and 6D pose estimation method based on deep learning and information fusion under complex environment"as the subject,this paper makes an in-depth study on multi-target object recognition and 6D pose estimation method based on convolutional neural network.The work of this paper mainly includes two parts:multi-target object recognition under complex background,object 6D pose estimation based on 2D and 3D information fusion.1.Multi-target object recognition under complex background.The primary task of robot interaction with the real world is to recognize the target object from the complex environment.As a pixel-level object classification method,semantic segmentation can not only detect the target object,but also accurately segment the object along the contour,so as to better deal with the occlusion problem.In addition,in convolutional neural networks,features of different scale not only have different receptive fields,but also usually contain complementary information.Therefore,the main innovation of the semantic segmentation network proposed in this paper is to build a multi-scale feature fusion module to improve the network’s ability to understand images,so as to improve the segmentation performance of the network.2.Object 6D pose estimation based on 2D and 3D information fusion.There are three main input types of object 6D pose estimation methods:RGB image,3D data(point cloud or depth map)and RGB-D data.Generally speaking,RGB images can provide rich texture information,but can not provide the geometric information of the object,so it is generally difficult to achieve good results in the case of serious occlusion;The method based on point cloud or depth map can provide the geometric information of the object,which is effective for object detection and pose estimation in large-scale autonomous driving scenes.However,for complex scenes such as indoor scenes,only using point cloud or depth data is not enough.Therefore,among these methods,the method based on RGB-D data usually has better effect,since it uses both texture information and geometric information.The previous network based on information fusion has some disadvantages:for example,when obtaining the geometric features of point cloud,it does not make full use of the local domain information of point cloud,and the way of feature fusion is too simple.In order to make full use of the complementary multimodal information of RGB image and point cloud,the pose estimation network proposed in this paper has the following two improvements:(1)using dynamic graph convolution EdgeConv to extract the geometric information contained in the point cloud and fully mine the domain information of the point cloud.(2)a multi-modal feature hybrid fusion method is proposed to connect the two features more effectively,so that the network can achieve accurate object 6D pose estimation under complex environment.
Keywords/Search Tags:Convolutional neural network, object recognition, semantic segmentation, object 6D pose estimation, information fusion
PDF Full Text Request
Related items