| Point cloud registration is an essential problem in the field of 3D computer vision. Its objective is to estimate the rigid transformation between two or more point clouds so as to align their overlapping regions precisely. Efficient and robust point cloud registration can offer technical support for various downstream tasks, such as robot navigation, high-precision map construction, and virtual reality. However, existing point cloud registration algorithms still struggle with numerous challenges when dealing with real-world scenes. For instance, in the case of partially overlapping and noisy point clouds, current registration methods often fail to remove enough outliers for robust registration. Furthermore, popular deep models mainly depend on ground-truth rigid transformations as the supervision signal, which are difficult to obtain; this increases their training costs and hinders their application in the real world. Moreover, in object pose estimation tasks, challenges such as occlusion, heavy noise, and large object rotations also lead to unsatisfactory prediction accuracy. To address these challenges, this dissertation focuses on the following research works: (1) We propose a novel variational non-local network-based outlier rejection framework for robust point cloud registration in a fully supervised manner. By reformulating non-local feature learning with variational Bayesian inference, Bayesian-driven long-range dependencies are modeled to aggregate discriminative geometric context information for inlier/outlier distinction. Specifically, to achieve such Bayesian-driven contextual dependencies, each query/key/value feature in our non-local network predicts a prior feature distribution and a posterior one. Embedded with the inlier/outlier label, the posterior feature distribution is label-dependent and discriminative. Thus, pushing the prior to be close to the discriminative posterior in the
training step enables the features sampled from this prior at test time to model high-quality long-range dependencies. Notably, to achieve effective posterior feature guidance, a specific probabilistic graphical model is designed over our non-local model, which lets us derive a variational lower bound as our optimization objective for model training. Finally, we propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation. Extensive experiments on the 3DMatch, 3DLoMatch, and KITTI benchmark datasets verify the effectiveness of our proposed method. (2) Popular deep registration models typically require a large number of ground-truth rigid transformation labels for model training, which greatly increases their training cost and hinders their application in the real world. To address this issue, we propose a reinforcement learning-based unsupervised point cloud registration framework. Its objective is to use the heuristic trial-and-error mechanism of reinforcement learning to globally search for the optimal rigid transformation between the source and target point clouds. Specifically, by modeling the point cloud registration process as a Markov decision process (MDP), we develop a latent dynamic model of point clouds, consisting of a transformation network and an evaluation network. The transformation network aims to predict the transformed feature of the point cloud after performing a rigid transformation (i.e., an action), while the evaluation network aims to predict the alignment precision between the transformed source and target point clouds as the reward signal. Once the dynamic model of the point cloud is trained, we employ the cross-entropy method (CEM) to iteratively update the planning policy by maximizing the rewards during the point cloud registration process. Thus, the optimal transformation can be obtained by gradually narrowing the search space of the transformation. Experimental results on the ModelNet40 and 7Scene
benchmark datasets demonstrate that our method can yield good registration performance in an unsupervised manner. (3) To further improve the robustness of our unsupervised registration model in the case of partially overlapping and noisy point clouds, we propose an end-to-end deep registration framework guided by a sampling network and cross-entropy evolution. This method exploits the alignment error between point cloud pairs as a proxy loss function for model training. To alleviate the local optima issue of the proxy loss function, a differentiable cross-entropy method module is embedded into our model to drive it to search for the globally optimal transformation. In addition, a hybrid reward function based on the iterative closest point algorithm is designed to better guide the evolution of the cross-entropy method. Furthermore, to address the initialization problem of the cross-entropy method module, an end-to-end sampling network module is designed to enable the model to quickly locate a promising search space and improve the efficiency of the solution. Extensive experimental results on benchmark datasets including ModelNet40, 7Scene, and ICL-NUIM demonstrate that the proposed method brings significant performance gains compared to current unsupervised deep frameworks and even some fully supervised frameworks. (4) We propose an effective center-based decoupled point cloud registration framework for robust 6D object pose estimation in real-world scenarios. Our method decouples the translation from the entire transformation by predicting the object center and estimating the rotation in a center-aware manner. This center offset-based translation estimation is correspondence-free, freeing us from the difficulty of constructing correspondences in challenging scenarios and thus improving robustness. To obtain reliable center predictions, we use a multi-view (bird's-eye view and front view) object shape description of the source-point features, with both views jointly voting for the object
center. Additionally, we propose an effective shape embedding module to augment the source features, largely completing the shape information missing due to partial scanning and thus facilitating the center prediction. With the center-aligned source and model point clouds, the rotation predictor uses feature similarity to establish putative correspondences for SVD-based rotation estimation. In particular, we introduce a center-aware hybrid feature descriptor with a normal correction technique to extract discriminative, part-aware features for high-quality correspondence construction. Our experiments show that our method outperforms state-of-the-art methods by a large margin on real-world datasets such as TUD-L, LINEMOD, and Occluded-LINEMOD. |
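
The SVD-based rotation step mentioned in (4) can be illustrated with a minimal sketch. It assumes that the source and model points are already center-aligned and that correspondences are given by row index. The function name `estimate_rotation_svd`, the helper `rotation_about_z`, and the synthetic data are illustrative and not taken from the dissertation. The sketch uses the standard Kabsch solution, which may differ from the author's exact formulation (for example, it applies no correspondence weighting).

```python
import numpy as np


def estimate_rotation_svd(src_pts, tgt_pts):
    """Least-squares rotation R with R @ src_pts[i] ~= tgt_pts[i].

    Both inputs are (N, 3) arrays of corresponding points that are
    assumed to be centered already (hypothetical setup for this sketch).
    """
    # 3x3 cross-covariance between the two centered point sets.
    H = src_pts.T @ tgt_pts
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T


def rotation_about_z(angle_rad):
    """Rotation matrix for a rotation about the z-axis (test helper)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Synthetic, noise-free example: rotate a centered cloud and recover R.
rng = np.random.default_rng(0)
model = rng.normal(size=(100, 3))
model -= model.mean(axis=0)           # center the cloud at the origin
R_true = rotation_about_z(np.deg2rad(120.0))
rotated = model @ R_true.T            # applies R_true to every point
R_est = estimate_rotation_svd(model, rotated)
print(np.allclose(R_est, R_true))
```

With noisy or partially wrong correspondences, the same closed form still returns the least-squares rotation, which is why the quality of the putative correspondences matters for this step.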