
Multi-Scale Convolutional Feature Fusion For 6D Pose Estimation

Posted on: 2024-05-01  Degree: Master  Type: Thesis
Country: China  Candidate: Y Ren  Full Text: PDF
GTID: 2568307184455524  Subject: Computer Science and Technology
Abstract/Summary:
6D pose estimation refers to computing the six-degrees-of-freedom (DOF) pose of a rigid object, that is, identifying the three-dimensional translation and three-dimensional rotation of the object in the image with respect to a standard reference frame. Estimating the 6D pose of an object plays an important role in areas such as augmented reality, autonomous driving, and 3D reconstruction. The development of deep learning has enabled 6D pose estimation methods to achieve better results. Among RGB-image-based methods, two-stage methods that establish correspondences between two-dimensional image points and three-dimensional model points have been studied extensively owing to their ease of use and high accuracy. However, practical application scenarios usually contain noise factors such as background clutter, mutual occlusion of objects, and illumination changes, which pose great challenges to accurate 6D pose estimation. To improve the performance, accuracy, and stability of real-time 6D pose estimation in complex scenes, this thesis investigates correspondence-based 6D pose estimation methods. The main research work is as follows:

To address the poor estimation accuracy for target objects under background clutter and illumination changes, this thesis proposes a 6D pose estimation method based on multi-scale convolutional feature fusion. In the semantic segmentation stage, the encoder-decoder network replaces the original convolutional layers with a lightweight multi-scale convolutional fusion module, so that the network captures multi-scale information and its ability to understand features is effectively enhanced. At the same time, a convolutional layer chain with residual learning is added in the skip connections to eliminate the semantic gap between different network layers, improving the segmentation performance of the network and, in turn, the pose estimation accuracy. Training and testing are performed on the public LINEMOD dataset. The experimental results show that, compared with the original method, the proposed method improves the 2D projection metric by 2.6% and ADD(-S) by 1%, which demonstrates its effectiveness for the 6D pose estimation task.

To address the poor stability of 6D pose estimation in occlusion scenes, where accurate estimation is difficult, and building on the preceding work, a 6D pose estimation method for occluded objects with a dual attention mechanism is proposed. A Convolutional Block Attention Module (CBAM) is added to the network, so that it generates attention feature maps along both the channel and spatial dimensions and performs adaptive feature refinement on the original feature map; this effectively enhances the spatial and location information of the feature map and improves the network's learning ability. Trained on the LINEMOD dataset, the method improves the 2D projection metric and ADD(-S) by 1.2% and 2.6%, respectively, over the original network; on the Occlusion LINEMOD dataset, the improvements are 0.4% and 0.9%, respectively. An ablation study further shows that combining the multi-scale convolutional feature fusion module with the dual attention mechanism effectively improves pose estimation accuracy, verifying the stability and accuracy of the proposed method in occlusion scenarios.
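The multi-scale fusion idea described above can be sketched in a few lines: run parallel convolutions with different kernel sizes over the same feature map and fuse the branch outputs. This is only an illustrative NumPy sketch under simplifying assumptions (single channel, placeholder averaging weights, fusion by averaging; the thesis's actual module and weights are not specified here).

```python
import numpy as np

def conv2d(x, k):
    """Naive single-channel 2D convolution with 'same' zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multi_scale_fuse(x, kernel_sizes=(1, 3, 5)):
    """Parallel branches with different receptive fields, fused by
    averaging (a real module would learn the weights and fuse with
    concatenation plus a 1x1 convolution)."""
    branches = []
    for ks in kernel_sizes:
        k = np.full((ks, ks), 1.0 / (ks * ks))  # placeholder weights
        branches.append(conv2d(x, k))
    return sum(branches) / len(branches)
```

The key design point is that each branch sees a different neighborhood size, so the fused map carries both fine and coarse context without deepening the network.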
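The CBAM-style dual attention can likewise be sketched: a channel gate computed from global average- and max-pooled descriptors through a shared MLP, followed by a spatial gate computed from channel-wise pooling. This is a schematic NumPy sketch, not the thesis's implementation; in particular, CBAM applies a learned 7x7 convolution in the spatial branch, which is simplified here to the raw pooled maps, and the MLP weights `w1`, `w2` are assumed inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Shared ReLU MLP on avg- and max-pooled channel
    descriptors; returns a per-channel gate in (0, 1)."""
    avg = x.mean(axis=(1, 2))                    # (C,)
    mx = x.max(axis=(1, 2))                      # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    return sigmoid(mlp(avg) + mlp(mx))           # (C,)

def spatial_attention(x):
    """Pool across channels and gate each spatial location (CBAM uses
    a learned 7x7 conv here; omitted in this sketch)."""
    avg = x.mean(axis=0)
    mx = x.max(axis=0)
    return sigmoid(avg + mx)                     # (H, W)

def cbam(x, w1, w2):
    """Apply channel attention, then spatial attention (CBAM order)."""
    x = x * channel_attention(x, w1, w2)[:, None, None]
    return x * spatial_attention(x)[None, :, :]
```

Because both gates lie in (0, 1), the module can only rescale features, which is what makes the refinement "adaptive": informative channels and locations are preserved while others are suppressed.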
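The evaluation metrics quoted above are standard: ADD is the mean distance between model points transformed by the ground-truth and predicted poses, and ADD-S replaces each distance with the closest-point distance to handle symmetric objects. A minimal NumPy sketch (assumed helper names; `pts` is the object's 3D model point set):

```python
import numpy as np

def add_metric(R_gt, t_gt, R_pr, t_pr, pts):
    """ADD: mean pairwise distance between model points under the
    ground-truth and predicted rigid transforms."""
    gt = pts @ R_gt.T + t_gt
    pr = pts @ R_pr.T + t_pr
    return np.linalg.norm(gt - pr, axis=1).mean()

def add_s_metric(R_gt, t_gt, R_pr, t_pr, pts):
    """ADD-S: for symmetric objects, each ground-truth point is matched
    to its closest predicted point before averaging."""
    gt = pts @ R_gt.T + t_gt
    pr = pts @ R_pr.T + t_pr
    d = np.linalg.norm(gt[:, None, :] - pr[None, :, :], axis=2)
    return d.min(axis=1).mean()
```

A pose is typically counted as correct when ADD(-S) falls below 10% of the object diameter; the percentages reported in the abstract are accuracy improvements under such thresholds.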
Keywords/Search Tags: 6D pose estimation, semantic segmentation, multi-scale convolutional feature fusion, dual attention mechanism