
Research on Key Technologies of Object Six-Degrees-of-Freedom Pose Estimation for Complex Scenes

Posted on: 2021-04-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Z Lv
Full Text: PDF
GTID: 1488306464481944
Subject: Mechanical engineering

Abstract/Summary:
Target pose estimation is a fundamental problem in artificial intelligence and the basis for interaction between a machine and its external environment. It has important applications in industrial automation, augmented reality, autonomous driving, and assisted medical treatment. Because two-dimensional images are a data form widely used in industrial environments, estimating the pose of a target from such images has both research value and practical significance. Real scenes, however, are complex and changeable: factors such as the industrial environment, texture characteristics, and target occlusion degrade pose estimation accuracy. This dissertation exploits the powerful feature extraction capability of deep convolutional networks and focuses on pose estimation of targets in complex environments. We study high-resolution feature extraction and pose estimation from two-dimensional images, and we further combine 3D model and depth point cloud information to optimize the estimated pose. The main research work and contributions of this dissertation are as follows:

To address the low efficiency of multi-target recognition in complex scenes, we study feature extraction based on a residual encoder-decoder semantic segmentation network and analyze how differences in target characteristics relate to the network model. To reduce parameter redundancy in the model, we use a dynamic convolution model that automatically adjusts the number of semantic feature layers. Experiments verify the advantages of this method in segmentation accuracy and model compactness.

To improve segmentation accuracy on fine-grained images, we propose a high-resolution encoder-decoder network with fine-grained feature enhancement, built on multi-resolution context feature enhancement. First, guided by the feature information of the target, we add a multi-view feature attention module to the connection paths between encoder and decoder stages of different resolutions. We then combine multi-scale weighted feature fusion with a fine-grained attention loss function to improve fine-grained segmentation. Experiments show that the method has strong feature enhancement capability and obtains high-resolution segmentation results on fine-grained images across different tasks.

To achieve end-to-end pose estimation from two-dimensional images, we propose a method based on differentiable sample consensus. We build a framework that extracts dense image features and regresses the pose with a multi-branch decoupled network. To make the pipeline end-to-end trainable, we analyze the procedure of the traditional random sample consensus (RANSAC) algorithm and resolve its non-differentiability with an algorithm based on Propose, Expand and Re-Learn (PEaRL). Experiments show that the proposed algorithm achieves stable target pose estimation.
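To illustrate the differentiable-consensus idea in general terms, the following minimal Python sketch (an illustrative assumption in the spirit of soft hypothesis selection, not the PEaRL-based procedure developed in the dissertation) scores each pose hypothesis with a soft inlier count and selects among hypotheses with a softmax, so gradients can flow from the selection back to the dense predictions:

import torch

def soft_inlier_score(residuals: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    # Soft inlier count for one hypothesis: residuals well below the
    # threshold tau count ~1, residuals far above it count ~0.
    return torch.sigmoid((tau - residuals) / (0.1 * tau)).sum()

def select_hypothesis(all_residuals: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    # all_residuals: (H, N) residuals of N correspondences under H hypotheses.
    # Returns differentiable selection weights over the H hypotheses.
    scores = torch.stack([soft_inlier_score(r) for r in all_residuals])  # (H,)
    return torch.softmax(scores / alpha, dim=0)

# Toy usage: 4 hypotheses, 100 correspondences with random residuals.
residuals = torch.rand(4, 100, requires_grad=True)
weights = select_hypothesis(residuals)
loss = (weights * torch.arange(4.0)).sum()  # stand-in for a per-hypothesis pose loss
loss.backward()                             # gradients reach the residuals

Because the hard argmax of classical RANSAC is replaced by a temperature-controlled softmax, the whole selection step stays trainable end to end.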
To further refine the pose and address inaccurate estimates caused by target occlusion and weak features, we propose an algorithm that complements pose estimation features with information rendered explicitly from the 3D model. We design a pose estimation network that takes two-dimensional images and a discrete image set as dual inputs. To exploit the complete information contained in the 3D model, we render it into a collection of discrete 2D views and then superimpose the different modalities through differentiated feature fusion, compensating for the shape and texture missing from monocular images. Experimental results show the effectiveness and robustness of the proposed method for pose estimation of occluded targets.

To compensate for the lack of depth information in two-dimensional images, we combine two-dimensional images, 3D models, and depth point cloud data and study the optimal representation for multi-modal data fusion. We use cross-modal information sharing to optimize the feature extraction network and fully fuse local and global features. Experiments verify the accuracy and stability of this method on pose estimation datasets.
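To make the local-plus-global cross-modal fusion concrete, here is a minimal PyTorch sketch (an illustrative assumption with made-up layer names and sizes, not the network described above) that concatenates per-point image features with point cloud geometry features, pools a global feature, and regresses per-point pose parameters:

import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, img_dim=32, geo_dim=32, glob_dim=128):
        super().__init__()
        self.geo_mlp = nn.Sequential(nn.Conv1d(3, geo_dim, 1), nn.ReLU())
        self.glob_mlp = nn.Sequential(nn.Conv1d(img_dim + geo_dim, glob_dim, 1), nn.ReLU())
        # per-point head over local (image + geometry) and broadcast global features:
        # quaternion (4) + translation (3) per point
        self.head = nn.Conv1d(img_dim + geo_dim + glob_dim, 7, 1)

    def forward(self, img_feat, points):
        # img_feat: (B, img_dim, N) image features sampled at N projected points
        # points:   (B, 3, N) corresponding 3D points from the depth point cloud
        geo = self.geo_mlp(points)                                # (B, geo_dim, N) local geometry
        local = torch.cat([img_feat, geo], dim=1)                 # local cross-modal feature
        glob = self.glob_mlp(local).max(dim=2, keepdim=True)[0]   # (B, glob_dim, 1) global feature
        fused = torch.cat([local, glob.expand(-1, -1, local.shape[2])], dim=1)
        return self.head(fused)                                   # (B, 7, N) per-point pose params

# Toy usage with 500 sampled points.
net = CrossModalFusion()
out = net(torch.randn(2, 32, 500), torch.randn(2, 3, 500))
print(out.shape)  # torch.Size([2, 7, 500])

Broadcasting the pooled global feature back onto every point lets each local prediction see both fine local appearance and geometry and the overall object context, which is the general motivation for fusing local and global features.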
Keywords/Search Tags:Deep learning, Refined expression, Image segmentation, Feature fusion, Pose Estimation