
Research On Cross-domain Visual Matching For Augmented Reality In Street Scene

Posted on: 2021-11-26    Degree: Doctor    Type: Dissertation
Country: China    Candidate: W Q Liu    Full Text: PDF
GTID: 1488306017997319    Subject: Computer Science and Technology
Abstract/Summary:
Augmented reality is an emerging technology leading the development of the information industry. Applied to large-scale outdoor street scenes, it can effectively bridge the gap between people and the information world, and it has important application prospects in many fields such as consumer electronics, the military, education, and tourism. One of the core technologies of augmented reality is virtual-real registration, whose main task is to solve the geometric transformation between the camera image captured on site in real time and a virtual 3D model of the scene. Such virtual 3D models are mainly acquired in two ways: recovering a 3D model from an image sequence with a Structure from Motion (SfM) algorithm, or scanning 3D point clouds with LiDAR. Camera images, images rendered from SfM 3D models, and 3D point clouds are cross-domain visual data, and this dissertation focuses on cross-domain visual matching among these three. The specific research contents and contributions are summarized as follows:

(1) Cross-domain image patch matching based on a Siamese network. Aiming at the problem of matching patches of cross-domain visual data in augmented reality, a cross-domain image patch matching method based on a Siamese network is proposed. First, a large-scale cross-domain image patch dataset is constructed, containing 100,000 matching and 100,000 non-matching pairs of camera image patches and rendered image patches. Second, based on the framework of an autoencoder embedded in a Siamese network, H-Net is proposed for cross-domain image patch matching. Third, building on H-Net, H-Net++ is proposed to explore learned feature descriptions of cross-domain image patches. Finally, SiamAM-Net, which embeds an attention mechanism into H-Net++, is proposed to learn feature descriptions of cross-domain image patches, and an adaptive margin for the margin-based contrastive loss is proposed to optimize SiamAM-Net (a minimal sketch of such a loss follows below). Experimental results show that H-Net achieves state-of-the-art performance in cross-domain image patch matching, that H-Net++ demonstrates the feasibility of learning local cross-domain image feature descriptions, that the local cross-domain feature descriptions learned by SiamAM-Net are more robust, and that adaptively determining the margin in the margin-based contrastive loss effectively improves the performance of both H-Net++ and SiamAM-Net.
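The abstract does not give the exact form of the adaptive margin, so the following PyTorch sketch is only illustrative: it implements a standard margin-based contrastive loss over Siamese descriptor pairs and, as one plausible reading of "adaptive margin", bounds the margin by a statistic of the non-matching distances in the current batch. The class name and the adaptation rule are assumptions, not the thesis's definitions.

```python
import torch
import torch.nn.functional as F

class AdaptiveMarginContrastiveLoss(torch.nn.Module):
    """Margin-based contrastive loss with a batch-adaptive margin.

    The adaptation rule below (capping the margin at the mean distance
    of the batch's non-matching pairs) is an illustrative guess, not
    the thesis's actual formula.
    """

    def __init__(self, base_margin: float = 1.0):
        super().__init__()
        self.base_margin = base_margin

    def forward(self, desc_a, desc_b, label):
        # desc_a, desc_b: (N, D) descriptors from the two Siamese branches
        # label: (N,) -- 1 for matching pairs, 0 for non-matching pairs
        label = label.float()
        dist = F.pairwise_distance(desc_a, desc_b)

        margin = torch.tensor(self.base_margin, device=dist.device)
        neg = dist[label == 0]
        if neg.numel() > 0:
            # Adapt the margin to the batch instead of fixing it a priori.
            margin = torch.minimum(margin, neg.mean().detach())

        pos_loss = label * dist.pow(2)                          # pull matches together
        neg_loss = (1 - label) * F.relu(margin - dist).pow(2)   # push non-matches apart
        return 0.5 * (pos_loss + neg_loss).mean()
```

In a typical setup the two branches share weights and output L2-normalized descriptors for the camera-image patch and the rendered-image patch of each pair.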
(2) Massive-retrieval-oriented invariant deep feature descriptions of cross-domain image patches. Aiming at the problem of retrieving cross-domain visual feature descriptions in augmented reality, cross-domain invariant deep feature descriptions for massive retrieval are proposed. First, the method for building the cross-domain image patch dataset is improved, yielding 300,000 matching and 300,000 non-matching pairs of camera image patches and rendered image patches. Second, AE-GAN-Net, which consists of two autoencoders with an embedded Generative Adversarial Network, is proposed to learn cross-domain invariant deep feature descriptions (an illustrative sketch of this kind of adversarially aligned autoencoder pair is given after item (3)). Finally, building on AE-GAN-Net, Y-Net is proposed to learn cross-domain invariant deep feature descriptions. Experiments show that the cross-domain deep feature descriptions learned by AE-GAN-Net and Y-Net are invariant across domains, and camera images and rendered images are matched based on the descriptions learned by Y-Net.

(3) Deep retrieval feature descriptions from street-scene images to large-scale 3D point clouds. Aiming at the problem of image-to-point-cloud feature retrieval in augmented reality, deep retrieval feature descriptions from street-scene images to large-scale 3D point clouds are proposed. First, based on a high-precision mobile mapping system, an efficient method is proposed for accurately and adaptively extracting a dataset of matching image and 3D point cloud patches. Second, Siam2D3D-Net, optimized with a designed adaptive margin for the margin-based contrastive loss, is proposed to jointly learn deep retrieval feature descriptions of image patches and 3D point cloud patches (a two-branch sketch follows at the end of this list). Experiments show that the image patches collected by the proposed dataset construction method are consistent in content with the 3D point cloud patches and contain little redundant information, and that the deep retrieval feature descriptions learned by Siam2D3D-Net enable retrieval from image patches to 3D point cloud patches.
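The abstract states only that AE-GAN-Net consists of two autoencoders with a Generative Adversarial Network embedded. A common way to realize that idea is to train one autoencoder per domain and apply a discriminator to the latent codes, so that the two latent spaces become indistinguishable, i.e. domain-invariant. The PyTorch sketch below shows that pattern; every detail (layer sizes, which encoder receives the adversarial gradient, loss weighting) is an assumption for illustration, not the thesis's design.

```python
import torch
import torch.nn as nn

LATENT = 128  # hypothetical descriptor size

def encoder():
    # 1x64x64 image patch -> latent vector
    return nn.Sequential(
        nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),    # -> 32x32
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # -> 16x16
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # -> 8x8
        nn.Flatten(), nn.Linear(128 * 8 * 8, LATENT),
    )

def decoder():
    # latent vector -> reconstructed 1x64x64 patch
    return nn.Sequential(
        nn.Linear(LATENT, 128 * 8 * 8), nn.Unflatten(1, (128, 8, 8)),
        nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
    )

# One autoencoder per domain: camera patches and rendered patches.
enc_cam, dec_cam = encoder(), decoder()
enc_ren, dec_ren = encoder(), decoder()

# The discriminator tries to tell which domain a latent code came from;
# the encoders are trained to fool it, pushing the codes to be invariant.
disc = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def training_losses(cam, ren):
    z_cam, z_ren = enc_cam(cam), enc_ren(ren)
    recon = mse(dec_cam(z_cam), cam) + mse(dec_ren(z_ren), ren)
    # Discriminator loss: camera latents labeled 1, rendered labeled 0.
    d_loss = bce(disc(z_cam.detach()), torch.ones(len(cam), 1)) + \
             bce(disc(z_ren.detach()), torch.zeros(len(ren), 1))
    # Adversarial loss for the rendered-domain encoder: fool the discriminator.
    g_loss = bce(disc(z_ren), torch.ones(len(ren), 1))
    return recon, d_loss, g_loss
```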
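Siam2D3D-Net's architecture is likewise not given in the abstract; the sketch below only illustrates the usual shape of such a pseudo-Siamese 2D-3D network: a CNN branch for image patches and a PointNet-style shared-MLP branch for point cloud patches, both mapping into one descriptor space that is then searched by nearest neighbor. Layer sizes and names are assumptions; training would combine the two branches with an adaptive-margin contrastive loss like the one sketched under item (1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageBranch(nn.Module):
    """CNN branch: grayscale image patch -> D-dim descriptor (illustrative)."""
    def __init__(self, dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, dim)

    def forward(self, x):                        # x: (N, 1, H, W)
        return F.normalize(self.fc(self.conv(x).flatten(1)), dim=1)

class PointBranch(nn.Module):
    """PointNet-style branch: 3D point cloud patch -> D-dim descriptor."""
    def __init__(self, dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
        )
        self.fc = nn.Linear(256, dim)

    def forward(self, pts):                      # pts: (N, P, 3)
        feat = self.mlp(pts).max(dim=1).values   # symmetric max-pool over points
        return F.normalize(self.fc(feat), dim=1)

# Retrieval: nearest neighbor in the shared descriptor space.
img_net, pcd_net = ImageBranch(), PointBranch()
img_desc = img_net(torch.randn(4, 1, 64, 64))        # query image patches
pcd_desc = pcd_net(torch.randn(100, 1024, 3))        # database of cloud patches
nearest = torch.cdist(img_desc, pcd_desc).argmin(dim=1)  # best match per query
```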
Keywords/Search Tags: cross-domain visual, cross-domain matching, feature description, augmented reality, virtual-real registration