| Visual localization computes the 6-degree-of-freedom (6-DoF) camera pose from an input query image. It is an important research topic in computer vision and is widely used in autonomous driving, augmented reality, and robotics. Retrieval-based visual localization is a common solution to this task: the images most similar to the query image are first retrieved from the map's image database, and local features are then used to find 2D-3D matches in the retrieved images. In practice, however, visual localization is easily affected by external factors such as large areas of sky and background, occlusion by moving objects, poor lighting conditions, and drastic viewpoint changes. To address these problems, we study and implement a retrieval algorithm based on the attention mechanism. Furthermore, we propose a retrieval-based visual localization scheme for the WebAR navigation scenario and verify the practicality of the algorithms. The main contributions of this paper are as follows: (1) To handle large areas of sky and background as well as occlusion by moving objects in real scenes, we introduce channel attention and spatial attention mechanisms into the deep-learning-based feature retrieval algorithm, making the model more sensitive to regions and features relevant to the localization task while mitigating interference from large-area backgrounds, pedestrians, vehicles, and other occluding objects. Using recall as the evaluation metric on the SVOX, Pitts30k, and Tokyo 24/7 datasets, the results show that the proposed algorithm achieves higher recall under the same conditions than recent retrieval algorithms such as NetVLAD, CRN, SuperPoint, and Al-VLAD. Heat maps are used to visualize the feature attention, and the experiments show that the proposed algorithm is more robust to interference from large-area backgrounds and
moving obstacles. (2) To address the insufficient robustness of retrieval methods that use CNN features alone in scenarios such as drastic lighting changes, we propose a parallel retrieval method based on the complementary characteristics of SIFT and CNN features. Exploiting the fact that the ground-truth positive samples of the database images are known in WebAR navigation scenes, we design a reranking method that combines SIFT-based and CNN-based feature retrieval to improve the accuracy and stability of retrieval. The complementarity between SIFT and CNN features is verified on the Baidu Mall and Virtual Gallery datasets, and the parallel retrieval method is evaluated in terms of precision and recall. The experimental results show that the parallel retrieval algorithm outperforms retrieval based on CNN features alone or SIFT features alone in both accuracy and robustness. (3) Based on the above two algorithms, we design and implement a WebAR navigation system. The two proposed algorithms are integrated into the visual localization steps of the system, and the complete visual localization and navigation pipeline is encapsulated with the Cinatra framework into an API that provides navigation services to the front end. In practical application, we verified the practical value of the two proposed algorithms, which offer new ideas for the visual localization problem of WebAR navigation systems in real scenes. |
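
The recall metric used above for image retrieval is commonly computed as recall@N: a query counts as correctly retrieved if at least one of its top-N database images is a true positive. The following is a minimal Python sketch under that assumption; the abstract does not state the exact definition used, and the function name `recall_at_n` and its input layout are hypothetical, not taken from the paper.

```python
from typing import Dict, Sequence, Set


def recall_at_n(
    ranked_ids: Sequence[Sequence[int]],
    positives: Sequence[Set[int]],
    n_values: Sequence[int] = (1, 5, 10),
) -> Dict[int, float]:
    """Return recall@N for each N in n_values.

    ranked_ids[i] : database image ids retrieved for query i, best match first
    positives[i]  : ids of the ground-truth positive database images for query i
    (Both inputs are illustrative assumptions about how results are stored.)
    """
    if len(ranked_ids) != len(positives):
        raise ValueError("ranked_ids and positives must have the same length")
    if len(ranked_ids) == 0:
        raise ValueError("at least one query is required")

    hits = {n: 0 for n in n_values}
    for retrieved, truth in zip(ranked_ids, positives):
        for n in n_values:
            # A query is a hit at N if any of its top-N results is a true positive.
            if any(db_id in truth for db_id in retrieved[:n]):
                hits[n] += 1

    num_queries = len(ranked_ids)
    return {n: hits[n] / num_queries for n in n_values}


if __name__ == "__main__":
    # Three toy queries: the first is found at rank 2, the second is never
    # found, the third is found at rank 1, so R@1 = 1/3 and R@2 = 2/3.
    ranked = [[3, 7, 1], [4, 2, 9], [5, 6, 8]]
    truth = [{7}, {0}, {5}]
    print(recall_at_n(ranked, truth, n_values=(1, 2)))
```

Counting a query as a hit when any true positive appears in the top N matches how retrieval feeds the later 2D-3D matching stage: one correct database image is enough to attempt pose estimation.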