Images and videos captured in urban environments usually contain both dynamic and static areas. Dynamic areas include moving objects such as pedestrians and vehicles, while static areas include fixed facilities such as buildings and roads. Dynamic-to-static translation converts dynamic images or videos into static ones, that is, it eliminates the dynamic content and restores the corresponding static background. This significantly enhances the feature matching of landmarks in dynamic environments and plays an important role in the visual localization and navigation of unmanned vehicles. With the development of deep generative technology, existing methods usually employ conditional generative adversarial networks to learn the mapping directly. However, simply treating scene translation as image translation makes inefficient use of spatiotemporal features, which easily leads to blurring artifacts in the synthesized static content. This paper therefore conducts in-depth research on dynamic-to-static translation and verifies the effectiveness of the improved models in extensive experiments. The main contributions of this paper are as follows:

(1) A coarse-to-fine translation model is proposed to address the loss of original detail and the low quality of reconstructed static content in synthesized static images. It recasts dynamic-to-static translation as an image inpainting problem through a coarse network, a shadow detection module, and a refinement network, and introduces a novel texture-structure attention to make full use of spatial image information (a minimal sketch of this pipeline follows the abstract). A dataset is constructed with the CARLA simulator, and image quality and visual place recognition evaluations are performed, respectively. Experiments show that the proposed model greatly outperforms existing state-of-the-art models on these performance indicators. In addition, the proposed model is transferred to real-world scenarios, further validating its generalization performance.

(2) A multi-modal translation model is proposed to better exploit the semantic and image information of dynamic scenes and to extract highly robust visual localization features. It infers static images and static semantic segmentation with a dynamic-to-static semantic segmentation network, a semantic prior probability model, and a static image generation network (see the second sketch below). Dynamic-to-static semantic segmentation is introduced into the dynamic-to-static translation task for the first time, and on this basis an image-plus-semantics multi-modal encoding and image retrieval approach is developed. The dataset is constructed with the CARLA simulator, and semantic segmentation, image quality, and visual place recognition evaluations are performed to verify the robustness and usefulness of the proposed model.

(3) A video translation model is proposed to fully exploit the spatiotemporal features of video sequences and generate spatiotemporally consistent static videos. It adopts a two-stage coarse-to-fine design and treats dynamic-to-static video translation as a video inpainting problem: rough static images and complete dynamic-region masks are generated by the coarse network together with optical-flow-weighted masks, and video spatiotemporal information is then efficiently extracted by a refinement network embedded with a temporal shift module and feature alignment enhancement (see the third sketch below). Based on the driverless simulation dataset, video quality and visual odometry evaluations are carried out, which demonstrate the advantages of the proposed model in maintaining the spatiotemporal consistency of synthesized videos
and in visual localization.
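As a concrete illustration of the coarse-to-fine pipeline in contribution (1), the following PyTorch sketch wires a coarse network, a shadow detection head, and an attention-guided refinement stage together. All module names, layer sizes, and the particular form of the texture-structure attention are illustrative assumptions; the abstract does not specify the actual architectures.

```python
# Minimal sketch of the coarse-to-fine pipeline; all modules are stand-ins.
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                                  nn.ReLU(inplace=True))
    def forward(self, x):
        return self.body(x)

class TextureStructureAttention(nn.Module):
    """Hypothetical reading of 'texture-structure attention': texture features
    are spatially reweighted by a gate computed from structure features."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())
    def forward(self, texture_feat, structure_feat):
        attn = self.gate(structure_feat)           # B x 1 x H x W
        return texture_feat * attn + texture_feat  # residual reweighting

class CoarseToFineTranslator(nn.Module):
    def __init__(self, c=32):
        super().__init__()
        self.coarse = nn.Sequential(ConvBlock(3, c), nn.Conv2d(c, 3, 3, padding=1))
        self.shadow_head = nn.Sequential(ConvBlock(3, c),
                                         nn.Conv2d(c, 1, 3, padding=1), nn.Sigmoid())
        self.texture_enc = ConvBlock(3, c)
        self.structure_enc = ConvBlock(3, c)
        self.attn = TextureStructureAttention(c)
        self.refine = nn.Conv2d(c, 3, 3, padding=1)

    def forward(self, dynamic_img, dyn_mask):
        # 1) Coarse static prediction over the whole frame.
        coarse = self.coarse(dynamic_img)
        # 2) Extend the dynamic mask with detected shadows, turning the
        #    task into inpainting of object + shadow regions.
        shadow = self.shadow_head(dynamic_img)
        full_mask = torch.clamp(dyn_mask + shadow, 0, 1)
        # 3) Keep static pixels, fill masked pixels with the coarse result.
        filled = dynamic_img * (1 - full_mask) + coarse * full_mask
        # 4) Refinement guided by texture-structure attention.
        feat = self.attn(self.texture_enc(filled), self.structure_enc(filled))
        return self.refine(feat), full_mask

img = torch.rand(1, 3, 128, 256)    # dynamic frame
mask = torch.zeros(1, 1, 128, 256)  # known dynamic-object mask
static, mask_out = CoarseToFineTranslator()(img, mask)
```

The point of the sketch is the data flow rather than the layers: the shadow head enlarges the dynamic mask so that refinement inpaints both moving objects and the shadows they cast.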
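The multi-modal model in contribution (2) can be read as the following inference flow. The networks below are stand-in convolution stacks, the class count and the vehicle/pedestrian class ids are hypothetical, and fusing the prior by adding log-priors to the class logits before the softmax is only one plausible realization of the "semantic prior probability model".

```python
# Hedged sketch of the multi-modal inference flow described in (2).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 13  # assumption: CARLA-style semantic classes

seg_net = nn.Conv2d(3, NUM_CLASSES, 3, padding=1)      # dynamic-to-static segmentation
gen_net = nn.Conv2d(3 + NUM_CLASSES, 3, 3, padding=1)  # static image generator
img_enc = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten(),
                        nn.Linear(3 * 8 * 8, 64))      # toy image embedding

# Semantic prior: per-class probability of appearing in a static scene,
# e.g. high for building/road, near zero for vehicle/pedestrian.
static_prior = torch.ones(NUM_CLASSES)
static_prior[[4, 10]] = 1e-3  # hypothetical vehicle/pedestrian class ids

def translate_and_encode(dynamic_img):
    logits = seg_net(dynamic_img)
    # Reweight class scores with the static prior before normalizing.
    probs = F.softmax(logits + static_prior.log().view(1, -1, 1, 1), dim=1)
    # Generate the static image conditioned on image + static semantics.
    static_img = gen_net(torch.cat([dynamic_img, probs], dim=1))
    # Multi-modal place descriptor: image embedding + semantic histogram.
    desc = torch.cat([img_enc(static_img), probs.mean(dim=(2, 3))], dim=1)
    return static_img, probs, desc

img = torch.rand(1, 3, 128, 256)
static_img, static_sem, descriptor = translate_and_encode(img)
# Retrieval would then compare descriptors (e.g. cosine similarity)
# against a database of map images.
```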
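For contribution (3), the two reusable ideas are the temporal shift module (a known technique from TSM, Lin et al., ICCV 2019) and flow-weighted mask completion. The sketch below implements a standard temporal shift and a threshold-based flow weighting; the thresholding scheme is an assumption, not necessarily the thesis method.

```python
# Sketch of the video-stage building blocks; tensor layout is B x T x C x H x W.
import torch

def temporal_shift(x, fold_div=8):
    """Shift 1/fold_div of the channels forward and 1/fold_div backward in
    time, letting per-frame 2D convs exchange temporal information for free."""
    b, t, c, h, w = x.shape
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # shift backward
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # rest untouched
    return out

def flow_weighted_mask(coarse_mask, flow_mag, thresh=1.0):
    """Complete the dynamic-region mask by adding pixels whose optical-flow
    magnitude exceeds a threshold (one plausible 'flow-weighted' scheme)."""
    moving = (flow_mag > thresh).float()
    return torch.clamp(coarse_mask + moving, 0, 1)

frames_feat = torch.rand(1, 5, 32, 64, 128)   # per-frame features, B,T,C,H,W
shifted = temporal_shift(frames_feat)         # feed to per-frame refinement convs
coarse_mask = torch.zeros(1, 1, 64, 128)
full_mask = flow_weighted_mask(coarse_mask, torch.rand(1, 1, 64, 128) * 3)
```

Because the shift itself has no parameters, embedding it in the refinement network adds temporal reasoning at essentially the cost of a 2D model, which matches the abstract's emphasis on efficient spatiotemporal feature extraction.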