Dynamic traffic environment reconstruction aims to simulate and reconstruct complete 3D models of the moving objects and the surrounding environment in a traffic scene. It is of great significance for autonomous driving, assisted driving, and efficient traffic scheduling. Precisely sensing the global position and pose of vehicles in traffic surveillance video is the core of this technology. Although deep learning has in recent years brought major breakthroughs in 3D object pose estimation from monocular images, most algorithms rely on camera intrinsic parameters and are therefore difficult to apply to the massive number of deployed traffic cameras, so accurate 3D pose estimation of vehicles from traffic surveillance images remains challenging. To address this problem, the main research contents of this paper are as follows:

(1) We propose a vehicle 3D pose estimation algorithm based on keypoint detection. The algorithm first registers the video images with the point cloud of the corresponding traffic scene to establish a mapping relationship. It then detects the keypoints of each vehicle and obtains the vehicle's 2D pose from the geometric relationship among the keypoints. Finally, the 2D pose is converted into a 3D pose through the mapping relationship. The pose estimation accuracy depends mainly on the precision of keypoint detection and is independent of the camera intrinsic parameters, so the algorithm can be applied to all surveillance cameras with a pre-trained model. To handle the difficulty of detecting keypoints against the complex background of traffic scenes, we propose a keypoint detection algorithm based on a multi-stage convolutional neural network. By enlarging the receptive field, capturing the long-range dependencies among keypoints, and refining the keypoint detections stage by stage, the algorithm locates the keypoints precisely, thereby ensuring the accuracy of the 3D pose estimation of vehicles.

(2) We propose a lightweight vehicle keypoint detection algorithm. Although the multi-stage convolutional neural network ensures accuracy, its speed is limited by the multi-stage regression process. To address this limitation, we design a feature extraction network and a stage regression network, both based on lightweight network structures, which greatly reduce the algorithm's complexity and number of parameters. To further improve accuracy, we study feature fusion between stages and propose a fast and effective inter-stage feature fusion method. Through these three improvements (the feature extraction network, the stage regression network, and inter-stage feature fusion), both the speed and the accuracy of the vehicle keypoint detection algorithm are improved.

The experimental results show that the proposed 3D pose estimation algorithm based on keypoint detection outperforms existing state-of-the-art algorithms in vehicle 3D pose estimation. At the same time, thanks to the lightweight keypoint detection algorithm based on a multi-stage convolutional neural network, the proposed 3D pose estimation algorithm runs at 10 fps on traffic surveillance video.
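
The conversion from keypoints to a 3D pose in contribution (1) can be illustrated with a minimal sketch. It is not the implementation used in this work: the abstract does not say how the image-to-point-cloud mapping is stored or which keypoints are detected. The sketch assumes that the mapping is a per-pixel lookup table of 3D scene coordinates and that two keypoints on the vehicle's longitudinal axis (one rear, one front) are available. The names `pose_2d`, `pose_3d`, and `toy_map` are hypothetical.

```python
import math
from typing import Dict, Tuple

Pixel = Tuple[int, int]
Point3D = Tuple[float, float, float]


def pose_2d(rear: Pixel, front: Pixel) -> Tuple[Tuple[float, float], float]:
    """2D pose in the image plane: midpoint and heading angle (radians)."""
    center = ((rear[0] + front[0]) / 2.0, (rear[1] + front[1]) / 2.0)
    heading = math.atan2(front[1] - rear[1], front[0] - rear[0])
    return center, heading


def pose_3d(rear: Pixel, front: Pixel,
            pixel_to_world: Dict[Pixel, Point3D]) -> Tuple[Point3D, float]:
    """Lift the two keypoints through the registration lookup (assumed form)
    and return the 3D position and the yaw about the scene's z axis.
    No camera intrinsics are used, only the precomputed mapping."""
    rx, ry, rz = pixel_to_world[rear]
    fx, fy, fz = pixel_to_world[front]
    position = ((rx + fx) / 2.0, (ry + fy) / 2.0, (rz + fz) / 2.0)
    yaw = math.atan2(fy - ry, fx - rx)
    return position, yaw


# Toy lookup table (assumption): a flat ground plane, 0.5 scene units per pixel.
toy_map: Dict[Pixel, Point3D] = {
    (u, v): (0.5 * u, 0.5 * v, 0.0) for u in range(64) for v in range(64)
}

position, yaw = pose_3d((10, 20), (10, 28), toy_map)
```

In this toy setup the accuracy of `position` and `yaw` depends only on where the two keypoints are placed, which mirrors the claim that pose accuracy is governed mainly by keypoint detection precision.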