3D Object Detection Based On Monocular Images

Posted on: 2019-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: B Xu
Full Text: PDF
GTID: 2428330545986942
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
In recent years, with the rapid development of computer vision and deep learning, especially convolutional neural networks (CNNs), numerous impressive algorithms have been proposed to tackle difficult problems in computer vision such as 2D object detection, tracking, and person re-identification. In these scenarios, CNNs can learn more robust and general features than traditional hand-crafted ones. In computer vision, the term object detection usually means 2D object detection from RGB images, where the detection results give the location of each object in the image coordinate system together with its object class. However, in several real-world applications such as robotics and autonomous driving, 2D detection results alone are not enough to describe objects in the 3D scene. This thesis addresses the problem of 3D object detection in the context of self-driving cars. The results of 3D object detection include the 3D location of each object in a specific 3D coordinate system (for example, the camera coordinate system), the length, height, and width of the object, the rotation angle around each axis, and the object class. With these results, we can obtain the location, orientation, and dimensions of all detected obstacles around the self-driving car. The framework of the proposed method contains two parts: extraction of candidate regions and regression of 3D parameters. First, a convolutional neural network is adopted to extract regions of interest from the monocular images. Second, from these region candidates and the features learned inside them, the rotation angles, dimensions, and object classes are predicted directly. For the 3D location, depth information is also taken into consideration: depth is estimated directly from the RGB image with a pre-trained model and combined with the CNN features for accurate 3D location prediction, so the final 3D detection results are obtained directly from the monocular image. The proposed network thus relies mainly on monocular images to perform accurate 3D object detection in an end-to-end fashion. The proposed algorithm can outperform state-of-the-art monocular methods.
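To make the output parameterization above concrete, the following is a minimal sketch of how a 3D location, the three dimensions, and the three rotation angles define a box in the camera coordinate system, and how a pixel with an estimated depth relates to a 3D point under a standard pinhole camera model. It is an illustration only, not the thesis's network or its implementation. The function names, the rotation order (R = Rz·Ry·Rx), the assignment of length, height, and width to the object's x, y, and z axes, and the intrinsics values in the test are all assumptions made for this example.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Return R = Rz @ Ry @ Rx for angles in radians (assumed convention)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def box_corners(center, dims, angles):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box's geometric center.
    dims:   (length, height, width), assumed to lie along the
            object's local x, y, and z axes respectively.
    angles: (rx, ry, rz) rotation around each axis, in radians.
    """
    length, height, width = dims
    rot = rotation_matrix(*angles)
    corners = []
    for fx in (-0.5, 0.5):
        for fy in (-0.5, 0.5):
            for fz in (-0.5, 0.5):
                # Corner in the object's local frame, then rotate and translate.
                local = (fx * length, fy * height, fz * width)
                corners.append(tuple(
                    center[i] + sum(rot[i][j] * local[j] for j in range(3))
                    for i in range(3)
                ))
    return corners


def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with depth (along the optical axis) to a 3D point
    in camera coordinates, using pinhole intrinsics fx, fy, cx, cy."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```

The back-projection shows why a per-pixel depth estimate is informative for the 3D location: once the depth at an object's image position is known, the pinhole model fixes the remaining two coordinates. How the thesis actually fuses the estimated depth with the CNN features is not specified in the abstract.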
Keywords/Search Tags: 3D object detection, convolutional neural network, deep learning, monocular images