
Monocular Image Depth Estimation And Application In 3D Reconstruction Of Forest Scene

Posted on: 2022-10-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S N Chen
GTID: 1488306737474444
Subject: Forest Engineering
Abstract/Summary:
The collection of forest resource information is the basis for forest resource inventory, forestry three-dimensional (3D) visualization, and the ability of intelligent forestry robots to perceive their operating environment and identify operating targets; it is also a complex and heavy task. Reconstructing forest scenes in 3D with visual sensors, and building forest resource information collection, intelligent robot navigation, localization and target recognition on that reconstruction, is among the most promising approaches, and high-precision 3D reconstruction is its key and foundation. Therefore, this paper takes monocular images as the research object, predicts the depth of each image, and applies the result to the 3D reconstruction of forest scenes, so as to further improve the visual information processing ability of forestry robots. Because intelligent forestry robots obtain images in different ways, this paper implements depth estimation from three perspectives and then recovers the 3D information of the forest scene. The main work and innovations of this paper are summarized as follows:
1. When the training dataset contains RGB images and the corresponding depth maps, an encoder-decoder model with densely connected networks is proposed, which restores depth information directly from a single RGB image without the need for depth sensors. The encoder extracts the most representative features from the original data through a series of convolution operations and reduces the resolution of the input features. The decoder consists mainly of upsampling structures that gradually increase the resolution of the feature map. Our depth prediction model is trained from scratch, without any special fine-tuning process, and uses a new optimization function to adjust the learning rate adaptively. The experimental results show that, on the NYU Depth V2 dataset, the absolute relative error (Abs Rel) is reduced by 7.69% and the root mean square error (RMSE) by 8.81% compared with the best previous method; on the Make3D dataset, the Abs Rel is reduced by 7.55% and the RMSE by 6.26% compared with the best previous method.
2. When the training dataset contains only calibrated left-right images, a new convolutional neural network structure based on a channel attention mechanism is proposed, following binocular vision theory and the principle of epipolar geometry. The network is built from channel attention modules and treats depth prediction as a regression problem on the disparity map. Calibrated stereo image pairs are used to train the depth estimation model, so this method does not require any depth data as a supervisory signal during training. When evaluated on the KITTI Split, our method achieved the best results on 8 evaluation metrics. On the KITTI Eigen Split, our method achieved the best results on 5 evaluation metrics and the second-best results on 2. Unlike previous related work, we also evaluated depth estimation performance on the Cityscapes dataset, where our method again achieved the best results on 8 evaluation metrics.
3. When the training dataset contains only monocular video sequences, an unsupervised depth estimation framework is designed according to the basic principle of structure from motion, and only adjacent video frames are used as supervisory signals to train the neural network. Our method also predicts two confidence masks to address the error caused by occlusion. Finally, we use the maximum scale and the minimum depth loss instead of the multi-scale and average loss to improve the estimation accuracy of the model. The experimental results show that the squared relative error (Sq Rel) on the KITTI dataset is reduced by 4.86% compared with the best previous method.
4. A 3D reconstruction system for forest scenes based on monocular images has been developed. The system loads the unsupervised depth estimation model proposed in Chapter 4 to reconstruct forest scenes in three different seasonal forest environments and to measure the diameter at breast height of live standing trees. The experimental results show that the average error of our model is 2.72 cm, which indicates that the approach has some practical applicability.
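The error metrics quoted in the abstract (Abs Rel, Sq Rel and RMSE) have widely used definitions in the depth estimation literature. The following is a minimal sketch of those standard definitions in plain Python, not the evaluation code of the dissertation. The function name `depth_errors`, the flat-list input format and the rule that pixels with non-positive ground truth are skipped are assumptions made for illustration; the abstract does not describe the exact evaluation protocol (for example depth caps or image cropping).

```python
import math


def depth_errors(pred, gt):
    """Return (abs_rel, sq_rel, rmse) for predicted vs. ground-truth depths.

    pred, gt: equal-length sequences of depths in the same unit (e.g. metres).
    Pixels whose ground-truth depth is not positive are treated as invalid
    and skipped (an assumption of this sketch).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")

    # Keep only pixels that have a valid ground-truth depth.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)

    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Squared relative error: mean of (pred - gt)^2 / gt.
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    # Root mean square error: sqrt of the mean of (pred - gt)^2.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)

    return abs_rel, sq_rel, rmse
```

For instance, `depth_errors([2.0, 4.0, 3.0], [1.0, 4.0, 0.0])` skips the third pixel (no valid ground truth) and returns an Abs Rel of 0.5, a Sq Rel of 0.5 and an RMSE of about 0.707. Lower is better for all three values.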
Keywords/Search Tags: forestry scene, monocular image, depth estimation, neural network, 3D reconstruction