Research On Binocular Stereo Vision Based On Deep Learning

Posted on: 2021-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: W Wang
GTID: 2428330614459827
Subject: Control theory and control engineering
Abstract/Summary:
Stereo matching, a fundamental problem in computer vision, aims to find the correspondence between pixels in the left and right images and to compute the disparity map. Researchers continue to explore how to improve the accuracy and speed of stereo matching algorithms. In recent years, deep learning has developed rapidly and achieved great success in many areas of computer vision, such as semantic segmentation, object detection, and image recognition, and it has now been successfully applied to stereo matching. Deep-learning-based stereo matching embeds the four steps of traditional stereo matching (cost calculation, cost aggregation, disparity calculation, and disparity refinement) into a convolutional neural network.

On the KITTI dataset, most of the top-ranking methods are based on deep learning. On the Middlebury dataset, however, these methods do not perform well, because the KITTI dataset was collected in outdoor scenes whereas the Middlebury dataset was collected in indoor scenes. The academic community therefore still lacks a large stereo matching dataset for indoor scenes. To fill this gap, a structured light system with two cameras and a single projector was set up to collect an indoor-scene stereo matching dataset. Further training of a network model on this dataset can improve its generalization ability on indoor-scene stereo datasets.

First, the structured light system is calibrated. Unlike a single-camera, single-projector system, our system does not require the relatively complex projector calibration; it needs only calibration of the individual cameras and of the binocular camera pair, so the calibration process is relatively simple. The absolute phase is then obtained by combining the phase-shift method with the Gray code method, the disparity map is obtained by matching the absolute phase, and high-accuracy subpixel ground-truth disparity is obtained by fitting a third-order polynomial. Finally, two network models (StereoNet and PSMNet) are trained to test the validity of the dataset. Experiments show that training on the dataset in this thesis improves the accuracy of the models on the Middlebury dataset.
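
The absolute-phase step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the common N-step phase-shifting model I_k = A + B*cos(phi + 2*pi*k/N) and a Gray code that encodes the fringe order of each period. The function names, the fringe period, the number of steps, and the simulated pixel are illustrative assumptions, not the thesis's implementation, and the sketch does not cover the phase matching or the third-order polynomial fit.

```python
import math

def wrapped_phase(intensities):
    """Recover the wrapped phase in [0, 2*pi) from N phase-shifted samples.

    Assumes I_k = A + B*cos(phi + 2*pi*k/N) with N >= 3 (an assumed model).
    """
    n = len(intensities)
    s = sum(v * math.sin(2 * math.pi * k / n) for k, v in enumerate(intensities))
    c = sum(v * math.cos(2 * math.pi * k / n) for k, v in enumerate(intensities))
    return math.atan2(-s, c) % (2 * math.pi)

def gray_to_binary(g):
    """Convert a Gray-coded integer to its plain binary value."""
    b = g
    g >>= 1
    while g:
        b ^= g
        g >>= 1
    return b

def absolute_phase(intensities, gray_bits):
    """Combine the wrapped phase with the Gray-coded fringe order."""
    g = 0
    for bit in gray_bits:  # most significant bit first
        g = (g << 1) | bit
    order = gray_to_binary(g)
    return wrapped_phase(intensities) + 2 * math.pi * order

def simulate_pixel(x, period=16.0, steps=4, n_bits=4):
    """Hypothetical noise-free observation of projector column x."""
    true_phase = 2 * math.pi * x / period
    intensities = [0.5 + 0.4 * math.cos(true_phase + 2 * math.pi * k / steps)
                   for k in range(steps)]
    order = int(x // period)
    gray = order ^ (order >> 1)  # binary -> Gray code
    bits = [(gray >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]
    return intensities, bits, true_phase

if __name__ == "__main__":
    samples, bits, truth = simulate_pixel(100.3)
    print(absolute_phase(samples, bits), truth)
```

In a real system the Gray code bits come from thresholding captured images, so decoding errors at period boundaries usually need extra handling that this noise-free sketch omits.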
Keywords/Search Tags:stereo matching, 3D vision, convolutional neural network, fringe structured lighting, stereo dataset