Research On Binocular Stereo Matching Algorithm Based On Convolutional Neural Network

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
GTID: 2428330614470745
Subject: Electronic Science and Technology

Abstract/Summary:
As a passive depth perception technology, stereo matching is one of the key problems in computer vision. It recovers the depth of a scene by simulating the human binocular system, and because it combines simple configuration with high accuracy, it has been widely used in emerging technologies such as autonomous driving and drone navigation. The core task of stereo matching is to determine the horizontal position difference (disparity) between corresponding pixels in the binocular image pair, and from it obtain the depth of the object point. Against this background, this thesis combines the stereo matching task with convolutional neural networks to design and optimize algorithms for two scenarios: devices with limited capacity, and applications that require high precision. The main work of this thesis is as follows:

1. For mobile terminals or embedded devices with limited capacity, this thesis proposes a stereo matching network based on multi-scale information extraction. The network uses a spatial pyramid module to fuse feature information from different receptive fields, which strengthens feature extraction in complex regions. The thesis then improves the similarity calculation and proposes a two-dimensional hourglass network that repeatedly aggregates the joint features to compute the similarity at each disparity level. Finally, it proposes a lightweight disparity refinement sub-network based on an attention mechanism, which optimizes the initial disparity map along the channel and spatial dimensions under left-right consistency constraints.

2. For scenes with high precision requirements, this thesis proposes a cascaded stereo matching network based on a multi-information cost volume. The network first generates cascaded features at three scales, then uses independent branch networks to aggregate cost volume information at each scale, and finally merges the feature information of each scale step by step to refine the disparity map. In the feature fusion stage, the thesis proposes a dilated feature fusion unit that repairs the upsampled feature map using a larger receptive field. In the cost volume construction stage, it proposes a multi-information cost volume based on a grouping strategy, which computes the correlation cost volume by feature grouping and introduces a channel attention mechanism to adaptively adjust the relationship between the correlation cost volume and the concatenation cost volume.

To verify the effectiveness of the proposed algorithms, this thesis conducts comparative experiments on three benchmark datasets: SceneFlow, KITTI 2012 and KITTI 2015. With only 2.4 million parameters, the stereo matching network based on multi-scale information extraction still achieves matching accuracy comparable to that of most algorithms. The cascaded stereo matching network based on the multi-information cost volume achieves better matching accuracy. The experimental results demonstrate the effectiveness of the proposed algorithms and show that they have practical application value.
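Two ideas mentioned in the abstract can be illustrated with a short NumPy sketch: converting disparity to depth for a rectified stereo pair, and building a correlation cost volume by feature grouping. This is only an illustration under stated assumptions, not the thesis implementation. The function names, array layouts, and the choice of a per-group mean of channel-wise products as the correlation measure are all hypothetical; the abstract does not specify them.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Depth from disparity for a rectified pair: depth = f * B / d.

    Assumes focal length in pixels and baseline in metres (hypothetical
    parameter names). Pixels with non-positive disparity map to infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def groupwise_correlation_volume(left, right, max_disp, num_groups):
    """Correlation cost volume computed by feature grouping.

    left, right: feature maps of shape (C, H, W).
    Returns an array of shape (num_groups, max_disp, H, W), where each
    entry is the mean channel-wise product within one channel group
    (an assumed correlation measure, not confirmed by the source).
    """
    c, h, w = left.shape
    assert left.shape == right.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    assert max_disp <= w, "disparity range cannot exceed image width"
    channels_per_group = c // num_groups

    volume = np.zeros((num_groups, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        # Left pixel x is compared with right pixel x - d; columns with
        # x < d have no counterpart and keep a cost of zero.
        prod = np.zeros_like(left)
        prod[:, :, d:] = left[:, :, d:] * right[:, :, : w - d]
        grouped = prod.reshape(num_groups, channels_per_group, h, w)
        volume[:, d] = grouped.mean(axis=1)
    return volume
```

In a full network the resulting volume would typically be passed to an aggregation stage, and a concatenation volume could be built alongside it; the weighting between the two by channel attention described in the abstract is not sketched here.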
Keywords/Search Tags:Stereo matching, Convolutional neural network, Disparity estimation, Multi-scale information fusion, Attention mechanism