
Research On Stereo Matching Algorithm Based On Deep Learning

Posted on: 2022-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zhang
GTID: 2518306512471964
Subject: Control Science and Engineering
Abstract/Summary:
Binocular stereo vision is an important passive 3D depth perception technology. Its purpose is to recover the 3D structure of a scene from two images acquired from different viewpoints. It is widely used in industrial measurement, autonomous driving, 3D scene reconstruction, and other fields. The accuracy and computational efficiency of the stereo matching algorithm are key to whether a stereo vision system can operate robustly and in real time. Compared with traditional stereo matching methods based on hand-crafted features, stereo matching algorithms based on deep learning have achieved a significant improvement in accuracy, but their results are still not ideal in ill-posed regions such as weakly textured areas, slanted planes, and depth discontinuities. Moreover, as network structures grow more complex, their memory and compute requirements keep rising, which often makes them difficult to deploy in practical applications. To achieve high-precision stereo matching under limited computing resources, this thesis improves on the classic DispNet network and proposes two end-to-end disparity estimation networks, whose accuracy and robustness are verified through experiments. The work is summarized as follows:

(1) The basic theory of binocular stereo matching and the basic structure of convolutional neural networks are studied. Since stereo matching and image segmentation are both per-pixel dense prediction tasks, three classic image segmentation models are studied in order to better apply deep learning to the stereo matching problem, together with the network structure and implementation principle of the baseline model, DispNet.

(2) A disparity estimation network based on long and short skip connections, referred to as Res-DispnetC, is proposed. The network uses an encoder-decoder structure and adjusts and optimizes the baseline by introducing a residual structure and a cost computation method based on matrix multiplication. Experimental results on the SceneFlow and KITTI datasets show that the method can quickly and accurately generate high-resolution disparity maps, which makes it convenient to apply and deploy in real scenes.

(3) A disparity estimation method based on a hybrid attention mechanism, called Res-DispnetC-scSE, is proposed. By embedding the concurrent spatial and channel squeeze-and-excitation (scSE) module into the disparity estimation network for end-to-end training, important spatial and channel information in the cost volume is adaptively enhanced and the learning ability of the network is improved. Experimental results on the SceneFlow, KITTI, and Middlebury datasets show that the method achieves dense, high-precision disparity prediction in real indoor and outdoor scenes and generalizes well. In addition, prediction accuracy in ill-posed regions such as weakly textured areas and small structures is significantly improved, which demonstrates the robustness and effectiveness of the proposed model.
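The two building blocks named in (2) and (3) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the abstract does not give the exact formulation, so the function names (`correlation_cost_volume`, `scse`), the array layouts, the omission of bias terms, and the choice to sum the two attention branches are all assumptions made for illustration. The first function builds a correlation cost volume with one batched matrix multiplication per image row; the second applies a channel gate and a spatial gate in parallel to a feature tensor such as a cost volume.

```python
import numpy as np


def correlation_cost_volume(feat_left, feat_right, max_disp):
    """Correlation cost volume via matrix multiplication (illustrative).

    feat_left, feat_right: feature maps of shape (C, H, W).
    Returns an array of shape (max_disp, H, W) where entry [d, y, x] is the
    channel-averaged dot product of left (y, x) and right (y, x - d).
    Positions with x - d < 0 are left at zero.
    """
    c, h, w = feat_left.shape
    left = feat_left.transpose(1, 2, 0)    # (H, W, C)
    right = feat_right.transpose(1, 0, 2)  # (H, C, W)
    # Batched matmul: corr[y, xl, xr] = <left[:, y, xl], right[:, y, xr]> / C
    corr = np.matmul(left, right) / c      # (H, W, W)

    cost = np.zeros((max_disp, h, w), dtype=corr.dtype)
    xs = np.arange(w)
    for d in range(max_disp):
        valid = xs[d:]                     # left columns with x - d >= 0
        cost[d, :, d:] = corr[:, valid, valid - d]
    return cost


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def scse(x, w_reduce, w_expand, w_spatial):
    """Concurrent spatial and channel squeeze-and-excitation (illustrative).

    x:         feature tensor of shape (C, H, W)
    w_reduce:  (C // r, C) weights of the channel bottleneck (bias omitted)
    w_expand:  (C, C // r) weights restoring the channel count (bias omitted)
    w_spatial: (C,) weights of a 1x1 convolution to one channel (bias omitted)
    The two branches are combined by element-wise sum, which is an assumption.
    """
    # Channel branch: global average pool, bottleneck MLP, sigmoid gate.
    squeezed = x.mean(axis=(1, 2))                     # (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)      # ReLU
    channel_gate = sigmoid(w_expand @ hidden)          # (C,)
    cse = x * channel_gate[:, None, None]

    # Spatial branch: 1x1 convolution across channels, sigmoid gate.
    spatial_gate = sigmoid(np.tensordot(w_spatial, x, axes=(0, 0)))  # (H, W)
    sse = x * spatial_gate[None, :, :]

    return cse + sse
```

In this sketch the full `(H, W, W)` correlation tensor is materialized for clarity; a practical network would restrict the product to the `max_disp` candidate columns to save memory. In a trained network the weights passed to `scse` would be learned end to end with the rest of the model.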
Keywords/Search Tags:Stereo Matching, Residual Networks, Attention Mechanisms, Disparity Estimation, Deep Learning