
Research On Binocular Stereo Matching Algorithm In Complex Scenes

Posted on: 2022-02-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H W Sang
GTID: 1488306731968579
Subject: Software engineering
Abstract/Summary:
The goal of computer stereo vision is to reconstruct three-dimensional scene information from two-dimensional image data, and it is an important research direction in computer vision. Computer vision technology is widely used in aerospace, autonomous driving, logistics, robot navigation, AR/VR, and other fields. Stereo matching is one of the most promising approaches in stereo vision and the one that most closely resembles human vision. It finds corresponding points in two epipolar-rectified stereo images and then computes the disparity of each pixel to obtain depth information. Stereo matching is an inverse problem: it aims to recover three-dimensional spatial information from two-dimensional stereo images. Because the two-dimensional images are projections of real three-dimensional objects, part of the information is lost in the conversion, which makes the problem ill-conditioned. Occlusion, low-texture areas, repeating patterns, transparent surfaces, reflections, non-diffuse surfaces, perspective foreshortening, and optical and geometric changes make it still harder for existing stereo matching algorithms to obtain accurate three-dimensional information. Newer datasets also introduce complex conditions such as exposure and lighting changes, which makes further gains in accuracy and speed more difficult.

This dissertation takes the four-step decomposition of stereo matching as its starting point: matching cost computation, matching cost aggregation, disparity computation, and disparity refinement. It studies classic algorithms on different datasets, with the aim of providing new ideas and algorithms for the open problems in stereo matching. In the matching cost stage, the robustness of the matching cost needs to be improved against illumination change, noise, missing depth information, and complex scenes. In the cost aggregation stage, the problems include the accumulation of incorrect matching costs, slow convergence, and the loss of detail. In the disparity computation stage, the problems include the uncertainty of the initial disparity plane labels, the ineffective propagation of disparity labels at grid edges, and the poor performance of a single encoder-decoder structure in CNNs. In the disparity refinement stage, the left-right consistency check has limitations that remain to be resolved. To address these problems, the main work and results of this dissertation are as follows.

Firstly, the non-local stereo matching method MST achieves good results on the Middlebury V2 stereo dataset, but it has a high mismatch rate in areas such as weak textures. To solve this problem, this dissertation uses gradient information under a logarithmic transformation, fused with the initial disparity, to compute the matching cost. To address the limitation of the left-right consistency check, it introduces a method that uses the minimum and second-minimum values to perform the check, which further improves prediction accuracy. Building on this, an improved Census algorithm based on TSGO is introduced to reduce the incorrect matching costs and slow convergence that large illumination changes can cause. The proposed method also reduces the high mismatch rate at the interface that arises when non-local and global stereo matching methods are fused. In addition, an adaptive weight method based on color and initial disparity information is proposed to adjust the transmission energy adaptively, which limits the accumulation of incorrect matching costs and significantly reduces the matching error rate.

Secondly, on the Middlebury V3 dataset, which adds lighting and exposure changes, traditional local and global stereo matching tends to produce unsatisfactory matching costs. The Local Exp algorithm, which starts from a CNN-based initial matching cost, achieves good results. However, its RANSAC step suffers from the randomness of the initialized disparity plane and from the ineffective propagation of disparity labels at grid edges. To address the first problem, this dissertation uses a constraint-based selection of high-confidence pixels to obtain a better initial disparity plane, which reduces the high mismatch rate and slow convergence caused by the uncertainty of the disparity plane. To address the second, it introduces a collaborative optimization mechanism between adjacent pixels. The proposed algorithm is verified on the Middlebury V3 online test set, where it ranks first on the website for the 1-pixel error threshold over all pixels.

Thirdly, on the KITTI stereo dataset, which contains more complex road and other scenes, non-end-to-end stereo matching networks cannot effectively learn the scene-specific attributes, which easily leads to a higher error rate. End-to-end networks have made great progress in recent years, but they still show a high mismatch rate in ill-posed regions such as diffuse reflection regions, large weak-texture regions, and repeated-texture regions, and they lose detail during down-sampling. To improve network performance, a dual-channel attention hourglass sub-network is proposed to extract feature maps and recover detail, which mitigates the loss of detail during down-sampling. A U-shaped sub-network based on pixel attention is proposed to perform cost aggregation and to mitigate the loss of dense disparity information across scales. The proposed method is further verified on the KITTI stereo dataset.

Finally, an end-to-end stereo matching network can be divided into four steps: left and right feature extraction, cost volume construction, cost volume regularization, and disparity map prediction. In the feature extraction stage, multi-scale information is not integrated effectively and feature screening does not reliably select the useful information. This dissertation introduces an algorithm that uses channel attention and multi-scale sub-network modules to extract features more efficiently. In the cost volume construction stage, traditional methods build the cost volume in a single way; here the absolute difference between the left and right features is computed along the disparity dimension. In the cost regularization stage, stacking encoder-decoder structures introduces redundant information, so a gate attention module is proposed to improve the efficiency of the network. In the prediction stage, to avoid an unsmooth disparity map, the Soft-Argmin method is used to obtain more robust and smooth sub-pixel disparity values. Experiments on the KITTI stereo dataset verify the proposed algorithm and show good results.

In summary, this research proposes the algorithms MST-GD, TSGO-CD, Local Exp-RC, PASNet, and MPAnet and verifies their validity and precision on the Middlebury and KITTI stereo datasets. It provides effective algorithms for existing problems in stereo matching and lays an algorithmic foundation for further research on high-precision stereo matching.
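The last two steps mentioned above, an absolute-difference cost along the disparity dimension followed by Soft-Argmin, can be illustrated on a single scanline. This is only a minimal sketch under simplifying assumptions, not the dissertation's network: features are plain scalars rather than multi-channel feature maps, no learned cost regularization is applied between the two steps, and the function names `abs_diff_costs` and `soft_argmin` are hypothetical. Soft-Argmin is written in its usual form, the expected disparity under a softmax over negated costs.

```python
import math

def abs_diff_costs(left, right, x, max_disp):
    """Costs for pixel x of the left scanline: |left[x] - right[x - d]|
    for d = 0..max_disp. Disparities that fall outside the right
    scanline get an infinite cost (assumption made for this sketch)."""
    costs = []
    for d in range(max_disp + 1):
        if x - d >= 0:
            costs.append(abs(left[x] - right[x - d]))
        else:
            costs.append(math.inf)
    return costs

def soft_argmin(costs):
    """Sub-pixel disparity: sum over d of d * softmax(-cost)[d].
    The minimum cost is subtracted first for numerical stability."""
    m = min(costs)
    weights = [math.exp(-(c - m)) for c in costs]
    total = sum(weights)
    return sum(d * w / total for d, w in enumerate(weights))

# Toy scanlines: from x = 2 onward the left row equals the right row
# shifted by 2 pixels, so the true disparity there is 2.
right = [10, 50, 20, 80, 30, 60]
left = [0, 0, 10, 50, 20, 80]

disparities = [soft_argmin(abs_diff_costs(left, right, x, max_disp=3))
               for x in range(2, len(left))]
print(disparities)  # each value is close to 2
```

Unlike a hard argmin, the weighted sum is differentiable and can return fractional values, which is why it yields smooth sub-pixel disparities. With two equally good candidates it returns their average, so it works best when the cost distribution has a single clear minimum.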
Keywords/Search Tags: stereo matching, cost aggregation, high-confidence, collaborative optimization, pixel attention, multi-path attentive module