
Optimizing Cost Volume And Aggregating Multiple-output Net In Stereo Matching

Posted on: 2022-04-12  Degree: Master  Type: Thesis
Country: China  Candidate: D C Liu  Full Text: PDF
GTID: 2518306608459254  Subject: Master of Engineering
Abstract/Summary:
Stereo matching is a classic task in computer vision: it establishes correspondences between pixels in a pair of digital images. It is at the core of binocular stereo vision, 3D model reconstruction, and target tracking and recognition, and its applications include, but are not limited to, indoor map reconstruction, autonomous driving, and active obstacle avoidance for drones. In recent years, stereo matching has often been treated as a supervised learning task solved with artificial neural networks. However, most current neural-network-based stereo matching methods face two problems. First, because most network models rely only on the feature extraction capability of convolutional neural networks during matching, they ignore the importance of fusing features across scales, so the matching algorithm tends to focus on local matching. In areas such as weak textures and auxiliary scenes, matching quality degrades for lack of context information. Second, as convolutional stereo matching models grow more complex, their demands on computing power and runtime memory keep increasing.

This thesis studies both problems in depth. After summarizing previous research and analyzing the current state of stereo matching methods, it constructs an end-to-end artificial neural network model for the stereo matching task. The main contributions are as follows:

(1) A stereo matching algorithm that optimizes the cost volume and aggregates multiple outputs is proposed. It improves disparity accuracy in the ill-posed regions that commonly arise in stereo matching and keeps the model stable across complex and diverse matching environments.

(2) Traditional matching cost computation generally measures the gray-level intensity difference between corresponding pixels in the left and right images. This is a distance-based metric, but relying on it alone in complex scenes leads to large errors in the matching cost. By analyzing how the cost volume is constructed and what it physically represents, the thesis proposes a sparse cost volume model that reduces redundant computation and optimizes memory usage.

(3) A cost aggregation module based on dilated (atrous) convolution that aggregates multiple outputs is proposed. Dilated convolution is used to enlarge the receptive field as far as possible so that more feature information takes part in stereo matching. In the final disparity regression stage, the disparity results at four different scales are aggregated into the final disparity map to ensure the generalization and robustness of the model.

(4) To address shortcomings in the disparity estimates of some current stereo matching algorithms, such as discontinuous disparity, the loss function is redesigned and a smoothness control term is introduced into supervised training to smooth the disparity.

The algorithm is implemented with the PyTorch deep learning framework on Ubuntu. The model is trained and tested on public datasets such as KITTI 2015 and Scene Flow, and compared experimentally with existing stereo matching algorithms. The results show that the proposed algorithm not only performs better in weakly matched areas but also significantly reduces running time and memory usage, which verifies its effectiveness.
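To make the idea in contribution (2) concrete, the following is a minimal sketch of an intensity-difference matching cost on a single image row, computed once densely and once over a subsampled set of disparities. The abstract does not say how the sparse cost volume is built; sampling every `step`-th disparity is only an assumption used here for illustration, and the function names and toy pixel values are hypothetical.

```python
def scanline_cost_volume(left, right, max_disp, step=1):
    """Absolute-difference matching cost for one image row.

    Returns {d: [cost at each x]} for d in range(0, max_disp, step).
    A left pixel at x is compared with the right pixel at x - d.
    Positions where x - d falls outside the row get cost None.
    step > 1 is an ASSUMED way to make the volume sparse.
    """
    volume = {}
    for d in range(0, max_disp, step):
        costs = []
        for x in range(len(left)):
            if x - d < 0:
                costs.append(None)
            else:
                costs.append(abs(left[x] - right[x - d]))
        volume[d] = costs
    return volume


def winner_take_all(volume, x):
    """Pick the sampled disparity with the lowest cost at column x."""
    candidates = [(costs[x], d) for d, costs in volume.items()
                  if costs[x] is not None]
    return min(candidates)[1]


# Toy rows: the pattern 50, 90, 30 is shifted by two pixels.
left = [10, 10, 50, 90, 30, 10, 10, 10]
right = [50, 90, 30, 10, 10, 10, 10, 10]

dense = scanline_cost_volume(left, right, max_disp=4)           # d = 0, 1, 2, 3
sparse = scanline_cost_volume(left, right, max_disp=4, step=2)  # d = 0, 2

print(len(dense), len(sparse))
print(winner_take_all(dense, 3), winner_take_all(sparse, 3))
```

In this sketch the sparse volume stores half as many disparity slices as the dense one. Whether a coarser sampling still contains the true disparity depends on the scene, which is presumably why a learned network refines the result afterwards.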
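The effect of dilated convolution described in contribution (3) can be checked with simple arithmetic. The sketch below computes the receptive field of a stack of stride-1 convolutions and then combines several per-scale disparity estimates for one pixel. The dilation rates, the use of soft-argmin regression, and plain averaging of the four outputs are assumptions made for illustration; the abstract does not specify them.

```python
import math


def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels) of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def soft_argmin(costs):
    """Expected disparity under softmax(-cost), an ASSUMED regression rule."""
    lowest = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


def aggregate_outputs(disparities):
    """Average per-scale disparity estimates (ASSUMED aggregation rule)."""
    return sum(disparities) / len(disparities)


# Four 3x3 layers: ordinary vs. hypothetical dilation rates 1, 2, 4, 8.
print(receptive_field(3, [1, 1, 1, 1]))
print(receptive_field(3, [1, 2, 4, 8]))

# One pixel, four scales, each with its own cost curve over disparities 0..3.
per_scale_costs = [
    [5.0, 5.0, 0.0, 5.0],
    [4.0, 3.0, 0.0, 3.0],
    [6.0, 2.0, 0.0, 2.0],
    [5.0, 4.0, 0.0, 4.0],
]
final_disparity = aggregate_outputs([soft_argmin(c) for c in per_scale_costs])
print(final_disparity)
```

With the same number of layers and parameters, the dilated stack covers a much wider window, which is the property the abstract relies on to bring more context into matching.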
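Contribution (4) adds a smoothness term to the supervised loss. The exact form used in the thesis is not given in the abstract, so the following is only a sketch under stated assumptions: a smooth L1 data term plus the mean absolute difference between neighbouring disparities, combined with a hypothetical weight `smooth_weight`.

```python
def smooth_l1(x):
    """Smooth L1 (Huber-style) penalty with threshold 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def data_term(pred, target):
    """Mean smooth L1 error between predicted and ground-truth disparity."""
    n = sum(len(row) for row in pred)
    total = sum(smooth_l1(p - t)
                for pred_row, target_row in zip(pred, target)
                for p, t in zip(pred_row, target_row))
    return total / n


def smoothness_term(pred):
    """Mean absolute difference between horizontal and vertical neighbours."""
    diffs = []
    for y, row in enumerate(pred):
        for x, value in enumerate(row):
            if x + 1 < len(row):
                diffs.append(abs(row[x + 1] - value))
            if y + 1 < len(pred):
                diffs.append(abs(pred[y + 1][x] - value))
    return sum(diffs) / len(diffs) if diffs else 0.0


def total_loss(pred, target, smooth_weight=0.1):
    """Data term plus weighted smoothness term (weight is hypothetical)."""
    return data_term(pred, target) + smooth_weight * smoothness_term(pred)


target = [[2.0, 2.0], [2.0, 2.0]]
flat = [[2.0, 2.0], [2.0, 2.0]]
noisy = [[2.0, 5.0], [2.0, 2.0]]  # one outlier breaks local continuity

print(total_loss(flat, target))
print(total_loss(noisy, target))
```

A plain neighbour-difference penalty like this one also discourages genuine depth edges; practical variants often weight the term by image gradients, which the abstract neither confirms nor rules out.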
Keywords/Search Tags:binocular stereo vision, neural networks, stereo matching, disparity