Research On Stereo Matching Algorithms Fusing Segmentation Clues In Deep Learning Framework

Posted on: 2020-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
GTID: 2428330623956432
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence (AI) technologies and their applications in industry, there is an increasing demand for natural interaction between intelligent agents and their surrounding scenes. Accurate perception of scene depth is key to intelligent interaction. For example, during autonomous driving, intelligent vehicles need the depth of the scene ahead in order to drive while avoiding obstacles; robots need the relative distances to the objects around them for path planning or for interacting with those objects. Stereo matching is one of the key technologies for obtaining scene depth information. Compared with other scene depth perception methods such as laser scanners, stereo matching is cheap and applicable to measuring targets at any distance. Stereo matching is a hot research topic in computer vision and a key part of other topics in the field, e.g. 3D reconstruction and SLAM (simultaneous localization and mapping).

Although many methods have been proposed and recent deep learning technologies bring new opportunities, existing stereo matching methods still have the following problems:

1. Traditional energy minimization stereo matching methods normally design their energy function under the assumption of colour consistency. This assumption is not robust enough and is prone to errors in situations such as changes in scene illumination. In addition, the traditional neighbourhood constraints used in the energy function do not perform well in areas where scene depth changes sharply.
2. Existing end-to-end stereo matching networks often originate from network structures designed for other computer vision tasks. This often ignores the differences between the stereo matching task and those tasks. Moreover, the large number of parameters in the 3D convolutions used by existing networks makes it difficult to enlarge the convolution kernel size, and therefore the receptive field.
3. Stereo matching algorithms based on deep learning ignore the structure information of objects, and their disparity results are poor in contour areas. At the same time, due to the limitations of existing data sets, the performance of these algorithms is limited in areas that lack sufficient training data.

To address the above problems, we present the following work:

1. To overcome the shortcomings of traditional energy optimization stereo matching, we fuse colour cues with segmentation cues to design the energy function under the framework of energy optimization. The function consists of a convolutional neural network data term, which uses colour cues, and a segmentation constraint term, which uses segmentation cues. We then minimize the function in a Markov random field (MRF) to obtain the final disparity result. The experimental results show that our energy function has significant advantages over the traditional energy function, and achieves accuracy comparable to other algorithms from the same period.
2. To address the problems that existing end-to-end deep learning networks do not consider the properties of stereo matching, and that a large kernel size is difficult to obtain for 3D convolution, we first analyse the properties of the stereo matching task and then design a network structure suited to it. At the same time, we propose separable 3D convolution to avoid the parameter explosion caused by increasing the convolution kernel size. The effectiveness of the proposed structure and of separable 3D convolution is verified by comparison experiments.
3. To address the problem that the disparity results of existing stereo matching networks are easily confused in contour regions and in regions lacking training data, we propose a joint refinement network for the segmentation task and the stereo matching task. Our recurrent module joins the two tasks through a ConvLSTM structure. In addition, the left-right consistency features of the disparity and segmentation maps are extracted during the recurrent process to improve the accuracy of both the disparity results and the semantic segmentation results. Experiments show that our refinement network achieves joint refinement of the two tasks, and reduces confusion in object boundary areas and in areas lacking training data.
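
To illustrate the parameter explosion mentioned in connection with 3D convolution, the following minimal sketch compares the weight count of a dense k x k x k kernel with that of one possible separable form. The abstract does not say how the thesis factorizes the kernel, so the split into three 1D convolutions used here, the function names, and the channel count of 32 are all assumptions made for illustration only.

```python
def full_conv3d_params(c_in, c_out, k):
    """Weights in a dense k x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3


def factorized_conv3d_params(c_in, c_out, k):
    """Weights when the kernel is split into three 1D convolutions.

    Assumed factorization (not confirmed by the source): a (k, 1, 1)
    convolution maps c_in -> c_out, then (1, k, 1) and (1, 1, k)
    convolutions each map c_out -> c_out. Bias terms are ignored.
    """
    return c_in * c_out * k + 2 * c_out * c_out * k


# Hypothetical channel count; the dense kernel grows with k**3,
# the factorized one only linearly in k.
for k in (3, 5, 7):
    full = full_conv3d_params(32, 32, k)
    split = factorized_conv3d_params(32, 32, k)
    print(f"k={k}: full={full}, factorized={split}, ratio={full / split:.1f}")
```

Under this assumed factorization with equal input and output channels, the saving grows as k squared over 3, which is why a larger kernel becomes affordable.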
Keywords/Search Tags:stereo matching, deep learning, convolutional neural network, semantic segmentation, disparity