
Research On Stereo Matching Based On Convolutional Neural Network

Posted on: 2020-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2428330599959603
Subject: Information and Communication Engineering
Abstract/Summary:
Binocular vision acquires depth information at low cost and is flexible and easy to implement, so it is widely used in advanced applications such as robot navigation, autonomous driving and augmented reality. The key step in binocular vision is to obtain the disparity by finding corresponding points in the two images; the depth of the scene can then be calculated by triangulation. Recently, convolutional neural networks have achieved breakthroughs in many computer vision tasks thanks to their powerful feature extraction and model representation capabilities, and stereo matching based on deep learning has become an active research topic. This thesis studies stereo matching with an end-to-end convolutional neural network, focusing on difficult regions that are prone to mismatching, such as occluded areas, reflective surfaces and textureless regions. For these difficult regions, an important idea is to use the context information of pixels for disparity inference. To make full use of context information, this thesis proposes an end-to-end stereo matching algorithm that exploits hierarchical context information. In the feature extraction stage, a spatial pyramid pooling module extracts hierarchical context information at different scales and locations to accommodate different matching regions. In the matching cost optimization stage, 3D convolutions are arranged in an encoder-decoder architecture, and multi-scale context information is aggregated through skip connections. The proposed network directly outputs the refined disparity map without any additional post-processing. In addition, this thesis analyzes the imbalance of training samples between occluded and non-occluded regions in deep learning methods. To address this problem, a regression focal loss function is proposed for supervised training. It adaptively adjusts the per-sample loss during training, suppresses the loss of well-estimated samples, makes the model focus on samples that are difficult to estimate and prevents the model from degrading. The algorithm is evaluated on the Middlebury 2014 dataset. The results show that the proposed stereo matching algorithm effectively improves disparity estimation accuracy, especially in occluded regions.
Keywords/Search Tags: binocular vision, stereo matching, convolutional neural network, context information, loss function
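
The triangulation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a rectified stereo pair, for which depth follows the standard relation Z = f * B / d (focal length f in pixels, baseline B in metres, disparity d in pixels). The function name `disparity_to_depth` and its parameters are hypothetical and chosen for illustration; they are not taken from the thesis.

```python
import numpy as np


def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (metres).

    Assumes a rectified stereo pair, so that Z = f * B / d.
    Pixels with non-positive disparity have no valid match and are
    assigned infinite depth (an illustrative convention, not the thesis's).
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth


# Example: a 2x2 disparity map with one invalid (zero) disparity.
example_depth = disparity_to_depth(
    [[10.0, 20.0], [0.0, 50.0]], focal_length_px=1000.0, baseline_m=0.1
)
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error for distant points than for near ones, which is one reason disparity accuracy in difficult regions matters.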