VLSI Architecture Design For Convolutional Neural Network Based Binocular Stereo Matching

Posted on: 2019-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z W Li
GTID: 2428330542999277
Subject: Electronic Science and Technology
Abstract/Summary:
Stereo vision is widely used to reconstruct the 3D information of a scene from two-dimensional images captured by a stereo camera. Its applications include unmanned aerial vehicle navigation, autonomous driving and 3D scene reconstruction. Binocular stereo vision in particular follows the processing principle of the human eyes: a binocular camera captures images of the same scene from two different angles, and the 3D geometry of an object can then be calculated accurately from the disparity by triangulation. Binocular stereo matching, a key step in a binocular stereo vision system, is the focus of this thesis. A VLSI architecture for Convolutional Neural Network (CNN) based binocular stereo matching is proposed and verified on an FPGA. The main contributions are as follows:

(1) A VLSI architecture for CNN based binocular stereo matching is designed. The architecture comprises several modules, including a CNN made up of several convolutional layers, a semi-global matching algorithm that aggregates matching costs along 5 paths, disparity computation, and disparity optimization. Two clock frequencies are adopted according to the complexity of each module. The fast clock domain drives the CNN, which calculates the initial matching cost and has high complexity; the slow clock domain drives the remaining 3 steps of the stereo matching algorithm, which have lower complexity. With time division multiplexing (TDM) and multiple clock domains, the hardware resource consumption of the system is reduced effectively.

(2) For a specific CNN containing multiple convolutional layers, a dedicated pipelined VLSI architecture is proposed in which all layers compute in parallel; it requires little on-chip cache and achieves high throughput. This thesis also extends the VLSI architecture design idea of Eyeriss to accelerate a general single convolutional layer, and implements a universal VLSI architecture using hardware resources similar to those of the CNN-specific architecture. At the same working frequency, experimental results show that the throughput of the extended universal single-layer architecture is 1.58 times that of the CNN-specific architecture, while its required on-chip cache size and data bus bandwidth are 6.5 times and 6.3 times those of the CNN-specific architecture, respectively.

(3) In the disparity refinement step, a speckle filter, a median filter and hole filling are adopted. In contrast to the traditional speckle filter with a fixed window size, this thesis proposes a speckle filter with an adaptive window size, which varies with the disparity value of the center pixel and the location of the filter window, to further improve the matching precision. A bit-level comparison method is adopted for the 7 x 7 median filter: the 49 numbers are compared bit by bit from the most significant bit to the least significant bit, and the median value is selected once all bits have been compared. The larger the window of the median filter, the more logic resources the bit-level comparison method saves compared with the common fast comparison method.

The average error rate of the VLSI architecture for CNN based binocular stereo matching is 7.74% when tested on the KITTI benchmark. The maximum working frequency on the Xilinx VC707 FPGA development board reaches 208 MHz, and the throughput is 1240 x 376 at 28 fps, which meets the real-time requirement of embedded applications.
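The bit-level comparison used for the 7 x 7 median filter can be illustrated in software. The sketch below is not taken from the thesis; it assumes the commonly used majority-vote form of a bit-serial median, in which each median bit is decided from the most significant bit downward and any value that disagrees with the decided bit is marked as known-larger or known-smaller for the remaining bits. The function name `bitwise_median`, the `width` parameter and the 8-bit default are hypothetical choices made for illustration.

```python
def bitwise_median(values, width=8):
    """Median of an odd-length list of unsigned integers, found bit by bit.

    Assumed scheme (majority vote per bit, MSB first); the thesis does not
    spell out its exact hardware logic.
    """
    n = len(values)
    if n % 2 == 0:
        raise ValueError("expected an odd number of values, e.g. 49 for 7 x 7")

    # state[i]:  0 = still tied with the median prefix,
    #           +1 = already known to be larger than the median,
    #           -1 = already known to be smaller than the median.
    state = [0] * n
    median = 0

    for bit in range(width - 1, -1, -1):
        # Known-larger values vote 1, known-smaller values vote 0,
        # undecided values vote with their actual bit.
        ones = 0
        for v, s in zip(values, state):
            if s == 1:
                ones += 1
            elif s == 0:
                ones += (v >> bit) & 1

        # The median bit is the majority bit.
        m = 1 if ones > n // 2 else 0
        median = (median << 1) | m

        # Undecided values that disagree with the median bit become decided.
        for i, v in enumerate(values):
            if state[i] == 0:
                b = (v >> bit) & 1
                if b != m:
                    state[i] = 1 if b == 1 else -1

    return median


if __name__ == "__main__":
    # A 7 x 7 window flattened to 49 eight-bit values (arbitrary test data).
    window = [(i * 37) % 256 for i in range(49)]
    print(bitwise_median(window), sorted(window)[24])
```

Each bit position needs only a population count and a threshold comparison across the window, rather than a sorting network, which is consistent with the abstract's observation that the savings grow with window size.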
Keywords/Search Tags:binocular stereo matching, CNN, TDM, FPGA