Research On Key Techniques For Real-Time Binocular Computational Stereo Vision

Posted on: 2020-01-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D L Zha
GTID: 1368330575966333
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Vision is the most important means by which humans perceive the world. People have long hoped that machines could acquire the functions of the human visual system to sense their environment and achieve automation. Computer vision is the scientific field that uses computers to gain a high-level understanding of digital images or videos. It differs from traditional digital image processing in that it seeks to extract three-dimensional structure from images in order to understand the scene comprehensively. Binocular stereo vision, an important branch of computer vision, is the three-dimensional information acquisition method closest to the human visual system. It calculates the disparity between two images acquired at different positions and from it obtains the three-dimensional information of the object. With its low cost, wide range of usable scenes, and high reliability, binocular stereo vision is widely used in autonomous driving, home intelligent robots, industrial automation, automatic monitoring systems, and other fields. These applications require accurate three-dimensional information about objects within a very short time, which demands real-time, accurate processing, with power consumption and volume reduced as much as possible.

This dissertation addresses the demand for real-time, accurate, and miniaturized binocular stereo vision in general stereo vision applications. The research focuses on three topics: optimization of the binocular stereo vision algorithm, a hardware acceleration scheme for depth extraction, and high-resolution image reconstruction from low-resolution acquisition devices. The main research work and innovations of this dissertation are as follows.

(1) In edge computing scenarios, the processor must deliver high processing efficiency while keeping power consumption and cost low. Therefore, this
dissertation proposes a novel stereo matching algorithm that balances computational complexity, memory overhead, and matching precision. To select an appropriate stereo matching algorithm, current local and global algorithms are studied in depth. We found that local algorithms have low matching precision and global algorithms have high computational complexity, so neither meets the application scenario's requirements for real-time performance and accuracy. To resolve this conflict between current algorithms and application requirements, a tile-based fixed cross-tree algorithm is proposed. The high-resolution image is divided into multiple tiles, each processed with a fixed horizontal and vertical tree structure, which solves the algorithm's large memory overhead; a two-step recognition with segment weighting handles depth-discontinuous regions, which are otherwise difficult to process. Experimental results show that the proposed algorithm has an average error of 5.45% on version 2 of the Middlebury standard stereo matching test set. Its matching accuracy is close to that of global algorithms, while its computational complexity and memory overhead are much smaller. The proposed algorithm balances computational complexity, memory overhead, and matching precision, and offers high configurability, good robustness, and real-time performance, making it suitable for practical applications.

(2) This dissertation proposes a reconfigurable, high-precision, real-time stereo matching scheme. After a comprehensive analysis of current mainstream acceleration platforms for binocular stereo vision and of the power and cost requirements of practical application scenarios, we chose an FPGA as the implementation platform. The parallel architecture of the FPGA is used to accelerate the tile-based fixed cross-tree algorithm to
realize a real-time stereo matching system with reconfigurable resolution. To cope with the high complexity of the algorithm, we process the even and odd columns in parallel and apply ping-pong operation to horizontal-tree and vertical-tree aggregation. A new implementation of ping-pong operation is proposed that uses a single RAM with a special architecture; it effectively improves pipeline efficiency without increasing memory overhead. With parallelism in two directions, the design processes two pixels per cycle. We implemented and verified the scheme on a single Kintex-7 FPGA. At a 160 MHz system clock, the system processes 1920×1680 images at 30 fps with a maximum disparity range of 60 pixels, which corresponds to 5806 million disparity estimations per second (MDE/s). The average error on version 2 of the Middlebury standard stereo matching test set is 5.68%. Compared with existing implementations, the proposed stereo matching implementation is configurable, supports high resolution, and produces depth images rich in detail. Verification on the FPGA shows that the scheme can provide real-time, accurate, high-resolution depth images and has broad application prospects in complex scenes such as autonomous driving.

(3) An FPGA-based super-resolution scheme is proposed to provide high-resolution input for the binocular stereo vision system. To address the low resolution of image acquisition equipment, which causes partial loss of detail, the scheme reconstructs high-resolution images from low-resolution acquired images using a learning-based super-resolution algorithm. The fast super-resolution algorithm based on local linear regression is improved: the Hamming distance replaces the Euclidean distance as the matching function, and the low-resolution image is divided into multiple tiles for parallel processing, which greatly reduces the computation of the matching function under the premise
of accuracy and robustness. The super-resolution system is implemented on a single Xilinx Virtex-7 FPGA. A 6-read-6-write RAM architecture, built by combining three dual-port RAMs, is proposed to perform the multiply-accumulate of a 6×6 matrix. At a 100 MHz system clock, the system outputs 3840×2160 images at 85 fps with a scale factor of 2, a processing speed of about 700 Mpixel/s. On the standard super-resolution test sets Set5, Set14, Kodak, and BSD100, the average structural similarity reaches 0.89, meeting the need for fast and accurate super-resolution.

(4) We design a stereo matching system IP. Based on the FPGA verification system, we implement low-power, lightweight, real-time binocular stereo vision in the SMIC 40 nm process. Left- and right-view image data are collected in LVDS dual-pixel mode, and binocular rectification is performed before stereo matching. The stereo matching core calculates the disparity, which is stored in off-chip DDR3 SDRAM. The output is ultimately in LVDS form. The total gate count is about 17867k.
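
The throughput figures quoted in (2) and (3) can be cross-checked with simple arithmetic. The sketch below assumes the common convention that MDE/s equals width × height × frame rate × disparity range, divided by one million; the abstract does not state its own definition, so this is an assumption. The function names are hypothetical and introduced only for illustration.

```python
def mde_per_second(width, height, fps, disparity_range):
    """Million disparity estimations per second (assumed definition:
    pixels per frame x frames per second x disparity candidates / 1e6)."""
    return width * height * fps * disparity_range / 1e6


def mpixels_per_second(width, height, fps):
    """Pixel throughput in millions of pixels per second."""
    return width * height * fps / 1e6


# Stereo matching system from (2): 1920x1680, 30 fps, 60-pixel disparity range.
stereo_mde = mde_per_second(1920, 1680, 30, 60)
stereo_mpix = mpixels_per_second(1920, 1680, 30)

# Upper bound implied by two pixels per cycle at a 160 MHz clock.
peak_mpix = 2 * 160e6 / 1e6

# Super-resolution system from (3): 3840x2160 output at 85 fps.
sr_mpix = mpixels_per_second(3840, 2160, 85)

print(f"stereo throughput: {stereo_mde:.2f} MDE/s")
print(f"stereo pixel rate: {stereo_mpix:.3f} of {peak_mpix:.0f} Mpixel/s peak")
print(f"super-resolution output rate: {sr_mpix:.3f} Mpixel/s")
```

Under this assumed definition the stereo figure comes to 5806.08 MDE/s, which agrees with the 5806 reported, and the super-resolution output rate comes to about 705 Mpixel/s, consistent with the rounded 700 Mpixel/s. The required stereo pixel rate is well below the two-pixels-per-cycle peak, which leaves room for blanking and tile overlap, although the abstract does not say how that margin is used.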
Keywords/Search Tags: Binocular computational stereo vision, stereo matching, FPGA acceleration, super resolution