
Stereo Image And Video Rectification Based On Deep Neural Networks

Posted on: 2022-02-11 | Degree: Master | Type: Thesis
Country: China | Candidate: M Li
GTID: 2518306536488364 | Subject: Master of Engineering
Abstract/Summary:
With the popularization of 3D technology, 3D games and 3D movies have become an important form of everyday entertainment. However, viewers can suffer physical discomfort such as eye fatigue and headaches after watching some 3D videos for a long time, and vertical disparity is one of the main causes of this visual fatigue. 3D images and videos therefore need post-processing after shooting. Traditional stereo rectification methods apply a projective transformation to the left and right views separately so that corresponding epipolar lines lie on the same horizontal line. These methods often require an accurate estimate of the fundamental matrix, and the rectified results may be severely distorted and need to be cropped, which loses resolution. Traditional view synthesis methods for stereo rectification are complicated, involve multiple computer vision tasks, and perform poorly on complex images. This thesis proposes a view synthesis approach to stereo image rectification based on disparity estimation and image inpainting networks. First, a disparity estimation network for unrectified stereo images is designed to estimate horizontal and vertical disparity without camera parameters. Then, based on the estimated disparity maps, incomplete virtual right-view images are synthesized through image warping. Because the incomplete images are strongly correlated with the original left and right views, a reference-view-based image inpainting network is proposed, which contains a guidance module that extracts semantic information for inpainting. With these two networks, the view synthesis method reduces the average vertical disparity error to 0.521 px on the Movie dataset and obtains an average score above 3.5 in the subjective image quality assessment, with 70% of its images rated better than those of the other methods.

For stereo video rectification, this thesis proposes a method based on a spatial-temporal fused stereo video frame synthesis network, with the aim of further improving the quality of the rectified stereo videos. The method keeps the left-view video frames unchanged and synthesizes the rectified right-view video frames. A joint disparity-flow estimation network is first proposed to perform both disparity estimation and optical flow estimation, which avoids the added complexity of designing two separate networks. A simple geometric transform is then used to generate the geometric correspondence between the original frames and the rectified right-view frame to be synthesized. Subsequently, a spatial-temporal fused stereo video frame synthesis network fuses the spatial-temporal information of the input frames and generates the rectified right-view frames. The synthesized video frames reach a PSNR of 43.76 and an SSIM of 0.9987, both reported as the best results. After rectification, the average vertical disparity error is reduced to 0.526 px, and only 4.13% of pixels have a vertical disparity greater than 1 px.
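The warping step and the two vertical disparity figures quoted above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `warp_left_to_virtual_right` and `vertical_disparity_stats` are hypothetical, disparity is assumed to be defined as the right-view position minus the left-view position, and the virtual right view is assumed to be built by shifting left-view pixels along their rows by the horizontal disparity only, so that vertical disparity is zero by construction. Pixels that receive no value are the holes that an inpainting network would later fill.

```python
def warp_left_to_virtual_right(left, disp_x):
    """Forward-warp the left view along rows (illustrative sketch only).

    left:   H x W nested list of pixel values.
    disp_x: H x W nested list of horizontal disparities, ASSUMED to be
            right_x - left_x for each left-view pixel.
    Returns (virtual_right, hole_mask); unfilled pixels are None and are
    marked True in hole_mask.
    """
    h, w = len(left), len(left[0])
    virtual = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Keep the row y unchanged: the synthesized view has no vertical shift.
            tx = x + int(round(disp_x[y][x]))
            if 0 <= tx < w:
                # No depth ordering here: on collisions the last write wins.
                virtual[y][tx] = left[y][x]
    hole_mask = [[virtual[y][x] is None for x in range(w)] for y in range(h)]
    return virtual, hole_mask


def vertical_disparity_stats(disp_y, threshold=1.0):
    """Mean absolute vertical disparity and the share of pixels above threshold.

    disp_y: H x W nested list of vertical disparities in pixels.
    """
    values = [abs(v) for row in disp_y for v in row]
    mean_abs = sum(values) / len(values)
    ratio_above = sum(v > threshold for v in values) / len(values)
    return mean_abs, ratio_above


# Tiny usage example with made-up values.
left_view = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
horizontal_disp = [[-1, -1, -1, -1],
                   [0, 0, 0, 0]]
virtual_right, holes = warp_left_to_virtual_right(left_view, horizontal_disp)
mean_err, ratio = vertical_disparity_stats([[0.5, -2.0], [0.0, 1.5]])
```

In the toy example the first row shifts one pixel to the left, leaving a hole at its right edge, while the second row is copied unchanged. A practical version would resolve collisions by depth ordering and handle sub-pixel disparities by interpolation; both are omitted here for brevity.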
Keywords/Search Tags:vertical disparity, stereo image rectification, stereo video rectification