
2D-3D Image Conversion Method Based on Saliency Detection

Posted on: 2021-02-07 | Degree: Master | Type: Thesis
Country: China | Candidate: K Cai | Full Text: PDF
GTID: 2428330620470571 | Subject: Computer Science and Technology
Abstract/Summary:
Depth-map-based methods still dominate 2D-3D conversion. However, depth capture equipment is expensive and depth cues are difficult to obtain in some scenes, so making fully automatic 3D content generation practical for large-scale application remains a challenge. Moreover, most current 2D-3D image conversion models target only one kind of input, either still or moving images, and do not address static and dynamic generation requirements together. To address these problems, this thesis studies 2D-3D image conversion in the following two aspects:

(1) To overcome the problems of device cost and depth cue extraction, a method that uses visual attention analysis to convert 2D content to 3D is proposed. The saliency map produced by saliency detection highlights the objects or regions most likely to attract the human eye. The method exploits this mechanism to give the viewer an enjoyable 3D effect within the comfort zone of the human eye. For 3D content generation, the method builds on traditional techniques and integrates parallax computation, hole filling, and 3D content rendering into one model to improve its generalization and usability.

(2) Two deep-learning-based saliency detection models are applied to meet the static and dynamic requirements of 3D image generation. For still images, a Fully Convolutional Network (FCN) generates a coarse saliency map, which is then refined by a Conditional Random Field (CRF) to obtain the final static saliency map. This detection approach does not require advanced hardware and can be deployed on intelligent devices. For dynamic images, the algorithm uses dilated convolution to extract multi-scale spatial features, which are fed into a spatiotemporal feature extraction module to obtain a sequence of saliency maps that fuses temporal and spatial information. This module uses Convolutional Long Short-Term Memory (ConvLSTM) units, which are well suited to processing time-ordered image sequences and therefore to dynamic 3D content generation.

Comparisons with popular saliency detection approaches and conventional 2D-3D image conversion methods show that the proposed method performs better on some widely used datasets. Inspection of the rendered 3D content indicates that the method generates 3D content with a good viewing experience, which supports its validity and feasibility.
Keywords/Search Tags:2D-3D image conversion, Saliency detection, Fully convolutional network, Conditional random field, Dilated convolution, Convolutional long short-term memory
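
The abstract names three rendering stages (parallax computation, hole filling, and 3D content rendering) without giving their details. The following is a minimal sketch of how such a chain could look, not the thesis's implementation. It assumes a linear mapping from saliency to pixel shift, a left-shifted right-eye view, and background-side hole filling. All function names and the `max_disp` parameter are hypothetical.

```python
from typing import List, Optional

Row = List[float]


def saliency_to_disparity(saliency: List[Row], max_disp: int) -> List[List[int]]:
    """Map saliency in [0, 1] to an integer pixel shift (assumed linear mapping)."""
    return [[round(s * max_disp) for s in row] for row in saliency]


def warp_row(pixels: Row, disp: List[int]) -> List[Optional[float]]:
    """Forward-warp one scanline to a right-eye view; holes are left as None."""
    width = len(pixels)
    out: List[Optional[float]] = [None] * width
    best = [-1] * width  # largest disparity written so far at each target column
    for x in range(width):
        tx = x - disp[x]  # salient (near) pixels shift further to the left
        if 0 <= tx < width and disp[x] > best[tx]:
            out[tx] = pixels[x]  # the nearer pixel wins when two pixels collide
            best[tx] = disp[x]
    return out


def fill_holes(row: List[Optional[float]]) -> Row:
    """Fill holes from the right (background side), then from the left."""
    filled = list(row)
    last = None
    # Pass 1: right to left, copy the nearest valid pixel on the right.
    for x in range(len(filled) - 1, -1, -1):
        if filled[x] is None:
            filled[x] = last
        else:
            last = filled[x]
    last = None
    # Pass 2: left to right, for holes that touch the right border.
    for x in range(len(filled)):
        if filled[x] is None:
            filled[x] = last if last is not None else 0.0
        else:
            last = filled[x]
    return filled


def synthesize_right_view(image: List[Row], saliency: List[Row],
                          max_disp: int = 2) -> List[Row]:
    """Chain the three stages: parallax computation, warping, hole filling."""
    disparity = saliency_to_disparity(saliency, max_disp)
    return [fill_holes(warp_row(p, d)) for p, d in zip(image, disparity)]
```

Pairing the original image (as the left view) with the synthesized right view gives a stereo pair. Filling from the right is a design choice of this sketch: in a left-shifted view, holes open on the right edge of salient objects, where the neighbouring valid pixels belong to the background.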