Light Field Compression And Segmentation Based On Inter-view Similarity

Posted on: 2021-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
GTID: 2428330602498965
Subject: Information and Communication Engineering
Abstract/Summary:
A light field is a new data structure for describing a 3-D scene: a single light field image records both the spatial and the angular information of the light in the scene. Light fields can be used for 3-D reconstruction, depth estimation, refocusing and so on, and many games and wearable devices use them to give a more natural experience of simulated 3-D scenes. A light field image can be captured with ordinary cameras, either by scanning a camera across the scene or by arranging several cameras in an array. Alternatively, a dedicated plenoptic camera can be used; it captures the scene through a microlens array at different spatial and angular parameters, so that one image contains 4-D information. A light field image is much larger than an ordinary image and contains a large amount of spatial redundancy, and much basic work remains to be done on storing and visualizing such images.

In recent years, a light field image is usually decomposed into a pseudo sequence before further processing. The decomposed image is a 2-D array of frames in which spatially adjacent frames are highly similar, and scanning this array converts it into a linear structure, the pseudo sequence. The spatial and angular structure of the frames is much clearer in a pseudo sequence, and the spatial and angular parameters vary only slightly from one frame to the next. This thesis discusses how to improve the performance of compression and semantic segmentation algorithms on pseudo sequences by describing inter-view similarity more accurately.

For the compression of pseudo sequences, we use an HEVC-based coding structure and adjust the coding order, the quantization parameters (QPs) and the reference frame selection scheme to improve the coding result. We adopt the idea of the 2-D hierarchical coding structure, which arranges the coding order hierarchically in the 2-D case by building on the well-established 1-D hierarchical B coding structure. QPs are assigned according to the spatial parameters of the frames. We then use SIFT descriptors, instead of spatial parameters, to evaluate the similarity between frames, and design an adaptive reference frame selection scheme based on the resulting similarity score. We test the method on the given dataset.

For semantic segmentation, we adopt a video segmentation framework, because the frames change continuously along the scanning order. Since an object varies only slightly between frames, we propose a time-invariant feature model to evaluate the semantic similarity between frames. Our network combines the U-Net structure with a Siamese network to extract the time-invariant feature, with down-sampling and max pooling handling local deformation, and the loss function is built on the time-invariant feature model. The model fails when there is global motion between frames, so we design two further structures to address this problem. One is based on LSTM and allows slight variation in the features. The other is based on reinforcement learning and handles the global motion before the original network is trained. We test our networks on the DAVIS-2016 dataset.
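
As a rough illustration of the pseudo-sequence idea described in the abstract, the Python sketch below orders a grid of views with a serpentine scan and then picks reference frames for a given frame by a similarity score. The serpentine order, the function names and the grid-distance score are assumptions made for this example only. The thesis itself uses a 2-D hierarchical coding order and a SIFT-based score, neither of which is reproduced here.

```python
from typing import Callable, List, Tuple

# (row, column) index of one view in the 2-D grid of decomposed frames.
View = Tuple[int, int]


def serpentine_scan(rows: int, cols: int) -> List[View]:
    """Order a rows x cols grid of views into a linear pseudo sequence.

    Even rows run left to right and odd rows right to left, so consecutive
    frames in the sequence are always neighbours in the grid.
    (Assumed scan order; the abstract only says "scanning".)
    """
    order: List[View] = []
    for r in range(rows):
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in columns)
    return order


def select_references(
    order: List[View],
    index: int,
    similarity: Callable[[View, View], float],
    max_refs: int = 2,
) -> List[View]:
    """Pick up to max_refs already-coded frames most similar to order[index].

    Only frames that come earlier in the coding order are candidates.
    """
    current = order[index]
    coded = order[:index]
    ranked = sorted(coded, key=lambda v: similarity(current, v), reverse=True)
    return ranked[:max_refs]


def grid_similarity(a: View, b: View) -> float:
    """Placeholder score: views that are closer in the grid score higher.

    Hypothetical stand-in for a content-based score such as one built on
    SIFT matches between the two frames.
    """
    return -float(abs(a[0] - b[0]) + abs(a[1] - b[1]))


if __name__ == "__main__":
    sequence = serpentine_scan(3, 3)
    print(sequence)
    print(select_references(sequence, 4, grid_similarity))
```

Passing the score as a function keeps the selection step independent of how similarity is measured, so a content-based score could replace `grid_similarity` without changing `select_references`.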
Keywords/Search Tags:Light field, Compression, SIFT, Reference frame selection, Video segmentation, U-Net, Siamese network, Time-invariant feature