
Deep Learning Based Video Frame Interpolation Method

Posted on: 2020-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z F Zhang
Full Text: PDF
GTID: 2428330623963710
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of Internet technologies, video has become an indispensable multimedia data type in daily life. People focus not only on video content but also, increasingly, on video quality. Video frame interpolation is an important video processing technology with a wide range of applications, and it has attracted growing academic and industrial attention. Frame interpolation attempts to synthesize one or more intermediate frames within an original video sequence. Traditional motion-compensated frame interpolation methods typically involve two steps: motion estimation between adjacent frames and pixel synthesis guided by the estimated motion. However, the performance of these methods relies heavily on the accuracy of the motion information, which is hard to estimate in regions with occlusion, large displacement, or abrupt lighting changes. In recent years, deep learning has demonstrated remarkable performance on many computer vision problems. Researchers have proposed neural-network-based frame interpolation methods that outperform traditional algorithms, yet further improvement is still needed to handle complicated video sequences.

Building on an end-to-end frame interpolation model, this thesis proposes two algorithms to address the large-displacement problem: an encoder-decoder model and a multi-scale model. Both methods take two adjacent frames as input and estimate the motion information between them; one or more intermediate frames are then synthesized by a volume sampling layer, forming an end-to-end frame interpolation model. The encoder-decoder model uses an encoder module to extract high-level motion features and a decoder module to predict the optical flow step by step; a refinement module is then applied to improve flow accuracy by refining discontinuous regions. The multi-scale model first downsamples the input frames to several resolutions. Starting from the lowest scale, residual networks estimate an initial flow, which is then upsampled and fed into the estimation at the next scale; in the end, the optical flow has the same resolution as the input frames. Besides pixel-level metrics, a perceptual loss is employed during training to improve the visual quality of the interpolation results.

The proposed methods combine the two steps of traditional motion-compensated interpolation into a single end-to-end model, and no optical flow ground truth is required for training. Experimental results demonstrate that both proposed approaches achieve better quantitative results than competing methods. Furthermore, our interpolation methods produce visually more satisfying results with fewer artifacts and less blur, especially in regions of large displacement.
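To make the synthesis step concrete, the following is a minimal NumPy sketch of the kind of volume-sampling (bilinear warping) layer the abstract describes: given an estimated optical flow between two frames, it warps each input halfway along the motion and averages the results to form the middle frame. This is an illustrative reconstruction under simplifying assumptions (2-D grayscale frames, linear motion between the inputs), not the thesis's actual network; the function names `bilinear_warp` and `interpolate_midframe` are hypothetical.

```python
import numpy as np

def bilinear_warp(frame, flow):
    """Warp a grayscale frame (H x W) backward along a flow field
    (H x W x 2, in pixels) using bilinear interpolation -- the
    differentiable sampling idea behind the volume sampling layer."""
    h, w = frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sample positions: where each output pixel reads from in the source.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.clip(x0 + 1, 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    wx, wy = sx - x0, sy - y0
    # Blend the four neighbouring source pixels.
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bot = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bot

def interpolate_midframe(f0, f1, flow01):
    """Synthesize the intermediate frame: warp both inputs halfway
    along the (assumed linear) motion and average them."""
    mid_from_f0 = bilinear_warp(f0, 0.5 * flow01)
    mid_from_f1 = bilinear_warp(f1, -0.5 * flow01)
    return 0.5 * (mid_from_f0 + mid_from_f1)
```

Because bilinear sampling is differentiable with respect to the flow, a flow-estimation network trained through such a layer needs only the ground-truth middle frame as supervision, which is why no optical flow ground truth is required.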
Keywords/Search Tags:frame interpolation, deep learning, encoder-decoder network, multi-scale model