Research And Implementation Of Intelligent Video Inpainting

Posted on: 2021-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: C X Cheng
Full Text: PDF
GTID: 2518306308470164
Subject: Computer Science and Technology
Abstract/Summary:
With the rise of short video and vlogs, ordinary users' demand for video editing tools keeps growing. Video inpainting is an important video editing function, but existing image and video inpainting tools are hard to use and slow to process. In recent years, deep learning has achieved outstanding results in image classification, recognition, segmentation, and generation. Building on deep learning, this thesis proposes a lightweight coarse-to-fine pyramid upsampling image inpainting network and uses temporal shift to learn temporal features, yielding a fast and efficient video inpainting method. The main contributions of this thesis are as follows:
1) Starting from gated convolution, the image inpainting network is optimized in three respects. First, pyramid sampling is used to optimize the dilated gated convolution layers, giving a coarse-to-fine pyramid sampling network (PUNet). Compared with the gated convolution network, PUNet requires less computation, has more parameters for learning features, and fuses features from different depths. Second, holistic, pair-wise, and pixel-wise loss functions are proposed to enhance local and global consistency. Third, knowledge distillation is introduced into image inpainting and a multi-level self-distillation method is designed. Experiments show that PUNet achieves performance similar to that of the gated convolutional network while using only 22% of its inference time.
2) Building on PUNet, the temporal features of video are integrated to improve temporal consistency. A gated temporal shift convolution (GTSConv) is designed by combining the temporal shift operation with gated convolution to fuse spatial and temporal features. GTSConv replaces the gated convolution in PUNet, giving the temporal shift PUNet (TS-PUNet). Compared with PUNet, which considers only spatial features, TS-PUNet learns spatial-temporal video features and achieves better performance at the same inference time, without introducing extra parameters or extra operations.
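The abstract does not give the internals of GTSConv, so the following is only a minimal sketch of the general idea it names: a temporal shift along the time axis followed by a gated convolution. Everything specific here is an assumption for illustration, not the thesis's implementation: the function names (`temporal_shift`, `gated_pointwise`, `gts_conv`), the `fold_div` parameter, the zero padding at sequence boundaries, the leaky ReLU feature activation, and the reduction of the convolution to a 1x1 (per-position) case so that each frame is just a list of channel values.

```python
import math


def temporal_shift(frames, fold_div=4):
    """Shift a fraction of channels along the time axis (hypothetical helper).

    frames[t][c] is the value of channel c in frame t.
    The first C // fold_div channels are taken from the previous frame,
    the next C // fold_div from the following frame, the rest stay in place.
    Positions with no source frame are zero-padded (an assumption).
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t - 1      # channel carries information from the past
            elif c < 2 * fold:
                src = t + 1      # channel carries information from the future
            else:
                src = t          # channel is left unshifted
            if 0 <= src < num_frames:
                out[t][c] = frames[src][c]
    return out


def _linear(weights, x):
    """Apply a 1x1 convolution at one position: a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def gated_pointwise(x, w_feat, w_gate, slope=0.2):
    """Gated 1x1 convolution: activation(feature) * sigmoid(gate)."""
    feat = _linear(w_feat, x)
    gate = _linear(w_gate, x)
    out = []
    for f, g in zip(feat, gate):
        act = f if f > 0 else slope * f          # leaky ReLU (assumed choice)
        out.append(act * (1.0 / (1.0 + math.exp(-g))))
    return out


def gts_conv(frames, w_feat, w_gate, fold_div=4):
    """Temporal shift first, then the gated convolution on every frame."""
    shifted = temporal_shift(frames, fold_div)
    return [gated_pointwise(x, w_feat, w_gate) for x in shifted]
```

The shift itself only moves existing values between frames, which is consistent with the abstract's claim that temporal information is added without extra parameters: the only learned weights in this sketch are those of the gated convolution.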
Keywords/Search Tags:video inpainting, image inpainting, knowledge distillation, spatiotemporal features