As an important medium for exchanging and disseminating information, videos are intuitive, rich in content, and closely tied to both computer vision tasks and everyday life. However, external and internal disturbances of the camera cause uncontrolled shaking across video frames, often accompanied by blur. Current research on video stabilization falls roughly into two categories. The first is based on traditional algorithms, whose performance degrades sharply or fails altogether in scenes with little texture, occlusion, large moving foreground objects, or changing lighting. The second is based on deep convolutional neural networks, which learn end to end either to warp shaky frames back to stable frames or to synthesize stable frames directly for the shaky video. Such methods lack controllability, their performance tends toward extremes, and they need a large number of paired shaky and stable videos for training, which are very difficult to obtain in practice. This paper studies these challenges in depth and proposes a full-frame video stabilization method based on optical flow and mesh trajectories. The main contents are as follows:

(1) Optical flow estimation for shaky video. Existing optical flow estimation networks lose flow detail in complex scenes. To address this, a feature pyramid network combined with dilated convolution is proposed; the dilated convolutions extract features that carry more motion information, and the optical flow between adjacent frames is estimated from coarse to fine. To lower the high computational cost of optical flow, warping and a cost volume are introduced. Because optical flow errors are often caused by occlusion, a forward-backward consistency check is used to detect occlusion, identify the occluded pixels, and eliminate the flow errors they cause. The network is trained in an unsupervised manner. Experimental results on the KITTI benchmark show that the proposed optical flow estimation method improves performance by about 5%.

(2) Mesh trajectory extraction combined with optical flow. First, key-points between video frames are extracted with a key-point detection network. To select key-points robustly, optical flow is used to further verify the key-point matches, which also yields a robust feature motion vector field. Because videos contain multi-plane motion, applying a single global homography to whole frames causes shear artifacts and large residuals, so a multi-plane homography method is proposed. A mesh trajectory extraction network based on residual motion vectors is designed to recover the motion trajectories of the mesh vertices from the key-point trajectories, and each sub-plane is transformed by its own homography. This "divide and conquer" strategy alleviates the shear artifacts and large residuals. Experimental results show that the selected key-points improve to varying degrees across different error thresholds, with an improvement of about 7% under the strictest threshold, and that the mesh trajectory extraction network can select key-points in an "as similar as possible" way.

(3) Video stabilization based on trajectory smoothing and Out-of-bounds View Synthesis. The kernel weights of existing adaptive path smoothing algorithms are fixed and depend only on the temporal distance between neighboring frames. To address this, an inter-frame mesh trajectory smoothing network based on 3D convolution is proposed; it extracts and fuses spatiotemporal information and generates dynamic smoothing weights according to the characteristics of the trajectory. Existing projective transformations require a cropping window, which makes the stabilized frames too small. To address this, the Out-of-bounds View Synthesis operator is used for further fine alignment, and forward-backward consistent optical flow is used to detect the overlapping area and obtain its boundary.

Finally, the effectiveness and superiority of the proposed method are evaluated on benchmark datasets covering multiple complex scenes, using stability, cropping ratio, and distortion metrics. Performance improves to varying degrees on all of them.
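
The consistency check between forward and backward optical flow, used above both for occlusion detection and for finding overlapping regions, can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the function name `occlusion_mask`, the nearest-neighbor lookup of the backward flow, and the threshold parameters `alpha` and `beta` are all assumptions chosen for illustration. The idea is that, for a visible pixel, following the forward flow and then the backward flow should return close to the starting point; pixels where this fails, or whose forward flow leaves the frame, are flagged.

```python
import numpy as np

def occlusion_mask(flow_fw, flow_bw, alpha=0.01, beta=0.5):
    """Flag pixels of frame 1 that fail a forward-backward flow check.

    flow_fw: (H, W, 2) flow from frame 1 to frame 2, stored as (dx, dy).
    flow_bw: (H, W, 2) flow from frame 2 to frame 1, stored as (dx, dy).
    alpha, beta: illustrative thresholds (assumed values, not from the paper).
    Returns a boolean (H, W) array, True where the pixel is likely occluded.
    """
    h, w = flow_fw.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]

    # Position each pixel lands on in frame 2 (nearest-neighbor for simplicity).
    xt = np.rint(xs + flow_fw[..., 0]).astype(int)
    yt = np.rint(ys + flow_fw[..., 1]).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)

    # Backward flow sampled at the landing position (clipped to stay in bounds).
    bw_at_target = flow_bw[np.clip(yt, 0, h - 1), np.clip(xt, 0, w - 1)]

    # For a consistent pair, forward flow + backward flow is close to zero.
    diff = np.sum((flow_fw + bw_at_target) ** 2, axis=-1)
    mag = np.sum(flow_fw ** 2, axis=-1) + np.sum(bw_at_target ** 2, axis=-1)

    # Pixels leaving the frame or violating the tolerance are marked occluded.
    return ~inside | (diff > alpha * mag + beta)
```

In an unsupervised training setup, a mask of this kind is typically used to exclude the flagged pixels from the photometric loss; a practical version would use bilinear sampling instead of rounding.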