
Video Scene Prediction Based On Deep Learning

Posted on: 2022-06-16    Degree: Master    Type: Thesis
Country: China    Candidate: Z X Li    Full Text: PDF
GTID: 2518306353984509    Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the improvement of computing power, the capability of computers on vision tasks has advanced considerably, and many applications in the field of computer vision have made great breakthroughs. Among the various computer vision tasks, video prediction has received widespread attention because it requires no human-annotated data and large amounts of video data are readily available. Video prediction aims to let a model automatically generate future image frames by learning from a series of previous frames. However, compared with still images, video contains not only spatial dependence but also temporal dependence, which makes video prediction extremely challenging. Although research on video prediction has gradually shifted from methods focused on pixel-level regularities to methods focused on motion information, most existing approaches still generate unclear, blurry future frames that lack local detail, especially in long-term prediction.

For the video prediction task, this paper first studies and analyzes the current mainstream prediction algorithms. It then discusses MCnet, a video prediction network based on spatiotemporal feature decomposition, and examines its prediction principles and remaining problems. Building on MCnet, this paper proposes an improved video prediction algorithm based on deep spatiotemporal features. The main contributions are as follows:

1) To address the gradient vanishing, gradient explosion, and mode collapse problems of traditional generative adversarial networks, the paper introduces the WGAN-GP framework, which resolves these problems and improves the convergence speed of the network (sketched below).

2) To address the insufficient motion prediction ability of MCnet, this paper proposes a motion decoder, which strengthens the motion encoder's ability to predict motion by introducing a motion loss.

3) To further strengthen motion prediction, the paper proposes a frame difference loss: the differences between consecutive predicted frames are compared with the differences between consecutive real future frames, strengthening the entire network's ability to encode and predict motion (sketched below).

4) Considering MCnet's weak ability to predict image edges and details, and the blurring caused by the MSE loss, this paper proposes a feature loss and adds the HED edge extraction network; the edge features and shallow content features inside HED serve as the feature loss, strengthening the network's ability to predict edges and details and improving prediction quality (sketched below).

Finally, the paper evaluates the improved algorithm and current representative video prediction algorithms on the KTH, UCF101, and KITTI datasets, using PSNR and SSIM as evaluation metrics (sketched below). The experimental results show that the proposed algorithm based on deep spatiotemporal features outperforms existing prediction algorithms on every dataset. Compared with other algorithms, it shows clear advantages in prediction ability, especially motion prediction, and also achieves markedly better image quality.
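The abstract gives no implementation details, so the following is a minimal sketch of the WGAN-GP gradient penalty the paper adopts, written in PyTorch. The critic interface, the tensor shapes, and the penalty weight lambda_gp = 10 are illustrative assumptions, not the thesis code.

```python
import torch

def gradient_penalty(critic, real_frames, fake_frames, lambda_gp=10.0):
    """WGAN-GP penalty: push the critic's gradient norm toward 1
    on random interpolations between real and generated frames."""
    batch_size = real_frames.size(0)
    # One interpolation coefficient per sample, broadcast over (C, H, W).
    alpha = torch.rand(batch_size, 1, 1, 1, device=real_frames.device)
    interpolated = (alpha * real_frames + (1.0 - alpha) * fake_frames)
    interpolated = interpolated.detach().requires_grad_(True)

    critic_scores = critic(interpolated)
    grads = torch.autograd.grad(
        outputs=critic_scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(critic_scores),
        create_graph=True,
        retain_graph=True,
    )[0]
    grad_norm = grads.view(batch_size, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

In a WGAN-GP critic update, this term is added to the Wasserstein loss, e.g. `loss_D = fake_scores.mean() - real_scores.mean() + gradient_penalty(critic, real, fake)`, replacing the weight clipping of the original WGAN.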
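The frame difference loss is described only at a high level; a plausible reading is a penalty between the temporal differences of consecutive predicted frames and those of the corresponding real future frames. The sketch below follows that reading; the L1 norm and the (batch, time, channel, height, width) layout are assumptions.

```python
import torch
import torch.nn.functional as F

def frame_difference_loss(pred_frames, real_frames):
    """Compare the motion (temporal differences) of predicted and real sequences.

    Both tensors are assumed to have shape (B, T, C, H, W) with T >= 2.
    """
    # Difference between consecutive frames approximates per-pixel motion.
    pred_diff = pred_frames[:, 1:] - pred_frames[:, :-1]
    real_diff = real_frames[:, 1:] - real_frames[:, :-1]
    # L1 distance between the two motion fields (assumed; the thesis may use L2).
    return F.l1_loss(pred_diff, real_diff)
```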
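The feature loss uses edge and shallow content features from the HED network; since the thesis code is not given, the sketch below treats the edge network as an opaque, frozen feature extractor (`edge_net` is a hypothetical stand-in for HED) that returns a list of feature maps, and uses an L1 distance between predicted-frame and real-frame features.

```python
import torch
import torch.nn.functional as F

def feature_loss(edge_net, pred_frames, real_frames):
    """Perceptual-style loss on edge / shallow content features.

    `edge_net` stands in for a frozen edge-extraction network (HED in the
    thesis), assumed to return a list of feature maps of shape (B, C, H, W).
    """
    with torch.no_grad():               # targets carry no gradient
        real_feats = edge_net(real_frames)
    pred_feats = edge_net(pred_frames)  # gradients flow back to the predictor
    loss = pred_frames.new_zeros(())
    for pf, rf in zip(pred_feats, real_feats):
        loss = loss + F.l1_loss(pf, rf)
    return loss
```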
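For the PSNR/SSIM evaluation, a minimal per-sequence computation could look like the sketch below, using scikit-image (0.19+ for the `channel_axis` argument); frames normalized to [0, 1] and the (T, H, W, C) layout are assumptions.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_sequence(pred_frames, real_frames):
    """Average PSNR/SSIM over a predicted sequence.

    Both arrays are assumed to be float arrays of shape (T, H, W, C) in [0, 1].
    """
    psnr_vals, ssim_vals = [], []
    for pred, real in zip(pred_frames, real_frames):
        psnr_vals.append(peak_signal_noise_ratio(real, pred, data_range=1.0))
        ssim_vals.append(structural_similarity(real, pred, data_range=1.0,
                                               channel_axis=-1))
    return float(np.mean(psnr_vals)), float(np.mean(ssim_vals))
```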
Keywords/Search Tags: deep learning, video prediction, generative adversarial network, spatiotemporal features, compound loss