Research On Digital Video Prediction Based On Deep Learning

Posted on: 2022-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: L K Qiu
Full Text: PDF
GTID: 2518306575966429
Subject: Computer technology
Abstract/Summary:
Video prediction models infer subsequent frames from past video clips and are often used in motion prediction, video quality improvement, improving the performance of object detection models, and other fields. At present, most video prediction models based on deep learning lose local details such as intra-frame motion, and can clearly predict only one or a few frames. This thesis proposes a video prediction model based on a gated recurrent network and the fusion of spatio-temporal features, which predicts future video frame sequences reasonably and stably from the input video frames. The main research contents and contributions of this thesis are as follows:

1. Two feature extractors were constructed using a variant of the generative adversarial network to form a decoupling model, which separates the input video frames into background information and motion features. The prediction process can therefore focus mainly on the motion state of the sequence, which optimizes parameter selection during training of the video prediction network and reduces its training time.

2. To address the problem that the temporal features of video cannot be fully learned during video prediction, a gated recurrent network is applied to the video prediction network. The gated recurrent network can automatically learn long-term temporal features in the data and has fewer trainable parameters than long short-term memory (LSTM). The prediction network uses the learned temporal features to reduce the blurriness of the predicted frames and to predict future video clips quickly.

3. To further improve the prediction stability of the video prediction network and make the prediction model more useful in practice, the gate structure of the gated recurrent network is changed from linear operations to convolution operations. To keep the spatio-temporal information in the sequence consistent and to improve the prediction efficiency and training speed of the model, the convolutional gated recurrent network learns the long-term spatio-temporal dependence of the features and passes continuous feature information forward, which reduces the loss of temporal and spatial feature information during prediction.

The experimental results show that the structural similarity between the real sequence and the sequence predicted by the video prediction network based on the convolutional gated recurrent network reaches 0.99 on the moving handwritten digit dataset and 0.92 on the KTH dataset. Compared on the KTH dataset with the prediction model based on mean error, MCNet, and convolutional LSTM, the proposed model predicts long-term video clips more stably, models the spatio-temporal dependence of video more strongly, and is more practical.
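The third contribution, replacing the linear operations in the gates of the gated recurrent network with convolutions, can be illustrated with a minimal pure-Python sketch of one convolutional GRU step. It is not the thesis's implementation and rests on several assumptions: single-channel frames, 3x3 kernels, random untrained weights, no bias terms, and the update convention h_new = (1 - z) * h + z * candidate. The names `ConvGRUCell`, `conv2d_same`, and `zip_maps` are hypothetical, and the decoupling feature extractors and the frame decoder are not modeled.

```python
import math
import random


def conv2d_same(x, k):
    """Cross-correlate 2-D map x with an odd-sized kernel k, zero-padded to keep x's size."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - kh // 2, j + b - kw // 2
                    if 0 <= ii < h and 0 <= jj < w:
                        out[i][j] += x[ii][jj] * k[a][b]
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def zip_maps(fn, *maps):
    """Apply fn element-wise across equally sized 2-D maps."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*maps)]


class ConvGRUCell:
    """Single-channel convolutional GRU cell (assumed layout; biases omitted)."""

    def __init__(self, kernel_size=3, seed=0, scale=0.5):
        rng = random.Random(seed)

        def kernel():
            return [[rng.uniform(-scale, scale) for _ in range(kernel_size)]
                    for _ in range(kernel_size)]

        # One input kernel (W) and one hidden-state kernel (U) per gate.
        self.Wz, self.Uz = kernel(), kernel()  # update gate
        self.Wr, self.Ur = kernel(), kernel()  # reset gate
        self.Wh, self.Uh = kernel(), kernel()  # candidate state

    def step(self, x, h):
        """Advance the hidden state h by one frame x; both are 2-D maps of equal size."""
        gate = lambda a, b: sigmoid(a + b)
        # Gates use convolutions where a standard GRU uses matrix products.
        z = zip_maps(gate, conv2d_same(x, self.Wz), conv2d_same(h, self.Uz))
        r = zip_maps(gate, conv2d_same(x, self.Wr), conv2d_same(h, self.Ur))
        # The reset gate scales the previous state before the candidate is computed.
        rh = zip_maps(lambda a, b: a * b, r, h)
        cand = zip_maps(lambda a, b: math.tanh(a + b),
                        conv2d_same(x, self.Wh), conv2d_same(rh, self.Uh))
        # Blend old state and candidate, element by element, using the update gate.
        return zip_maps(lambda zz, hh, cc: (1.0 - zz) * hh + zz * cc, z, h, cand)


# Toy clip: a single bright pixel moving along the diagonal of a 5x5 frame.
frames = [[[1.0 if i == j == t else 0.0 for j in range(5)] for i in range(5)]
          for t in range(3)]
cell = ConvGRUCell()
state = [[0.0] * 5 for _ in range(5)]
for frame in frames:
    state = cell.step(frame, state)
```

Because every gate is a convolution, the hidden state keeps the same spatial layout as the frame and the kernel weights are shared across positions, so the state can carry where motion occurs as well as when. A practical model would use multi-channel feature maps and trained weights in a deep learning framework rather than nested Python lists.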
Keywords/Search Tags: video prediction, convolutional gated recurrent network, spatio-temporal series, deep learning