
Research On Video Classification Based On 3D Reversible Network

Posted on: 2021-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Lin
Full Text: PDF
GTID: 2428330623967764
Subject: Computer Science and Technology
Abstract/Summary:
The volume of images and videos on the Internet keeps growing, and hundreds of hours of video are uploaded to major video websites every minute. Large-scale video classification has become the next critical problem to solve after image classification. In recent years, traditional video classification algorithms have shown their limitations, and as networks have grown deeper and wider, deep networks (such as the Reversible Residual Network, ResNet, GoogLeNet, C3D, and Two-stream networks) have greatly advanced image and video classification, with performance improving continuously. However, because activations must be stored before the back-propagation gradients can be computed, memory consumption becomes a bottleneck when resources are limited. The problem is more pronounced in 3D networks, where it severely limits network depth and width. This thesis proposes 3D-RevNet, a 3D reversible network based on the idea of 2D-RevNet, and applies it to video classification. 3D-RevNet is a variant of 3D-ResNet that, like 2D-RevNet, does not need to store the activation outputs of its 3D residual blocks, which effectively improves memory utilization. The thesis discusses two ways of partitioning the input and output data: the traditional channel-based split and a video-frame-based split. Because the 3D-RevNet residual block partitions its input, the frame-based split captures local frame features while also mining the global associations between earlier and later frames of the video stream, which improves video classification accuracy. Extensive experiments on standard data sets (such as Kinetics and UCF-101) show that, compared with existing methods, 3D-RevNet initialized from an ImageNet-trained model significantly improves both memory usage and video classification accuracy.
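The abstract does not give the block equations, so the following is only a minimal sketch of the idea, assuming the standard additive coupling used by reversible residual networks (y1 = x1 + F(x2), y2 = x2 + G(y1)). The functions `f` and `g` are hypothetical stand-ins for the 3D residual functions, and the clip is a toy `clip[channel][frame]` grid of scalars with the spatial dimensions omitted. It illustrates why the block inputs need not be stored (they can be recomputed from the outputs) and how a channel-based split differs from a frame-based split.

```python
import math

def f(x):
    # Hypothetical stand-in for the residual function F (not the thesis's layers).
    return [math.tanh(v) for v in x]

def g(x):
    # Hypothetical stand-in for the residual function G.
    return [0.5 * v for v in x]

def rev_forward(x1, x2):
    # Additive coupling: each half is updated using only the other half.
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def rev_inverse(y1, y2):
    # Recover the inputs from the outputs, so they need not be kept in memory.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

def flatten(rows):
    return [v for row in rows for v in row]

def split_channels(clip):
    # Channel-based split: first half of the channels vs. second half.
    half = len(clip) // 2
    return flatten(clip[:half]), flatten(clip[half:])

def split_frames(clip):
    # Frame-based split: earlier frames vs. later frames, across all channels.
    half = len(clip[0]) // 2
    early = [v for row in clip for v in row[:half]]
    late = [v for row in clip for v in row[half:]]
    return early, late

# Toy clip with 2 channels and 4 frames (assumed sizes, for illustration only).
clip = [[0.1, 0.2, 0.3, 0.4],
        [0.5, 0.6, 0.7, 0.8]]

def round_trip_ok(split):
    x1, x2 = split(clip)
    y1, y2 = rev_forward(x1, x2)
    r1, r2 = rev_inverse(y1, y2)
    return all(math.isclose(a, b, abs_tol=1e-12)
               for a, b in zip(x1 + x2, r1 + r2))

print(round_trip_ok(split_channels), round_trip_ok(split_frames))
```

With the frame-based split, one half of the block sees the earlier frames and the other the later frames, so each coupling step mixes information across time; with the channel-based split, both halves cover all frames.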
Keywords/Search Tags: Deep learning, Action recognition, 3D network, Memory consumption, Resource optimization