
Research On Real-time Video Action Classification Based On Three-Dimensional Convolutional Neural Network

Posted on: 2020-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: M L Yu
Full Text: PDF
GTID: 2428330572976349
Subject: Information and Communication Engineering
Abstract/Summary:
With the growing presence of online video in daily life, technology for classifying video content is becoming increasingly important. Existing video classification methods fall into three categories: classification based on traditional image processing; deep learning methods that classify each frame of the video and aggregate the per-frame results into an overall label; and deep learning methods that exploit the spatiotemporal information of the video. The first two categories ignore the temporal information of the video, leaving room for improvement in classification accuracy. The third category can classify video segments but cannot produce real-time classification results. Building on a survey of existing video processing methods and deep learning models, this thesis designs a real-time video action classification system that balances classification accuracy and classification speed.

The real-time action classification system consists of two parts: an action classification module and a background-class judgment module. In the action classification module, this thesis first designs and implements a video action classification model that adapts the C3D model to video-stream input by adding a buffer in front of it, so that the model continuously produces real-time classification results. Experiments then measure the model's accuracy on a short-video classification task, its mAP on a long-video action classification and localization task, and the time required to classify after each new frame arrives, verifying the feasibility of the model. In the background-class judgment module, several video tracking algorithms are studied, and the Lucas-Kanade optical flow method is chosen after comparative experiments. This method tracks the sum of displacements of feature points across adjacent frames to determine whether the input segment is a background class containing no action. Finally, the outputs of the two modules are combined so that static segments do not affect the output of the C3D action classifier, making the final classification results of the system more accurate.

The innovations of this thesis are as follows. First, the C3D model is optimized by replacing a single C3D model with an ensemble of multiple C3D models whose output scores are averaged, improving action classification accuracy while preserving classification speed. Second, a buffer is added before the C3D ensemble to convert the single-frame input of a video stream into the batch input that C3D accepts, so that C3D can handle long video clips and video-stream input. Third, the optical flow method tracks the positions of feature points to determine whether a video clip is a background class containing no action. Fourth, to make the motion threshold used by the optical flow method in the background-class module more accurate, this thesis designs and creates a video dataset for training the background class on the basis of UCF101, so that the optical flow method can accommodate feature-point displacements caused by noise in the video.
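The buffering scheme that adapts C3D to a single-frame video stream can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the clip length of 16 frames, the sliding stride of one frame, and the frame shape are assumptions, since the abstract does not give the exact C3D input parameters.

```python
from collections import deque
import numpy as np

class ClipBuffer:
    """Sliding buffer that turns a single-frame stream into fixed-length
    clips a 3D CNN such as C3D can consume (clip length is assumed)."""

    def __init__(self, clip_len=16):
        self.clip_len = clip_len
        self.frames = deque(maxlen=clip_len)  # oldest frame drops out automatically

    def push(self, frame):
        """Add one frame; return a (clip_len, H, W, C) clip once the
        buffer is full, else None while still warming up."""
        self.frames.append(frame)
        if len(self.frames) == self.clip_len:
            return np.stack(self.frames)
        return None

# Simulated stream: 20 frames of 112x112 RGB input.
buf = ClipBuffer(clip_len=16)
clips_ready = 0
for t in range(20):
    frame = np.zeros((112, 112, 3), dtype=np.uint8)
    if buf.push(frame) is not None:
        clips_ready += 1  # once warmed up, every new frame yields a clip
print(clips_ready)  # → 5 (frames 16..20 each complete a window)
```

Because the deque has `maxlen=clip_len`, each arriving frame after warm-up immediately yields a new overlapping clip, which is what allows a per-frame real-time classification result.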
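Score-level fusion of the C3D ensemble might look like the following minimal sketch; the number of models and the three-class score vectors are purely illustrative, as the abstract does not state how many C3D models are combined.

```python
import numpy as np

def fuse_scores(score_list):
    """Average the per-class score vectors from several C3D models
    (score-level fusion by mean, as described in the abstract)."""
    return np.mean(np.stack(score_list), axis=0)

# Two hypothetical softmax outputs over three action classes.
s1 = np.array([0.7, 0.2, 0.1])
s2 = np.array([0.5, 0.4, 0.1])
fused = fuse_scores([s1, s2])
print(fused)                 # → [0.6 0.3 0.1]
print(int(np.argmax(fused))) # → 0, the predicted class
```

Averaging adds only one vector operation per clip, which is consistent with the claim that accuracy improves while classification speed is preserved.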
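The background-class judgment reduces to summing feature-point displacements across adjacent frames and thresholding. A minimal sketch, assuming the point tracks come from a tracker such as OpenCV's `cv2.calcOpticalFlowPyrLK`; the threshold value here is illustrative, not the one tuned on the thesis's UCF101-based background dataset.

```python
import numpy as np

def is_background(tracks, threshold=2.0):
    """Decide whether a segment is a motionless background class.

    tracks: (num_frames, num_points, 2) array of feature-point
    coordinates, e.g. produced by Lucas-Kanade optical flow.
    Returns True when the summed displacement magnitude over all
    adjacent-frame pairs falls below the threshold.
    """
    disp = np.diff(tracks, axis=0)                # per-frame displacement vectors
    total = np.linalg.norm(disp, axis=-1).sum()   # sum of displacement magnitudes
    return total < threshold

# One point held still for 5 frames vs. the same point drifting.
static = np.tile(np.array([[10.0, 10.0]]), (5, 1, 1))
moving = static + np.arange(5)[:, None, None]
print(is_background(static), is_background(moving))  # → True False
```

Training the threshold on a dedicated background dataset, as the thesis does, lets the judgment tolerate the small feature-point displacements that camera noise produces even in truly static segments.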
Keywords/Search Tags: deep learning, video action classification, optical flow method, three-dimensional convolutional neural network