Research On Video Classification Method Based On Deep Learning Network

Posted on: 2021-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Peng
Full Text: PDF
GTID: 2518306548956399
Subject: Control theory and control engineering
Abstract/Summary:
To address the complexity of recognition and the low accuracy of video classification, this paper proposes a three-stream deep learning network framework that combines spatial-temporal-relational feature extraction with feature aggregation and fusion mechanisms. We introduce a relational network into a two-stream convolutional neural network, focusing on problems such as poor stability and insufficient semantic understanding in video feature extraction. In addition, a feature aggregation method based on the Vector of Locally Aggregated Descriptors (VLAD) is proposed, which reduces intra-class differences and makes effective use of the features. Furthermore, a decision-level fusion mechanism based on an improved Softmax logistic regression function is proposed for the three-stream feature extraction framework; it preserves the image information captured by the different sub-networks. As a result, the network reflects the information contained in the video more faithfully, which significantly reduces the probability that a single sub-network misclassifies an action, and the video information can be expressed and recognized well. Finally, the performance of the three-stream convolutional neural network is verified on the standard HMDB51 and UCF-101 datasets and also on a dataset of daily student action videos that we collected from actual campus surveillance scenarios. The results demonstrate that the three-stream convolutional neural network not only classifies actions accurately on complex datasets but also applies to the classification of everyday student actions on campus, providing strong technical support for the security of students on campus.
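The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not specify the "improved Softmax" fusion, so a plain weighted average of per-stream softmax scores is used in its place. The VLAD function follows the textbook formulation (residuals to the nearest cluster center, then L2 normalization). All function names, weights, and sample numbers are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def vlad(descriptors, centers):
    """Textbook VLAD: sum each descriptor's residual to its nearest
    center, concatenate the per-center sums, and L2-normalize."""
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Index of the nearest center by squared Euclidean distance.
        k = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(d, centers[i])),
        )
        for j in range(dim):
            residual_sums[k][j] += d[j] - centers[k][j]
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def fuse_streams(stream_logits, weights=None):
    """Decision-level (late) fusion: weighted average of the softmax
    outputs of each stream. Assumption: uniform weights by default;
    the thesis's improved Softmax variant is not reproduced here."""
    n = len(stream_logits)
    if weights is None:
        weights = [1.0 / n] * n
    probs = [softmax(logits) for logits in stream_logits]
    num_classes = len(probs[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, probs))
        for c in range(num_classes)
    ]
    predicted = max(range(num_classes), key=fused.__getitem__)
    return fused, predicted


# Hypothetical per-stream class scores for one video (3 classes).
spatial = [2.0, 1.0, 0.1]
temporal = [0.5, 2.5, 0.3]
relational = [0.2, 2.0, 0.1]
fused_probs, label = fuse_streams([spatial, temporal, relational])
```

In this toy input the spatial stream alone would pick class 0, while the fused scores pick class 1. This is the kind of single-sub-network misclassification that decision-level fusion is meant to reduce.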
Keywords/Search Tags: video classification, temporal-spatial-relational feature extraction, three-stream deep learning framework, feature aggregation, decision-level fusion