
Learning Spatiotemporal Features In Video For Action Recognition

Posted on: 2020-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chang
Full Text: PDF
GTID: 2428330620456141
Subject: Information and Communication Engineering
Abstract/Summary:
With the spread of intelligent monitoring and video acquisition devices, action recognition in video has become a research hotspot in computer vision because of its wide application prospects and economic value. The success of deep learning in image processing tasks has also spurred the development of deep-learning-based video action recognition. Video action recognition aims to enable a computer to identify human behavior in video autonomously through feature extraction and deep neural network learning, and it can be applied to intelligent surveillance, video retrieval, human-computer interaction and other fields. Unlike image analysis, the temporal structure of video leads to richer intra-class and inter-class differences, which makes action recognition harder. To extract more representative temporal and spatial features, this thesis studies three aspects: the video sampling method, image feature coding and temporal feature learning. The specific research contents are as follows:

1) In current video action recognition methods, the random sampling strategy used during sparse video sampling tends to miss key information in the video, so an action recognition framework based on key frame sampling is proposed. During training of the convolutional neural network, the framework still adopts random sampling to ensure the diversity of the extracted features. In the test phase, key frame sampling is adopted so that the network sees as much of the information in the video as possible. The video is segmented uniformly, and the frame with the largest information entropy in each segment is extracted and saved as the key frame. The key frame sampling strategy is applied to the TSN and ECO networks, and the improved accuracy on the UCF101 and HMDB51 datasets demonstrates the effectiveness of the sampling method.

2) Because the simple pooling methods used in neural networks attend to only part of the action sub-class features, an action recognition framework based on VLAD is proposed. VLAD coding aggregates the local features of an image effectively by summing the residuals of the local features with respect to the cluster centers. The framework introduces the NetVLAD network structure, originally used for image scene recognition, into video action recognition. The convolutional network extracts image features and uses VLAD coding to generate fixed-length feature vectors, which are then merged by one of several temporal feature fusion schemes. Finally, the fused feature is fed into the classification layer to obtain the prediction. For temporal feature fusion, four schemes are studied: element-wise addition, element-wise maximum, element-wise multiplication and multi-scale temporal relationship fusion. The multi-scale temporal relationship fusion scheme uses a multi-layer perceptron to learn the temporal relationships between sampled frame sequences of different lengths, and it achieves the best performance.

3) Because simple temporal fusion schemes cannot fully learn the temporal context in the video, an action recognition framework based on long short-term memory networks is proposed. The framework uses convolutional neural networks to extract image features. After VLAD coding, long short-term memory networks learn the temporal context in the feature vectors, and finally the predictions at all time steps are combined to classify the input video. The improved accuracy on the UCF101, HMDB51 and Something-Something datasets verifies the framework's ability to learn temporal context.
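The key frame selection rule in 1) and the residual aggregation in 2) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. It assumes grayscale `uint8` frames, measures information entropy as the Shannon entropy of the grey-level histogram, and uses classic hard-assignment VLAD with fixed cluster centers in place of the learned, soft-assignment NetVLAD layer described in the abstract. The function names `frame_entropy`, `select_key_frames` and `vlad_encode`, and the normalisation steps, are illustrative choices rather than details given in the source.

```python
import numpy as np


def frame_entropy(frame, bins=256):
    """Shannon entropy (in bits) of the grey-level histogram of a uint8 frame."""
    hist = np.bincount(frame.ravel(), minlength=bins).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def select_key_frames(frames, num_segments):
    """Split the frame list into equal segments and return, for each
    segment, the index of the frame with the largest entropy."""
    bounds = np.linspace(0, len(frames), num_segments + 1).astype(int)
    keys = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        end = max(end, start + 1)  # guard against empty segments
        entropies = [frame_entropy(f) for f in frames[start:end]]
        keys.append(int(start) + int(np.argmax(entropies)))
    return keys


def vlad_encode(descriptors, centers):
    """Hard-assignment VLAD (assumption: NOT the learned NetVLAD layer).

    descriptors: (N, D) local features; centers: (K, D) cluster centers.
    Returns a fixed-length vector of size K * D.
    """
    # Squared distance from every descriptor to every center: shape (N, K).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    num_clusters, dim = centers.shape
    vlad = np.zeros((num_clusters, dim))
    for k in range(num_clusters):
        members = descriptors[assign == k]
        if len(members):
            # Sum of residuals of the members with respect to their center.
            vlad[k] = (members - centers[k]).sum(axis=0)
    # Per-cluster L2 normalisation, then global L2 normalisation.
    norms = np.linalg.norm(vlad, axis=1, keepdims=True)
    vlad = vlad / np.maximum(norms, 1e-12)
    flat = vlad.ravel()
    return flat / max(np.linalg.norm(flat), 1e-12)
```

Whatever the number of local descriptors per frame, `vlad_encode` returns a vector of length K * D, which is what allows the per-frame codes to be fused over time by addition, maximum, multiplication, or a learned temporal model.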
Keywords/Search Tags: action recognition, deep learning, key frame sampling, VLAD coding, long short-term memory networks