Video action recognition and detection are representative tasks in video understanding, with broad application prospects in daily life. In recent years, deep learning has driven new progress on this topic, but the complexity of video data and the inherent uncertainty of human action still make it difficult to build efficient recognition and detection models. This paper studies deep-learning-based action recognition and detection methods in depth and improves on the shortcomings of existing methods. The main work of this paper is as follows.

First, we propose an improved action recognition method based on TSN. To address the lack of association between the sampling groups of the TSN network, a forget-gate connection module is designed: the forget-gate structure of LSTM is used to establish feature-level connections between groups and to integrate them along the time dimension, which strengthens information transmission between groups and improves connectivity in the temporal dimension. We also improve the fusion of the spatial and temporal streams by using ConvLSTM to connect the feature extraction networks: the output features of the two-stream network are stacked, and ConvLSTM then learns long-term spatiotemporal dependencies over these features, which avoids the drawback of previous dual-stream + recurrent neural network fusion methods that destroy spatial structure. The improved model is evaluated on the UCF101 and HMDB51 datasets. The results show a significant improvement over the original TSN algorithm and a recognition accuracy comparable to state-of-the-art methods.

Second, we propose a spatiotemporal action detection model based on fused non-local blocks. The model uses a two-branch convolutional neural network to analyze the spatial and temporal information of video: the spatial network takes a single video frame as input to extract appearance features of the current frame, while the spatiotemporal network takes a sequence of video frames as input to extract spatiotemporal features. To address the limited ability of convolutional neural networks to model temporal information, the spatiotemporal branch uses a three-dimensional convolutional network with integrated non-local blocks to capture global relations between video frames. To further enhance contextual semantic information, a channel fusion mechanism aggregates the features of the two branches, and the fused features are used for frame-level detection. The model is validated on the UCF101-24 and JHMDB datasets, and the results show that it can fully integrate spatial and temporal information and achieves high detection accuracy on video-based spatiotemporal action detection tasks.
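
To make the two-stream ConvLSTM fusion described above concrete, the following is a minimal PyTorch-style sketch, not the thesis's actual implementation. It assumes per-segment spatial and temporal feature maps of shape (B, T, C, H, W); the class names (`ConvLSTMCell`, `TwoStreamConvLSTMFusion`), channel sizes, and the final global-average-pool classifier are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """A single ConvLSTM cell: LSTM gates computed with 2-D convolutions,
    so the hidden state keeps its spatial layout (sketch, not the paper's code)."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_channels + hidden_channels, 4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)
        self.hidden_channels = hidden_channels

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c


class TwoStreamConvLSTMFusion(nn.Module):
    """Stack per-segment spatial and temporal feature maps along channels, then run
    a ConvLSTM over the segment axis to model long-term spatiotemporal dependence."""
    def __init__(self, feat_channels, hidden_channels, num_classes):
        super().__init__()
        self.cell = ConvLSTMCell(2 * feat_channels, hidden_channels)
        self.classifier = nn.Linear(hidden_channels, num_classes)

    def forward(self, spatial_feats, temporal_feats):
        # spatial_feats, temporal_feats: (B, T, C, H, W) features per sampling group.
        b, t, c, hgt, wid = spatial_feats.shape
        h = spatial_feats.new_zeros(b, self.cell.hidden_channels, hgt, wid)
        cstate = torch.zeros_like(h)
        for step in range(t):
            x = torch.cat([spatial_feats[:, step], temporal_feats[:, step]], dim=1)
            h, cstate = self.cell(x, (h, cstate))
        # Global-average-pool the final hidden state and classify.
        return self.classifier(h.mean(dim=(2, 3)))
```

Because the fusion operates on feature maps rather than pooled vectors, the spatial layout of the two-stream features is preserved while the recurrence models the temporal dimension, which is the stated motivation for using ConvLSTM here.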
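
Similarly, the sketch below illustrates the two building blocks of the second contribution: an embedded-Gaussian non-local block over 3-D features (in the style of Wang et al.'s non-local neural networks) and a simple channel-wise fusion of the two branches. The module names, the halved intermediate channel count, and the 1x1-convolution fusion are assumptions for illustration; the thesis's exact design may differ.

```python
import torch
import torch.nn as nn


class NonLocalBlock3D(nn.Module):
    """Embedded-Gaussian non-local block over (T, H, W) feature maps:
    every position attends to every other position, capturing global
    relations between video frames (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        inter = max(channels // 2, 1)
        self.theta = nn.Conv3d(channels, inter, 1)
        self.phi = nn.Conv3d(channels, inter, 1)
        self.g = nn.Conv3d(channels, inter, 1)
        self.out = nn.Conv3d(inter, channels, 1)

    def forward(self, x):
        b, c, t, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, THW, C')
        k = self.phi(x).flatten(2)                     # (B, C', THW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, THW, C')
        attn = torch.softmax(q @ k, dim=-1)            # pairwise position relations
        y = (attn @ v).transpose(1, 2).reshape(b, -1, t, h, w)
        return x + self.out(y)                         # residual connection


class ChannelFusion(nn.Module):
    """Aggregate spatial-branch and spatiotemporal-branch features by
    concatenating along channels and mixing with a 1x1 convolution,
    before frame-level detection (assumed fusion form)."""
    def __init__(self, spatial_ch, spatiotemporal_ch, out_ch):
        super().__init__()
        self.mix = nn.Conv2d(spatial_ch + spatiotemporal_ch, out_ch, 1)

    def forward(self, spatial_feat, spatiotemporal_feat):
        # Both inputs: (B, C, H, W); the spatiotemporal branch is assumed
        # to have been collapsed over time before fusion.
        fused = torch.cat([spatial_feat, spatiotemporal_feat], dim=1)
        return torch.relu(self.mix(fused))
```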