Research On Video Action Recognition Based On Deep Learning

Posted on:2021-01-04

Degree:Master

Type:Thesis

Country:China

Candidate:W Y Luo

Full Text:PDF

GTID:2428330623468344

Subject:Engineering

Abstract/Summary:

In recent years, with the development of 5G networks and the Internet industry, massive amounts of video data are produced at every moment in fields such as intelligent video surveillance, self-driving cars, smart connected homes, and emerging micro-video. How to better use computers to understand and identify actions in videos to assist decision-making has become a major issue in related industries. Compared with still images, video contains not only spatial scene information but also temporal context information, which makes video action recognition more challenging. Building on popular deep learning technology, this work studies and improves current algorithms. We identify a multi-scale problem in both the spatial and temporal domains of video action recognition: different videos may have different subject scales in the spatial dimension and different action durations and frequencies in the temporal dimension. Drawing on related processing methods from the image field and further reflection on them, we propose a method that first decomposes the 3D convolution into separate spatial and temporal convolution modules, and then expands the two modules into a spatial-temporal multi-scale module composed of parallel multi-scale convolution kernels, which aims to extract features with richer scale information. Next, we propose an attention module that operates over three domains (feature channel, space, and time), which aims to strengthen the features of important regions in each domain so that the network trains better. In the experimental part, we integrate the two structures proposed in this thesis into a 3D convolutional network architecture and conduct a series of comparative experiments on the UCF101 video dataset. The proposed algorithm achieves excellent performance without using optical flow and without pretraining on a large dataset. Finally, a further experiment shows that accuracy on motion-category videos, which exhibit more multi-scale problems in both the spatial and temporal domains, is significantly improved compared with the original 3D convolutional network, which verifies the effectiveness of the proposed architecture.
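
The decomposition and multi-scale expansion described in the abstract can be illustrated with simple weight-count arithmetic. The sketch below is not the thesis's implementation. The channel widths, the kernel sizes (1, 3, 5), the equal split of output channels across branches, and the concatenation of branch outputs are all assumptions chosen for illustration, and the function names are hypothetical. It compares a full 3 x 3 x 3 convolution with a spatial-then-temporal pair, and then with parallel multi-scale branches in each of the two modules.

```python
# Illustrative arithmetic only. Every size and fusion rule here is an
# assumption, not a detail taken from the thesis.


def conv3d_weights(c_in, c_out, kt, kh, kw):
    """Weights in a 3D convolution with a kt x kh x kw kernel (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def factorized_weights(c_in, c_out, kt, k):
    """A 1 x k x k spatial convolution followed by a kt x 1 x 1 temporal one."""
    spatial = conv3d_weights(c_in, c_out, 1, k, k)
    temporal = conv3d_weights(c_out, c_out, kt, 1, 1)
    return spatial + temporal


def multiscale_weights(c_in, c_out, spatial_sizes, temporal_sizes):
    """Parallel branches in each module. Each branch produces an equal share
    of the c_out channels, and the shares are concatenated (assumed rule)."""
    if c_out % len(spatial_sizes) or c_out % len(temporal_sizes):
        raise ValueError("c_out must divide evenly across the branches")
    share_s = c_out // len(spatial_sizes)
    share_t = c_out // len(temporal_sizes)
    # Spatial module: one 1 x k x k branch per spatial scale.
    spatial = sum(conv3d_weights(c_in, share_s, 1, k, k) for k in spatial_sizes)
    # Temporal module: one kt x 1 x 1 branch per temporal scale,
    # fed by the concatenated spatial output (c_out channels).
    temporal = sum(
        conv3d_weights(c_out, share_t, kt, 1, 1) for kt in temporal_sizes
    )
    return spatial + temporal


if __name__ == "__main__":
    c_in, c_out = 64, 96  # hypothetical channel widths
    print("full 3x3x3:       ", conv3d_weights(c_in, c_out, 3, 3, 3))
    print("factorized 3 + 3: ", factorized_weights(c_in, c_out, 3, 3))
    print("multi-scale 1/3/5:", multiscale_weights(c_in, c_out, (1, 3, 5), (1, 3, 5)))
```

Under these assumed sizes, the multi-scale variant covers three spatial and three temporal scales while still using fewer weights than the single full 3D kernel. Whether that holds for the configuration actually used in the thesis is not stated in the abstract.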

Keywords/Search Tags:

action recognition, deep learning, 3D convolution, multi-scale, attention

Related items

1	Research On Human Skeleton Action Recognition Based On Graph Convolutional Network
2	Video Action Recognition Technology Research Based On Deep Learning
3	Action Recognition Based On Temporal Multi-Scale
4	Research On Deep Learning Algorithms For Human Action Recognition
5	Research On Visual Action Recognition Based On Deep Learning
6	The Research On Skeleton Action Recognition Based On Multi-scale Attention Mechanism
7	Temporal Action Detection Using Dense Dilated Convolutional Network
8	Research On Key Techniques Of Gait Recognition Based On Spatio-Temporal Multi-Scale Convolution
9	Research On Human Action Recognition In Videos Based On Deep Learning
10	An Improved Action Recognition Method With 3D Convolution Neural Network