Research On Video Human Action Recognition Based On Deep Learning

Posted on: 2021-03-28

Degree: Master

Type: Thesis

Country: China

Candidate: Y Wang

GTID: 2428330614955521

Subject: Control engineering

Abstract/Summary:

In recent years, artificial intelligence has developed rapidly, and computer vision has received widespread attention. Within it, human action recognition has become a research hotspot, with high application value in medical diagnosis, intelligent monitoring, and human-computer interaction. Researchers have made progress on this topic, but human action recognition in video still faces many unsolved difficulties, such as severe occlusion of the target, complex video backgrounds, and changes in camera perspective. These problems make it hard to improve recognition accuracy further. Traditional hand-crafted feature extraction has reached a bottleneck in this field. Mainstream deep learning methods use a convolutional neural network to simulate the human brain's understanding of video and image information and learn features automatically, which greatly improves recognition efficiency and accuracy.

To further study human action recognition based on deep learning, the following work has been done:

First, to address the fusion of spatial and temporal features and make full use of the temporal information in video, a model combining a residual two-stream network with attention is proposed. The residual two-stream network fuses the temporal and spatial features of the video, and a Bi-LSTM model is built on the fused features to make full use of the temporal information across video frames. An attention model then assigns different weights to the video frame sequence according to the output of the Bi-LSTM network at different time steps, and the Softmax function completes the human action recognition task.

Second, to extract the spatio-temporal feature information of video more fully, a model combining an attention-based spatio-temporal fusion network with a double unidirectional LSTM is proposed. An attention mechanism is introduced on the basis of the spatio-temporal fusion network. To better capture global information, a double unidirectional LSTM model is used, and the Softmax function completes the recognition task.

Finally, the two models are trained and tested on the UCF101 and HMDB51 datasets, and the experimental results are analyzed. The results show that both models are robust and improve the accuracy of action recognition. Figures: 39; Tables: 7; References: 48.
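The attention step described in the abstract (weighting frames by the Bi-LSTM output at each time step, then classifying with Softmax) can be sketched roughly as follows. This is a minimal NumPy illustration, not the thesis implementation: the additive scoring form `tanh(h W) v`, the function names, the random stand-in for Bi-LSTM outputs, and all dimensions are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_pool(hidden, w, v):
    """Weight per-frame features and sum them into one clip-level vector.

    hidden: (T, D) array, one row per time step (assumed Bi-LSTM outputs).
    w: (D, A) projection and v: (A,) scoring vector (hypothetical parameters).
    """
    scores = np.tanh(hidden @ w) @ v   # (T,) one relevance score per frame
    weights = softmax(scores)          # (T,) non-negative, sums to 1
    context = weights @ hidden         # (D,) attention-weighted sum of frames
    return context, weights

def classify(context, w_out, b_out):
    # Linear layer followed by Softmax over action classes.
    return softmax(context @ w_out + b_out)

# Toy dimensions: 8 frames, 16-dim features, 4-dim attention space, 5 classes.
rng = np.random.default_rng(0)
T, D, A, C = 8, 16, 4, 5
hidden = rng.standard_normal((T, D))   # stand-in for real Bi-LSTM outputs
w = rng.standard_normal((D, A))
v = rng.standard_normal(A)
w_out = rng.standard_normal((D, C))
b_out = np.zeros(C)

context, weights = attention_pool(hidden, w, v)
probs = classify(context, w_out, b_out)
print("frame weights:", np.round(weights, 3))
print("predicted class:", int(np.argmax(probs)))
```

In a trained model the parameters `w`, `v`, `w_out`, and `b_out` would be learned jointly with the recurrent network; here they are random, so the printed weights only demonstrate the shapes and the normalization.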

Keywords/Search Tags:

deep learning, residual two-stream network, attention, Bi-LSTM, double unidirectional connection

Related items

1	Research On Several Modeling Problems In Deep Learning Speech Recognition Systems
2	Research On Gaussian Noise Image Denoising Algorithm Based On Dual Channel Extended Convolution Attention And Residual-dense Block
3	Research On Deep Image Denoising Network With Attention Mechanism
4	Studies On Action Recognition In Video Based On Deep Learning
5	BiLSTM-CNN Text Classification Based On Attention Mechanism And Residual Connection
6	Image Super-Resolution Reconstruction Research Based On Deep Learning And Attention Mechanism
7	Design And Realization Of Recognition System Of Museum Visitors’ Violation Behavior
8	Named Entity Recognition Based On LSTM With Hierarchical Residual Connection
9	String Recognition Research Based On Deep Learning
10	Research On Machine Reading Comprehension Based On Attention Mechanism