
Video Behavior Analysis Based On Deep Learning

Posted on: 2021-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhu
Full Text: PDF
GTID: 2428330611496890
Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, with the growth of the Internet, massive volumes of video are uploaded every day from surveillance systems, web cameras, and individual users. Auditing and screening these videos by hand would require an enormous amount of work that is practically impossible to complete. As a branch of computer vision, video behavior analysis has therefore attracted wide attention from researchers and institutions, having already produced notable research results and economic benefits in areas such as intelligent monitoring systems, video retrieval, and human-computer interaction.

Traditional behavior recognition methods identify the category of a behavior by designing and constructing a model that represents it and analyzing hand-crafted features. However, such methods based on manually designed features involve many stages, incur a large time overhead, and are difficult to optimize as a whole. To address these problems, this thesis proposes a spatial-temporal self-attention motion feature extraction method based on deep learning. The main work is as follows:

(1) Current video behavior analysis methods are surveyed. Among traditional algorithms, hand-crafted feature extraction, classifier design, and the mainstream iDT method are studied; among deep learning methods, the classic two-stream network and TSN, as well as C3D among 3D convolution methods, are studied.

(2) A spatial self-attention mechanism is designed. Because video scenes are complex, extracting key motion features is difficult. This thesis proposes a 3D spatial self-attention mechanism: the attention distribution, computed by combining the scene with inter-frame motion information, focuses on the effective motion area and, to a certain extent, avoids the influence of irrelevant motion features.

(3) A Temporal 3D ResNeXt Block is proposed. Differences in video shooting equipment and video encoding methods can produce different frame rates, and the speed of the same action also varies from person to person. To handle this, the convolution in the temporal dimension uses multi-scale convolution kernels, which increases adaptability across time spans. In terms of model structure, this thesis designs a pyramid structure to obtain the value, query, and key matrices of the spatial self-attention mechanism, so that the features fed into the attention mechanism are richer.

Experimental results on the UCF101 and HMDB51 datasets show that the proposed method effectively improves recognition accuracy.
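The spatial self-attention idea described above, where every spatial position attends to every other position via query, key, and value projections, can be sketched in plain NumPy. The tensor shapes, the random projection matrices, and the scaled-softmax formulation are illustrative assumptions, not the thesis's exact (pyramid-structured, learned) design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(feat, d_k=8, seed=0):
    """Attend over the spatial positions of a (C, H, W) feature map.

    The random Wq/Wk/Wv projections are placeholders standing in for
    the learned pyramid projections used in the thesis.
    """
    C, H, W = feat.shape
    x = feat.reshape(C, H * W).T            # (N, C): one row per position
    rng = np.random.default_rng(seed)
    Wq = rng.standard_normal((C, d_k))
    Wk = rng.standard_normal((C, d_k))
    Wv = rng.standard_normal((C, C))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (N, N) weights over positions
    out = attn @ V                          # features re-weighted by attention
    return out.T.reshape(C, H, W), attn

feat = np.random.default_rng(1).standard_normal((4, 5, 5))
out, attn = spatial_self_attention(feat)
print(out.shape, attn.shape)   # (4, 5, 5) (25, 25)
```

Each row of `attn` sums to 1, so every output position is a convex combination of the value vectors, which is how the mechanism can concentrate weight on the effective motion area.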
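The multi-scale temporal convolution used to absorb frame-rate and action-speed variation can likewise be illustrated on a single 1-D temporal signal. Uniform (averaging) kernels of sizes 3, 5, and 7 are assumptions standing in for the learned 3D kernels of the proposed Temporal 3D ResNeXt Block; only the multi-scale idea is shown.

```python
import numpy as np

def multi_scale_temporal_conv(x, kernel_sizes=(3, 5, 7)):
    """Convolve a (T,) temporal signal with kernels of several sizes
    and stack the results, one output row per temporal scale.

    Uniform placeholder kernels; in the thesis these weights are learned.
    """
    outs = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k                      # averaging placeholder
        outs.append(np.convolve(x, kernel, mode="same"))
    return np.stack(outs)                            # (num_scales, T)

t = np.arange(16, dtype=float)
y = multi_scale_temporal_conv(np.sin(t / 2))
print(y.shape)   # (3, 16)
```

A fast action is captured well by the small kernel while a slow one is summarized by the large kernel; stacking the scales lets later layers pick whichever temporal span fits.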
Keywords/Search Tags: deep learning, video behavior recognition, self-attention, feature fusion, two-stream network