Research On Video Behavior Recognition Technology Based On Spatio-Temporal Modeling

Posted on:2024-06-14

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Liang


GTID:2568307106967589

Subject:Information and Communication Engineering

Abstract/Summary:

Video behavior recognition is widely used in fields such as intelligent security and video retrieval, and is an important task in video understanding. Although applying deep learning to video behavior recognition has achieved excellent results, there is still room for further improvement. First, different behaviors in a video have different time spans and are unevenly distributed, which makes it difficult to extract spatial and temporal information efficiently. Second, the temporal structure of video is complex and diverse, and effective long-term temporal modeling remains a difficult problem in video behavior recognition. To address these problems, this thesis studies video behavior recognition technology based on spatio-temporal modeling. The main contents are as follows:

(1) To address the difficulty of efficiently extracting spatial and temporal information when behaviors in videos differ in time span and are unevenly distributed, this thesis proposes a spatio-temporal modeling behavior recognition algorithm based on behavior key frame sampling and temporal difference. The algorithm uses ResNet50 as the backbone network and samples the video with a behavior key frame sampling method: it first calculates the probability distribution of the amount of motion information in the video frames, then groups the frames with a strategy that divides the amount of motion information evenly among the groups, and finally randomly samples video frames from each group, thereby achieving adaptive sampling of behavior key frames for different videos. The algorithm also proposes a temporal difference module, which obtains the temporal difference map by computing the difference between the sampled frame and its two preceding and following frames, extracts spatial features from the sampled frame to model spatial information, extracts temporal features from the temporal difference map to model temporal information, and finally fuses the extracted spatial and temporal features to realize spatio-temporal information modeling in the video.

(2) To address the difficulty of effectively modeling long-term temporal information caused by the complex temporal structure of video, this thesis further studies the spatio-temporal modeling behavior recognition algorithm constructed above and proposes a spatio-temporal modeling behavior recognition algorithm incorporating temporal adaptive motion excitation and an attention mechanism. The algorithm proposes a temporal adaptive motion excitation module, which first enhances the motion channels and suppresses useless background information by computing feature-level temporal differences between video clips, then generates video-dependent temporal adaptive convolution kernels based on the long-term temporal information of the video, and finally uses convolution to aggregate the long-term temporal information in the video to achieve long-term temporal modeling. In addition, the algorithm incorporates a coordinate attention mechanism that encodes spatial coordinate information while constructing channel attention, enabling the network to precisely locate and enhance features related to behavior recognition and thus further improving performance.

The spatio-temporal modeling algorithm incorporating temporal adaptive motion excitation and the attention mechanism is validated on the temporal-related Something-Something V1 dataset and the scene-related HMDB51 dataset, where it achieves behavior recognition accuracies of 51.2% and 73.5%, respectively. The experimental results show that the algorithm proposed in this thesis can effectively improve the accuracy of behavior recognition.
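
The behavior key frame sampling and temporal difference steps in (1) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation, and several details are assumptions because the abstract does not specify them: the amount of motion is taken to be the mean absolute pixel difference between consecutive frames, one frame is drawn per group, and "two preceding and following frames" is read as two frames on each side. The function names motion_scores, sample_key_frames and temporal_difference_maps are hypothetical.

```python
import numpy as np


def motion_scores(frames):
    """Per-frame motion amount (assumption: mean absolute difference to the previous frame)."""
    n = len(frames)
    if n < 2:
        return np.zeros(n, dtype=np.float64)
    f = frames.astype(np.float32)
    diffs = np.abs(f[1:] - f[:-1]).mean(axis=tuple(range(1, f.ndim)))
    # The first frame has no predecessor, so it reuses the first difference.
    return np.concatenate([diffs[:1], diffs]).astype(np.float64)


def sample_key_frames(frames, num_groups, rng=None):
    """Split frames into groups of roughly equal motion mass and draw one index per group."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(frames)
    scores = motion_scores(frames)
    total = scores.sum()
    # Probability distribution of motion over frames (uniform for a static clip).
    prob = scores / total if total > 0 else np.full(n, 1.0 / n)
    cdf = np.cumsum(prob)
    # A group ends at the frame where cumulative motion crosses k / num_groups.
    targets = np.arange(1, num_groups) / num_groups
    edges = np.minimum(np.searchsorted(cdf, targets, side="left") + 1, n)
    bounds = np.concatenate([[0], edges, [n]]).astype(int)
    indices = []
    for g in range(num_groups):
        lo, hi = bounds[g], bounds[g + 1]
        if hi > lo:
            indices.append(int(rng.integers(lo, hi)))
        else:
            # Empty group: one frame carried several groups' worth of motion, so reuse it.
            indices.append(int(lo - 1))
    return np.array(indices)


def temporal_difference_maps(frames, idx, radius=2):
    """Differences between frame `idx` and its `radius` neighbours on each side (clamped at the ends)."""
    n = len(frames)
    center = frames[idx].astype(np.float32)
    offsets = [o for o in range(-radius, radius + 1) if o != 0]
    maps = [frames[min(max(idx + o, 0), n - 1)].astype(np.float32) - center for o in offsets]
    return np.stack(maps)
```

With this grouping, stretches of the video with little motion fall into few, long groups, while high-motion stretches are split into many short groups, so more samples land where the behavior actually occurs. In a full model the sampled frame and its difference maps would feed separate spatial and temporal branches of the backbone; that part is outside this sketch.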

Keywords/Search Tags:

video behavior recognition, spatio-temporal modeling, behavior key frame sampling, temporal difference, temporal adaptive motion excitation, attention mechanism
