Research On Action Recognition Method Based On Motion Feature Extraction And Spatio-temporal Feature Fusion

Posted on:2022-02-20

Degree:Master

Type:Thesis

Country:China

Candidate:X C Liu

GTID:2558307154475144

Subject:Engineering

Abstract/Summary:

Since the invention of the camera, video has become an important medium for storing information. With the rise and spread of short videos in recent years, video understanding has attracted growing attention. Human action recognition is a challenging task within video understanding; its main goal is to model and classify human actions in videos. To extract motion features effectively and fuse spatio-temporal features, this thesis focuses on two stages, the input frames and the feature maps, and proposes a method for the problems at each stage:

1. An action recognition method based on local motion modeling is proposed to supplement a single input frame with motion information from its local time window. A single frame within a local time window provides static information but cannot express motion. The method combines sparse and dense sampling to obtain the inter-frame RGB differences in each local time window, encodes them in the early stage of the network, and captures local motion features. These motion features are then added to the original features of the input frame to enrich the input information, laying the foundation for subsequent modeling in the network.

2. An action recognition method based on channel attention and spatio-temporal information fusion is proposed to emphasize action-related features and fully integrate spatio-temporal context information. The method first computes a motion-related attention weight for each channel of the feature map through a channel-level attention mechanism, then rescales the feature values by these weights to enhance motion-related features. It then uses the Temporal Shift Module to shift part of the channels along the time dimension, which promotes information exchange between adjacent frames, integrates spatio-temporal context information without adding parameters, and addresses the difficulty that 2D convolutional networks have in modeling temporal relationships.

Systematic experiments on multiple benchmark datasets evaluate the recognition accuracy of the two proposed methods when used alone and in combination. The experimental results demonstrate the effectiveness of the proposed methods.
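
As an illustration only, the two operations the abstract names explicitly, inter-frame RGB differencing and shifting part of the channels along the time dimension, can be sketched in NumPy as follows. The function names, the tensor layouts, the zero padding at the clip boundaries, and the `shift_div` fraction are assumptions made for this sketch and are not details taken from the thesis; the learned encoding of the differences and the channel attention step are omitted.

```python
import numpy as np


def rgb_difference(frames: np.ndarray) -> np.ndarray:
    """Difference between consecutive frames in one local time window.

    frames: array of shape (T, H, W, C); returns shape (T - 1, H, W, C).
    """
    frames = frames.astype(np.float32)  # avoid uint8 wrap-around on subtraction
    return frames[1:] - frames[:-1]


def temporal_shift(features: np.ndarray, shift_div: int = 8) -> np.ndarray:
    """Shift a fraction of the channels along the time axis, zero padded.

    features: array of shape (T, C, H, W).
    shift_div: hypothetical parameter; 1/shift_div of the channels move
    one step forward in time and another 1/shift_div move one step back.
    """
    fold = features.shape[1] // shift_div
    out = np.zeros_like(features)
    # Frame t receives the first `fold` channels from frame t - 1.
    out[1:, :fold] = features[:-1, :fold]
    # Frame t receives the next `fold` channels from frame t + 1.
    out[:-1, fold:2 * fold] = features[1:, fold:2 * fold]
    # All remaining channels stay in place.
    out[:, 2 * fold:] = features[:, 2 * fold:]
    return out


# Example: a 4-frame clip with 8 channels and 1x1 spatial size.
clip = np.arange(1, 33, dtype=np.float32).reshape(4, 8, 1, 1)
shifted = temporal_shift(clip, shift_div=4)
```

The shift itself only copies values between neighbouring frames, which is why it introduces no trainable parameters; any 2D convolution applied afterwards sees channels from the previous, current, and next frame at once.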

Keywords/Search Tags:

Computer vision, action recognition, video understanding, motion feature extraction, attention mechanism

Related items

1	Research On Video Action Recognition Model Based On Convolutional Neural Network With Attention Mechanism
2	Research And Implementation Of Video Action Recognition Based On Feature Fusion And Hybrid Attention Mechanism
3	General Interacting Object Detection Algorithms For Action Understanding
4	Key Techniques Of Content-based Intelligent Video Surveillance And The Applications In Public Security
5	Research On Coarse-to-fine Action Understanding Technologies For Video
6	Research On Key Technology Of Action Recognition Based On Visual Perception
7	Research And Application Of Temporal Action Detection In Videos
8	The Research And Application Of Action Detection And Recognition In Online Video Surveillance
9	Research On Computer Vision Based Human Action Recognition Technology
10	Research On Video Action Recognition Method Based On Spatio-temporal Feature Modeling