
Temporal Action Localization In Massive Multimedia Video Scenario

Posted on: 2020-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: H R Li
Full Text: PDF
GTID: 2518306518464784
Subject: Information and Communication Engineering

Abstract/Summary:
Temporal action localization is a challenging task that requires not only determining the category of video clips but also identifying the temporal boundaries (start and end time points) of action instances in untrimmed videos. In the era of big data, the explosive growth of multimedia information such as video highlights the importance of automatic behavior analysis and detection. However, because of the complex scene information in real multimedia videos and the complexity of human behavior, it remains difficult to design a robust, portable, and high-precision action localization algorithm. In this study, we developed a novel method for effectively extracting spatial and temporal features, generating features robustly and accurately, and achieving high-performance temporal action localization. The main innovations of this paper are:

1) Attention-based feature extraction and fusion. We propose an attention-based module that adaptively extracts important features and flexibly fuses spatial and temporal features to generate high-level semantic features.

2) Explicit discrimination and extraction of long- and short-term features. We apply two feature-extraction modules to extract long-term and short-term features respectively. Auxiliary losses placed at the output of the short-term feature-extraction module make each module focus on its own feature extraction, boosting the accuracy of the algorithm.

3) CNN-based long-term feature extraction module. We apply a CNN to extract long-term features of videos and use structured temporal pooling to dynamically adjust the receptive field in the time domain, so the receptive field of the extraction module has no fixed upper bound. Our module can also effectively extract global features of actions that last longer.

As a demonstration of its effectiveness, our method achieved state-of-the-art performance on two challenging datasets, THUMOS14 and ActivityNet.
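To illustrate the kind of attention-based fusion described in innovation 1), the following is a minimal NumPy sketch. It is not the thesis's actual module: the per-snippet scoring vector `w`, the scalar gate per stream, and the feature shapes are all illustrative assumptions. The idea shown is that a learned score turns into softmax attention weights that form a convex combination of the spatial and temporal feature streams for each video snippet.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(spatial, temporal, w):
    """Fuse per-snippet spatial and temporal features with an attention gate.

    spatial, temporal: arrays of shape (num_snippets, feat_dim)
    w: hypothetical learned scoring vector of shape (feat_dim,)
    Returns fused features of shape (num_snippets, feat_dim).
    """
    # Score each stream per snippet, then normalize across the two streams.
    scores = np.stack([spatial @ w, temporal @ w], axis=-1)  # (num_snippets, 2)
    gate = softmax(scores, axis=-1)                          # weights sum to 1
    # Convex combination of the two streams, elementwise per snippet.
    return gate[:, :1] * spatial + gate[:, 1:] * temporal

rng = np.random.default_rng(0)
S = rng.normal(size=(5, 8))   # spatial (appearance) features
T = rng.normal(size=(5, 8))   # temporal (motion) features
F = attention_fuse(S, T, rng.normal(size=8))
```

Because the gate weights are a softmax, each fused snippet feature lies between the corresponding spatial and temporal values, so neither stream is ever entirely discarded; the weighting simply shifts adaptively per snippet.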
Keywords/Search Tags:Action localization, Attention, Spatial-temporal feature, Video content analysis, Convolutional neural networks, Supervised learning