One-Stage Temporal Action Detection For Open-Set Scenario

Posted on:2024-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:J S Hu

GTID:2568306932955819

Subject:Information and Communication Engineering

Abstract/Summary:

Temporal signal event detection aims to detect the start and end times of events of interest in continuous temporal signals while identifying the category of each event. It has a wide range of applications in daily life and in the military field. This dissertation takes temporal action detection as a practical task and studies how to detect human action events in untrimmed video, recognizing the action category and locating the start and end times at the same time. Although some progress has been made in temporal action detection, existing methods still suffer from inaccurate action boundary detection and a high false positive rate, which seriously hinder the practical application of the technology. In response to these challenges, this dissertation studies one-stage temporal action detection in open-set scenarios. The main research contents are:

1) To address the inaccurate boundary detection of one-stage temporal action detection methods, this dissertation proposes a boundary- and region-based confidence evaluation algorithm for action proposals, which uses both global and local information to measure the confidence of each proposal. Using the two kinds of information jointly makes the confidence evaluation more accurate and, as a result, improves the accuracy of boundary detection. Experimental results indicate that the method achieves the best performance on two commonly used benchmark datasets.

2) To tackle the high false positive rate in action detection, this dissertation proposes a temporal action representation method based on sharpness minimization. To capture long-term action dependencies while avoiding interference from background information, the self-attention mechanism of the Transformer is used to extract details related to human motion, resulting in a high-quality action representation. In addition, the network is trained with a sharpness minimization algorithm, which leads to better generalization. Experiments on multiple datasets show that the method outperforms current state-of-the-art open-set models.
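As an illustration of what boundary accuracy means for a detected action, the overlap between a predicted segment and a ground-truth segment is commonly measured with temporal intersection over union (tIoU). The following is a minimal sketch, not the dissertation's code; the function name `temporal_iou` and the `(start, end)` tuple format in seconds are assumptions made for this example.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, in seconds.

    Hypothetical helper for illustration; not taken from the dissertation.
    """
    pred_start, pred_end = pred
    gt_start, gt_end = gt

    # Length of the overlap; zero when the segments do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))

    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection

    # Guard against degenerate zero-length segments.
    return intersection / union if union > 0 else 0.0


# A prediction that starts and ends 2 s early on a 4 s action.
score = temporal_iou((2.0, 6.0), (4.0, 8.0))
```

A detection is usually counted as correct only when its tIoU with a ground-truth action of the same category exceeds a chosen threshold, so small boundary errors can turn a detection into a false positive.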

Keywords/Search Tags:

Video Understanding, Open-Set Temporal Action Detection, Temporal Action Localization, Generalization

Related items

1	Temporal Convolutional Network Based Temporal Action Detection
2	Research On Temporal Action Detection Algorithm Based On Boundary Matchin
3	Research On Temporal Action Detection Based On Accurate Boundary Prediction
4	Research On Temporal Modeling Method For Video Understanding
5	Research On Temporal Action Location Method Combining Light And Heavy Networks In Untrimmed Video
6	Research On Temporal Action Localization And Sentence Query Localization In Videos
7	Design And Implementation Of Context Cascade Network For Video Temporal Action Detection
8	Research On Coarse-to-fine Action Understanding Technologies For Video
9	Video Action Detection Based On Deep Learning
10	Research On Temporal Action Detection In Video