Temporal action detection is one of the current research hotspots in computer vision. The task takes naturally captured video as input and outputs the start and end times of each specific action segment (temporal action proposal generation) together with the specific category of that action (action recognition). This paper studies the temporal action proposal generation task and the temporal action detection task in turn.

For the temporal action proposal generation task, existing methods struggle to locate action start and end boundaries accurately. To address this, this paper proposes a temporal action proposal generation network based on Boundary Prediction-Precise (BP-P). First, BP-P fuses local-level and proposal-level action features of the video sequence to make fuller use of the feature changes at the demarcation points between background and action, which improves the accuracy of action boundary localization. Second, BP-P proposes a new loss function, Free-Focal Loss, to address the imbalance between positive and negative samples and between hard and easy samples during training; Free-Focal Loss effectively balances the contributions of samples in different IoU intervals when the network weight gradients are updated. Finally, because the large gradients of hard samples are detrimental to the joint training of the classification and regression tasks, Balanced L1 Loss is introduced to promote the regression gradients of accurately localized samples. To demonstrate the effectiveness of the BP-P model on the temporal action proposal generation task, experiments are conducted on the publicly available ActivityNet-1.3 dataset. The results show that BP-P raises the AR@100 metric from 75.01% to 76.56%, which is comparable to the current best performance on this dataset (76.75%).

For the temporal action detection task, current one-stage frameworks offer high efficiency, while two-stage frameworks achieve high precision. To inherit the advantages of both while avoiding their disadvantages, this paper introduces, for the first time, the idea of fusing one-stage and two-stage frameworks from the RefineDet object detection algorithm into temporal action detection, and proposes the 3D RefineDet temporal action detection algorithm. The algorithm constructs a 3D detection network applicable to video features by temporally generalizing RefineDet's 2D modules. To demonstrate the effectiveness of 3D RefineDet on the temporal action detection task, experiments are conducted on the publicly available THUMOS-14 dataset. The results show that 3D RefineDet achieves significant improvements in the mAP@tIoU metric at multiple IoU thresholds, raising mAP from 50.1% to 53.6% at an IoU threshold of 0.3, an improvement of 3.5 percentage points.
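The exact form of the proposed Free-Focal Loss is not given here. For context only, the sketch below shows the standard alpha-balanced Focal Loss (Lin et al., 2017) on which such sample-rebalancing losses build: easy samples are down-weighted so that hard samples contribute more to the gradient. The function name and the alpha/gamma defaults are illustrative assumptions, not the thesis's formulation.

```python
import torch

def focal_loss(pred_prob, target, alpha=0.25, gamma=2.0, eps=1e-8):
    """Standard alpha-balanced Focal Loss (not the thesis's Free-Focal Loss).
    `pred_prob` holds probabilities in (0, 1); `target` holds binary labels.
    The (1 - p_t)^gamma factor down-weights easy samples so hard samples
    dominate the gradient update."""
    pt = pred_prob * target + (1 - pred_prob) * (1 - target)   # p_t
    at = alpha * target + (1 - alpha) * (1 - target)           # alpha_t
    return (-at * (1 - pt).pow(gamma) * (pt + eps).log()).mean()
```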
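Balanced L1 Loss is adopted from the object detection literature (Libra R-CNN, Pang et al., 2019). A minimal sketch of that published formulation follows, assuming the standard defaults alpha = 0.5, gamma = 1.5, beta = 1.0; whether the thesis changes these hyperparameters is not stated in the abstract.

```python
import math
import torch

def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5, beta=1.0):
    """Balanced L1 Loss as published in Libra R-CNN. Relative to Smooth L1,
    it raises the gradient contribution of accurately regressed (inlier)
    samples while keeping the gradient of outliers clipped at gamma."""
    diff = torch.abs(pred - target)
    # b is fixed by the continuity constraint alpha * ln(b + 1) = gamma
    b = math.exp(gamma / alpha) - 1
    loss = torch.where(
        diff < beta,
        alpha / b * (b * diff + 1) * torch.log(b * diff / beta + 1) - alpha * diff,
        gamma * diff + gamma / b - alpha * beta,
    )
    return loss.mean()
```

This behavior matches the stated motivation: the regression gradients of accurate (small-error) samples are promoted, while hard outliers cannot dominate the joint classification-regression training.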
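The abstract describes 3D RefineDet only as a temporal generalization of RefineDet's 2D modules. The sketch below merely illustrates that kind of generalization, replacing a 2D spatial convolution with a 3D spatio-temporal one; the block structure, channel sizes, and tensor shapes are hypothetical and not taken from the thesis.

```python
import torch
import torch.nn as nn

# A 2D conv block operating on image feature maps (N, C, H, W) ...
block_2d = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

# ... generalized to a 3D conv block operating on video feature volumes (N, C, T, H, W),
# so the kernel also convolves over the temporal axis T.
block_3d = nn.Sequential(
    nn.Conv3d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

video_features = torch.randn(1, 256, 8, 14, 14)   # (N, C, T, H, W), shapes are illustrative
print(block_3d(video_features).shape)             # torch.Size([1, 256, 8, 14, 14])
```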