| Video has become an important carrier of information. The volume of video is already large and is growing at an alarming rate, and most videos are untrimmed. The temporal action detection task takes an untrimmed video as input, locates the start and end times of each action instance, and recognizes its action category. The key to effective temporal action detection is accurately locating the temporal position of each action, that is, generating high-quality temporal action proposals. Building on the two-stage framework, this paper studies how to generate high-quality temporal action proposals. The specific research contents are as follows. To address inaccurate boundary localization, this paper proposes a self-attention-based Boundary-Matching Network (S-BMN). The method enlarges the temporal receptive field by introducing a self-attention mechanism into the convolution layers, so that the network can mine more complete features. In addition, a proposal relationship module with an attention mechanism is introduced, which mines the temporal information between proposals along both the position and channel dimensions. Results on the ActivityNet-1.3 dataset show that the AUC of S-BMN increases from 67.10 to 68.20, a gain of 1.1 points, reaching advanced performance. Furthermore, the boundary matching method ignores long-range context aggregation in boundary prediction. At the same time, since most actions last longer than a fixed value, generating action candidates at the finest time granule is not a good approach. To address these problems, this paper proposes a temporal action proposal generation algorithm based on coarse time granules (CTG-BMN). Specifically, a multi-path boundary prediction module (MPF) is designed to aggregate context information at the boundary level and achieve accurate boundary prediction. For proposal evaluation, a coarse time granule proposal evaluation module (CPEM) is designed to generate reliable proposal evaluation scores. Results on the ActivityNet-1.3 dataset show that, compared to S-BMN, the AUC of CTG-BMN increases from 67.10 to 68.20, a gain of 1.1 points, enabling accurate boundary prediction. |
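
The abstract states that self-attention enlarges the temporal receptive field. A minimal sketch of that idea, not the S-BMN implementation, is given below: plain scaled dot-product self-attention over a sequence of snippet-level features, in which every time step receives a non-zero weight from every other time step. The function name `temporal_self_attention`, the use of the raw features as queries, keys, and values (no learned projections), and the toy feature values are all assumptions made for illustration.

```python
import math


def temporal_self_attention(features):
    """Scaled dot-product self-attention over a (T, C) feature sequence.

    Assumption: queries, keys and values are the raw features themselves;
    a trained network would apply learned linear projections first.
    """
    channels = len(features[0])
    scale = math.sqrt(channels)
    outputs, weights = [], []
    for query in features:
        # Similarity of this time step to every time step in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in features]
        # Numerically stable softmax over the temporal axis.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Weighted sum over all time steps: the receptive field is the whole sequence.
        outputs.append([
            sum(w * value[c] for w, value in zip(row, features))
            for c in range(channels)
        ])
    return outputs, weights


# Toy snippet-level features: 5 time steps, 3 channels (made-up values).
toy_features = [
    [0.1, 0.0, 0.2],
    [0.9, 0.8, 0.1],
    [1.0, 0.9, 0.0],
    [0.8, 1.0, 0.1],
    [0.0, 0.1, 0.3],
]
attended, attention = temporal_self_attention(toy_features)
```

Each row of `attention` is a probability distribution over all time steps, so the output at any position can draw on the full sequence, whereas a temporal convolution with a small kernel only sees its immediate neighbours.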