Research On Video Temporal Action Detection Method Based On Deep Learning

Posted on: 2022-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Li
Full Text: PDF
GTID: 2518306494486604
Subject: Signal and Information Processing
Abstract/Summary:
Video temporal detection has high practical value. Depending on the application scenario, it can be broadly divided into offline and online tasks. Offline applications do not need to consider real-time performance, so the algorithm can produce its results after processing the complete video; examples include intelligent extraction of highlights from sports games or film and television series, and screening of abnormal segments in industrial inspection videos. In the online setting, the algorithm must output results in real time with access only to historical video information; examples include automatic alarms for abnormal behavior in surveillance video and intelligent assisted slow-motion capture of highlights on a mobile phone. The high similarity of information between video frames and the wide variation in temporal length make video temporal detection more challenging than recognition tasks. In the online setting, the algorithm design must additionally cope with the lack of future information. This thesis studies both the offline and online directions. The main research contents are as follows:

(1) To address the problems of bottom-up temporal detection methods in the offline setting, BMN is improved and BMN++, an offline temporal proposal generation method enhanced with global information, is proposed. First, nonlocal modules are used to integrate global temporal information and improve the performance of the proposal evaluation network. Second, the supervision of the boundary evaluation network is modified to take the form of a Gaussian distribution, which enhances the network's sensitivity to boundary information. Experiments on public datasets verify the effectiveness of these improvements.

(2) To meet the application requirements of real-time assisted shooting on mobile phones, the online highlight start detection (OHSD) task is proposed, and Highlight45, a temporal detection dataset suited to short-video scenes on mobile phones, is constructed, filling the gap in datasets for online detection in short videos. The average precision of the first detection is designed as an online evaluation metric for this task. Visualization results for specific categories show that this metric suits the requirements of online start detection better than existing metrics.

(3) Highlight-Net, an end-to-end hybrid dual-stream network, is designed for the OHSD task. Highlight-Net uses a dilated causal convolution network to model motion information and uses late fusion to exploit the information of the two modalities more effectively. To address the high similarity of adjacent frames in the start region, a sequence contrastive loss is designed to supervise the network to increase the discrimination between background and foreground in optical flow feature modeling, making the final output more accurate.
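
The abstract gives no implementation details for the Gaussian boundary supervision in (1) or the dilated causal convolution in (3). The following is only a minimal sketch of the two general ideas, not the thesis's method. It assumes unnormalized Gaussian labels that peak at 1.0 on the annotated boundary, and a single-channel convolution with implicit zero padding on the left. The function names, the `sigma` parameter, and the toy inputs are hypothetical.

```python
import math


def gaussian_boundary_labels(num_steps, boundary, sigma):
    """Soft boundary labels: 1.0 at the annotated boundary position,
    decaying with squared distance from it (assumed label shape)."""
    return [
        math.exp(-((t - boundary) ** 2) / (2.0 * sigma ** 2))
        for t in range(num_steps)
    ]


def causal_dilated_conv1d(signal, kernel, dilation):
    """Causal dilated 1D convolution on a single channel.

    y[t] = sum_k kernel[k] * signal[t - k * dilation]
    Positions before the start of the signal count as zero, so y[t]
    never depends on inputs later than t.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the first frame
                acc += weight * signal[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    # Soft labels around a boundary at position 3 in a 7-step sequence.
    print(gaussian_boundary_labels(7, 3, 1.0))
    # Two-tap kernel with dilation 2: each output sums frames t and t - 2.
    print(causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], 2))
```

With the toy input above, the convolution returns `[1.0, 2.0, 4.0, 6.0, 8.0]`. Because every tap looks only backwards in time, changing a later frame leaves earlier outputs unchanged, which is the property an online detector relies on.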
Keywords/Search Tags: temporal action detection, nonlocal module, online start detection, sequential contrastive loss