Temporal Action Proposal Generation Based On Multi-modal Information Fusion

Posted on: 2022-03-29  Degree: Master  Type: Thesis
Country: China  Candidate: S J Sun  Full Text: PDF
GTID: 2518306494486444  Subject: Computer technology
Abstract/Summary:
With the rapid growth in the number of videos on the Internet, video content analysis has attracted wide attention in academia and industry. Temporal action detection is a fundamental task in video understanding: it aims to detect the start time, end time, and category of each action in a long untrimmed video. Existing action recognition methods are relatively mature, but they do not yet achieve good detection performance on the task of temporal action proposal generation. As a result, more and more researchers are working on temporal action proposal generation in untrimmed videos, a task that only needs to locate the start and end of each action. Meanwhile, the rapid development of sensor equipment has brought multi-modal data to the field of artificial intelligence, and multi-modal information fusion has also received extensive attention from academia and industry. This thesis studies temporal action proposal generation based on multi-modal information fusion. The work has three parts.

First, this thesis collects and processes a large multi-modal temporal action detection dataset. Few public datasets for temporal action detection exist at present, and most cover only a single modality. The new dataset greatly enriches the research data in this field and makes research on temporal action detection and related tasks in multi-modal learning possible.

Second, this thesis proposes PSM (Phase-Sensitive Model), a phase-sensitive method for temporal action proposal generation. The method evaluates confidence score maps of different action phases for all possible temporal proposals. After computing boundary probabilities from the boundary-phase confidence maps, it fuses the boundary and action scores into a confidence score for each proposal and ranks the proposals by that score. The method yields high-quality temporal proposals and performs well in experiments on related datasets.

Finally, following earlier multi-modal information fusion methods, this thesis runs early fusion and late fusion experiments on the temporal action proposal generation dataset and analyzes the results. When the results of the two modalities differ greatly, multi-modal information fusion is not necessarily better than using a single modality; when the results of the two modalities are similar, fusion can give better results.
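
The scoring and fusion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the exact rule for fusing boundary and action scores, so the product used here is an assumption, and the function names, the `weight_a` parameter, and the sample numbers are all hypothetical. Early fusion is shown as feature concatenation and late fusion as a weighted average of per-proposal scores, which are common readings of those terms rather than the specific variants tested in the thesis.

```python
from typing import List, Sequence, Tuple

Proposal = Tuple[float, float]  # (start_time, end_time), e.g. in seconds


def fuse_boundary_and_action(start_prob: float, end_prob: float,
                             action_score: float) -> float:
    """Assumed fusion rule: product of boundary probabilities and action score."""
    return start_prob * end_prob * action_score


def rank_proposals(proposals: Sequence[Proposal],
                   start_probs: Sequence[float],
                   end_probs: Sequence[float],
                   action_scores: Sequence[float]) -> List[Tuple[Proposal, float]]:
    """Score every proposal, then sort from highest to lowest confidence."""
    scored = [
        (p, fuse_boundary_and_action(s, e, a))
        for p, s, e, a in zip(proposals, start_probs, end_probs, action_scores)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def early_fusion(feat_a: Sequence[float], feat_b: Sequence[float]) -> List[float]:
    """Early fusion: concatenate the two modalities' features before modeling."""
    return list(feat_a) + list(feat_b)


def late_fusion(scores_a: Sequence[float], scores_b: Sequence[float],
                weight_a: float = 0.5) -> List[float]:
    """Late fusion: weighted average of scores from two single-modality models."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both modalities must score the same proposals")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    ranked = rank_proposals(
        proposals=[(0.0, 5.0), (2.0, 8.0)],
        start_probs=[0.9, 0.5],
        end_probs=[0.8, 0.5],
        action_scores=[0.7, 0.9],
    )
    for proposal, score in ranked:
        print(proposal, round(score, 3))
    print(late_fusion([0.8, 0.2], [0.4, 0.6]))
```

With a plain average, a weak modality pulls the fused score away from the stronger modality's score, which is one intuitive way to read the finding that fusion does not necessarily help when the two modalities perform very differently.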
Keywords/Search Tags: Temporal action proposal generation, Temporal action detection, Multi-modal information fusion, Deep learning