Research On Action Detection Methods For Infrared Videos

Posted on: 2022-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Chen
Full Text: PDF
GTID: 2518306575467214
Subject: Information and Communication Engineering
Abstract/Summary:
Action detection plays an important role in video understanding and has attracted much attention in recent years. Current action detection methods are mainly designed for visible-light videos. However, they cannot be applied in low-light environments because of the limited imaging characteristics of visible-light video. In contrast, infrared videos are insensitive to illumination and resistant to background disturbances, which makes them suitable for action detection in low-light scenes. Moreover, current research on infrared videos mainly focuses on action recognition, and little of it addresses action detection, which is the more valuable task for real-life applications. Against this background, this thesis studies action detection in infrared videos. The main contents are summarized as follows:
First, this thesis constructs a new infrared action detection dataset and builds an effective action detection framework for infrared videos. The model takes the entire video as input and uses a lightweight optical flow estimation network to extract features from the original video. Unlike conventional action detection frameworks, the designed framework optimizes the flow estimation network jointly with the other components so that the final detection favors action-related areas. A module that automatically selects among different streams is designed based on the attention mechanism. It can be applied to common action detection architectures that adopt multi-stream data fusion, and it attends to discriminative features according to the correlation between different streams and different time stamps. Experimental results on the infrared action detection dataset show that this method surpasses other action detection methods designed for visible-light videos. It also achieves state-of-the-art performance on two infrared action recognition benchmarks.
Second, infrared action detection is mainly used in video surveillance applications, which require a real-time and efficient architecture. To reduce the high computational cost of the whole detection framework, this thesis redesigns the feature extraction network, which is the most time-consuming part. A new video reconstruction method is proposed to transform a 3D video into a 2D spatio-temporal image. To handle this type of data, modifications to the original 2D convolution operation are designed. By combining the spatio-temporal image with the new convolution operator, the network can extract spatio-temporal features at almost the same computational cost as general 2D networks. In addition, a self-supervised learning paradigm is proposed to help the model understand the context of the spatio-temporal image. Finally, the proposed method can be integrated into the aforementioned infrared action detection framework, where it reduces the total computation of feature extraction by 83% while performance drops by less than 1%. It can also be applied to other tasks such as action recognition and dynamic texture recognition, where it achieves competitive results, which verifies the generalization ability of the proposed backbone.
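The abstract does not say how the 3D video is rearranged into a 2D spatio-temporal image. As a purely illustrative sketch, and not the thesis's method, one simple possibility is to tile the frames on a grid so that an ordinary 2D network sees several time steps at once. The function name `tile_frames`, the `cols` parameter, and the zero padding of unused grid cells are all assumptions made for this example.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame: H rows of W pixel values


def tile_frames(video: List[Frame], cols: int) -> Frame:
    """Lay T frames of size H x W out on a grid with `cols` frames per row.

    Hypothetical layout: the abstract does not specify how the 2D
    spatio-temporal image is actually built.
    """
    if not video or cols <= 0:
        raise ValueError("need at least one frame and a positive column count")
    height, width = len(video[0]), len(video[0][0])
    rows = -(-len(video) // cols)  # ceiling division
    # Pad with blank frames so the last grid row is complete (an assumption).
    blank = [[0.0] * width for _ in range(height)]
    padded = video + [blank] * (rows * cols - len(video))
    image: Frame = []
    for r in range(rows):
        row_frames = padded[r * cols:(r + 1) * cols]
        for y in range(height):
            line: List[float] = []
            for frame in row_frames:
                line.extend(frame[y])  # concatenate row y of each frame
            image.append(line)
    return image


# Three 2x2 frames filled with 1.0, 2.0 and 3.0, tiled two per grid row.
video = [[[float(k)] * 2 for _ in range(2)] for k in (1, 2, 3)]
image = tile_frames(video, cols=2)
```

In this layout, temporally adjacent frames become spatial neighbours in the tiled image, which gives one intuition for why a modified 2D convolution could capture motion cues at roughly 2D cost.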
Keywords/Search Tags: infrared action detection, deep learning, optical flow estimation, self-supervised learning