
Temporal Information And Multi-Scale Fusion Based Video Object Detection

Posted on: 2021-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Zhao
GTID: 2518306050970459
Subject: Master of Engineering
Abstract/Summary:
Deep learning has brought breakthroughs in object detection for still images, but research on video object detection with deep networks is still at an early, exploratory stage. Because of the limitations of the capture device, video frames suffer from problems that are rare in still images, such as defocus, motion blur, occlusion, and rare poses. At the same time, consecutive video frames are related by temporal context, which is information that a single image does not provide. This thesis builds on deep network frameworks for image object detection: it uses the inter-frame temporal correlation of video to improve detection on hard video frames, and it further strengthens the detection of small and medium-sized video objects by fusing multi-scale feature maps that carry temporal information. The proposed methods are also applied to aircraft detection in remote sensing video. The main work is as follows:

(1) A video object detection method based on an adaptive temporal correction mechanism is proposed. Attention mechanisms are a common way to handle sequence problems, and this method draws on them to model the temporal dimension of video. It first uses a base deep network to obtain the global features of each video frame, then mines the correlation of local neighborhood features between adjacent frames and uses it to correct the local features of each frame by adaptive weighting. The corrected features carry temporal context from the video and are therefore more robust. Verification and analysis on the ImageNet VID dataset show that this method outperforms frame-by-frame video object detection with Faster R-CNN (Faster Region-based Convolutional Neural Network), improving mAP by 8%.

(2) A video object detection method based on temporal multi-scale feature fusion is proposed. Feature pyramid networks effectively combine the strengths of multi-scale feature maps, which shows clearly in the detection of small objects. This method borrows that structure. It first uses a deep feature extraction network to obtain multi-level global features at different scales for each video frame, then applies the adaptive temporal correction mechanism to correct the features at each scale using local neighborhood features from adjacent frames, and finally fuses the corrected multi-scale feature maps through upsampling and lateral connections. The fused features contain both the temporal information of the video and the multi-scale information of each frame, with local neighborhood correlation. Comparison experiments with methods such as DFF (Deep Feature Flow) show that the proposed method improves the detection of small objects in video sequences; its mAP is 1.8% higher than that of DFF.

(3) Remote sensing video is characterized by high resolution, small objects, and large scale variation. With these characteristics in mind, the overall pipeline for remote sensing video object detection is described, covering reading, blocking, data augmentation, merging, and other processing steps for remote sensing video frames, with particular attention to how different blocking schemes affect the complete detection of objects at block edges. Finally, taking two remote sensing videos as examples, the video object detection methods proposed in (1) and (2) are tested on remote sensing video containing aircraft. The experimental results show that the methods detect aircraft in remote sensing video well and also perform well on small objects.
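
The blocking and merging steps mentioned in (3) can be illustrated with a minimal sketch. The abstract does not specify the blocking scheme or how results are merged, so everything here is an assumption made for illustration: square blocks with a fixed overlap, detections given as `(x1, y1, x2, y2, score)` tuples in block coordinates, and merging by greedy non-maximum suppression in frame coordinates. The names `tile_starts`, `tile_offsets`, and `merge_detections` are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of blocks of size `tile` that cover [0, length) with `overlap` pixels shared."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # shift the last block back so it ends at the frame border
    return starts


def tile_offsets(width, height, tile, overlap):
    """Top-left (x, y) offset of every block, row by row."""
    return [(x, y)
            for y in tile_starts(height, tile, overlap)
            for x in tile_starts(width, tile, overlap)]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_detections(per_tile, iou_threshold=0.5):
    """Map per-block boxes to frame coordinates, then drop duplicates with greedy NMS.

    per_tile: list of ((offset_x, offset_y), [(x1, y1, x2, y2, score), ...]).
    """
    boxes = [(x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)
             for (ox, oy), dets in per_tile
             for (x1, y1, x2, y2, score) in dets]
    boxes.sort(key=lambda b: b[4], reverse=True)  # highest score first
    kept = []
    for box in boxes:
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


# Usage: a 1000x400 frame split into 400-pixel blocks with 100 pixels of overlap.
offsets = tile_offsets(1000, 400, tile=400, overlap=100)
# The same object is seen in two overlapping blocks and reported once after merging.
merged = merge_detections([
    ((0, 0), [(350, 100, 390, 140, 0.9)]),
    ((300, 0), [(50, 100, 90, 140, 0.8)]),
])
```

Under these assumptions, an object that would be cut by a block edge without overlap lies fully inside at least one block whenever the overlap is at least as large as the object, which is one reason an overlapping scheme might be preferred for small aircraft.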
Keywords: video object detection, remote sensing video, attention mechanism, multi-scale features, deep learning