Study On Deep Neural Network's Stability And Application In Video Object Detection

Posted on: 2021-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: H M Deng
Full Text: PDF
GTID: 2518306503972189
Subject: Computer technology
Abstract/Summary:
Recent approaches interpret deep neural networks (DNNs) as dynamical systems, drawing a connection between stability in forward propagation and the generalization of DNNs. In this paper, we go a step further: we are the first to reinforce this stability without changing a DNN's original structure, and we verify the impact of the reinforced stability on the network's representation from several angles. More specifically, we reinforce stability by modeling the attractor dynamics of a DNN and propose the relu-max attractor network (RMAN), a lightweight module that can be readily deployed on state-of-the-art ResNet-like networks. RMAN is needed only during training, where it modifies a ResNet's attractor dynamics by minimizing an energy function together with the loss of the original learning task. Through extensive experiments, we show that RMAN-modified attractor dynamics give ResNet and its variants a more structured representation space and, more importantly, improve the generalization ability of ResNet-like networks in supervised tasks as a result of the reinforced stability.

Besides studying the properties of deep neural networks, this paper explores their application to video object detection. Video object detection is more challenging than image object detection because of deteriorated frame quality. To enhance the feature representation, state-of-the-art methods propagate temporal information into the deteriorated frame by aligning and aggregating entire feature maps from multiple nearby frames. However, because feature maps are storage-inefficient and their content-address allocation is vulnerable, these methods do not fully exploit long-term temporal information. In this work, we propose the first object-guided external memory network for online video object detection. Storage efficiency is handled by object-guided hard attention, which selectively stores valuable features, and long-term information is protected by storing it in an addressable external data matrix. A set of read/write operations is designed to accurately propagate, allocate, and delete multi-level memory features under object guidance. We evaluate our method on the ImageNet VID dataset and achieve state-of-the-art performance as well as a good speed-accuracy tradeoff. Furthermore, by visualizing the external memory, we show the detailed object-level reasoning process across frames.
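
The external memory idea described above can be illustrated with a toy sketch. This is not the thesis's implementation: the class name `ExternalMemory`, the score threshold, the oldest-first deletion policy, and the cosine-similarity read are all assumptions made for illustration, and features are plain Python lists rather than multi-level feature maps.

```python
import math


class ExternalMemory:
    """Toy addressable memory: a bounded list of feature vectors (hypothetical design)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []  # each slot holds one feature vector (list of floats)

    def write(self, features, object_scores, threshold=0.5):
        """Hard attention: store only features whose object score passes the threshold."""
        for feat, score in zip(features, object_scores):
            if score < threshold:
                continue  # background-like feature, not stored
            if len(self.slots) >= self.capacity:
                self.slots.pop(0)  # assumed deletion policy: drop the oldest slot
            self.slots.append(list(feat))

    def read(self, query):
        """Content-based read: return the stored vector most similar to the query."""
        if not self.slots:
            return None
        return max(self.slots, key=lambda slot: _cosine(slot, query))


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


memory = ExternalMemory(capacity=2)
# Frame 1: two locations, only the first has a high object score.
memory.write([[1.0, 0.0], [0.2, 0.2]], object_scores=[0.9, 0.1])
# Frame 2: one object location.
memory.write([[0.0, 1.0]], object_scores=[0.8])
stored = len(memory.slots)      # only the two object features were kept
best = memory.read([0.9, 0.1])  # closest stored feature to the query
```

In this sketch, the threshold plays the role of hard attention (low-scoring features never enter memory), while the bounded slot list stands in for the addressable data matrix that persists across frames.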
Keywords/Search Tags: deep neural network, computer vision, video object detection, external memory, attention mechanism