Video Object Detection Based On Deep Learning

Posted on:2021-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:R Y Zhang

Full Text:PDF

GTID:2428330614970730

Subject:Signal and Information Processing

Abstract/Summary:

Video object detection is an essential and central topic in computer vision. It serves as a core technique in many practical applications, e.g. video surveillance, autonomous driving, military reconnaissance and intelligent nursing. With the development of convolutional neural networks (CNNs), encouraging breakthroughs have been made in detecting objects in still images. However, directly applying those techniques to video is often unsatisfactory, because camera or object motion degrades the appearance of objects, and objects in such degraded frames are difficult to detect. Achieving satisfactory results requires both image detection techniques and the temporal information present in video. Existing high-performing methods aggregate features along the motion path estimated by optical flow. However, two problems arise when the features are fused. First, the differences between frames are not taken into account when fusing the features estimated from adjacent frames. Second, there is a gap between optical flow and high-level features. This thesis analyses and addresses these shortcomings in existing methods in order to improve detection performance. The main contributions are summarized as follows:

(1) A video object detection method based on adaptive multi-frame feature aggregation is proposed. The method is based on MANet. When performing multi-frame feature fusion along the motion path, MANet does not consider the differences between the features propagated from different frames and assigns every frame the same weight. We introduce a contribution module that generates adaptive weights based on the content of the video frames, so that richer dynamic information can be obtained. Experiments on ImageNet VID show the effectiveness of the method.

(2) A video object detection method based on motion and attention is proposed. Estimating and aggregating high-level features along the motion path with an optical flow network is a popular way to exploit temporal information in video object detection. However, flow estimation can lead to poor feature alignment between frames. We introduce an attention mechanism to the task of video object detection and compute the attention of each adjacent frame relative to the reference frame to capture temporal context information. Estimated features can thus be obtained directly in the high-level feature space, which further improves the accuracy of the algorithm.

(3) After analysing and improving the accuracy of the algorithm, we design and implement a video object detection system based on deep learning. This part is based on the algorithm in Chapter 4. We describe the application scenarios, the flow chart of the system, and the display of the system interface. After the system platform was built, we tested it; the detection system performs well, laying a foundation for further deployment in practical scenarios.
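The difference between equal-weight fusion and content-adaptive fusion can be illustrated with a minimal NumPy sketch. It is not the thesis's contribution module, whose design the abstract does not specify. As an assumption, the weights here come from a softmax over the per-position cosine similarity between each propagated feature map and the reference-frame feature map. The function names `aggregate_uniform` and `aggregate_adaptive` are hypothetical, and the features are assumed to be already warped to the reference frame.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_uniform(feats):
    """Equal-weight fusion: every frame contributes 1/T.

    feats: array of shape (T, C, H, W), features from T frames that are
    assumed to be already propagated (warped) to the reference frame.
    """
    return feats.mean(axis=0)


def aggregate_adaptive(feats, ref, eps=1e-8):
    """Content-adaptive fusion (illustrative assumption, not the thesis's module).

    A weight is computed for each frame at each spatial position from the
    cosine similarity between that frame's feature vector and the reference
    feature vector, then normalised over frames with a softmax.

    feats: (T, C, H, W) propagated features; ref: (C, H, W) reference features.
    Returns the fused (C, H, W) feature map and the (T, H, W) weights.
    """
    dot = (feats * ref[None]).sum(axis=1)                      # (T, H, W)
    norms = np.linalg.norm(feats, axis=1) * np.linalg.norm(ref, axis=0)[None]
    similarity = dot / (norms + eps)                           # (T, H, W)
    weights = softmax(similarity, axis=0)                      # sums to 1 over T
    fused = (weights[:, None] * feats).sum(axis=0)             # (C, H, W)
    return fused, weights
```

With this weighting, frames whose propagated features agree with the reference frame contribute more, and when all frames are identical the result reduces to the equal-weight average. A learned module could replace the cosine similarity with weights predicted from frame content.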

Keywords/Search Tags:

Convolutional neural network, Video object detection, Optical flow estimation, Adaptive weights, Attention mechanism, System platform

Related items

1	Research On Object Detection Model Based On Convolutional Neural Network
2	Research On Optical Flow Prediction Algorithm Based On Improved Convolutional Neural Network
3	Research On Video Object Detection Based On Multi-Attention Mechanism
4	Video Object Detection Based On Adaptive Convolution Network And Visual Attention Mechanism
5	Research On Video Object Detection Based On Neural Network
6	Video Saliency Detection Based On Improved Attention Network And Data Augmentation
7	Video Object Detection Based On Attention Mechanism And Multi-Scale Feature Fusion Convolutional Network
8	Salient Object Detection And Application Based On Attention Mechanism Convolutional Neural Network
9	Research On Video Object Detection Based On Feature Propagation And Fusion
10	Research On Unsupervised Video Multi-object Segmentation Algorithm