
Video Object Detection Based On Deep Learning

Posted on: 2020-06-06    Degree: Master    Type: Thesis
Country: China    Candidate: R Liu    Full Text: PDF
GTID: 2428330590984479    Subject: Circuits and Systems
Abstract/Summary:
Target detection is a branch of computer vision research that aims to give computers human-like visual ability. It originated in the image domain, where the task is not only to identify the categories of the targets in an image but also to locate each target. Traditional methods mainly use sliding windows and regular blocks to extract candidate regions, then combine HOG features with an SVM classifier to detect the target. With the rapid development of deep learning, more and more researchers have applied it to target detection and proposed classical algorithms such as Faster-RCNN and YOLO. Deep learning has achieved strong results in image processing, semantic segmentation, and pattern recognition, and has become the dominant approach in target detection.

However, as the accuracy of image target detection has increased year by year, further progress has gradually reached a bottleneck, and more and more researchers are moving target detection from images to video. A video usually contains a large number of frames with substantial pixel redundancy between them. How to use the context information across frames to improve the speed and accuracy of target detection has become a focus of researchers worldwide.

This thesis studies video object detection methods based on deep learning. First, related technologies such as neural networks and deep learning are described, with emphasis on the basic principle of the R-FCN target detection algorithm. Second, the thesis studies in depth the optical-flow-based object detection method proposed by Zhu et al. and proposes three improvements to it:

1. Fuse temporal context information. When the features of the current frame are extracted, the features of the preceding and following frames are fused into them, so that the current frame's features carry temporal context.
2. Apply a continuous non-maximum suppression algorithm to filter the candidate regions extracted by the RPN, so that the relationship between candidate regions in the preceding and following frames is used to improve the quality of candidate selection.
3. Optimize the structure of the feature extraction network so that the network has a normalizing effect on its input data.

Finally, experimental results for the improved algorithm are given. Results on the ImageNet VID dataset show that the improved method scores higher on common evaluation criteria than both the original optical flow method and traditional image target detection algorithms, and that it achieves results on par with current mainstream methods.
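
The abstract does not describe how the continuous non-maximum suppression variant works, so the sketch below is not the thesis's method. As a point of reference only, it shows the standard single-frame greedy non-maximum suppression that such a variant would presumably build on. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS on one frame; returns kept indices, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this example the second box overlaps the first heavily and has a lower score, so it is suppressed, while the distant third box survives. A video-aware variant would additionally relate candidates across neighboring frames, which this single-frame sketch does not attempt.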
Keywords/Search Tags: Target detection, deep learning, feature fusion, continuous non-maximum suppression, network optimization