Target Detection And Tracking Based On Multi-granularity Data Representation

Posted on: 2020-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Zhou
GTID: 2428330578464135
Subject: Computer Science and Technology
Abstract/Summary:
Target detection and tracking is one of the research hotspots in computer vision, with broad application prospects in intelligent transportation, video surveillance, and military systems. To overcome the many problems that arise in target detection and tracking, this thesis explores target feature representation in depth. From the perspective of multi-granularity and multi-scale features, the differences and correlations between different features are analyzed, and features of different dimensions are integrated, giving detection and tracking tasks a sufficient discriminative basis and improving their accuracy. The specific innovations of this thesis fall into the following three aspects.

Firstly, to address the missing labels and missing localization labels found in many datasets in practical applications, a weakly supervised localization method based on a multi-scale feature convolutional neural network is proposed. The core idea is to exploit the characteristics of the neural network to generate a gradient pyramid model by applying gradient-weighted class activation mapping to multiple convolutional layers. In addition, the feature centroid position is calculated by a mean filtering operation, and connected pixel segments are generated from the confidence intensity map by a threshold clipping module. Weakly supervised localization is then performed around the largest bounding box. Results on a standard benchmark show that the algorithm achieves high-accuracy target localization on datasets that have a large number of categories and multi-scale images.

Secondly, to address the complexity of application scenarios, the difficulty of detecting densely clustered objects, and the redundancy of the detection area, a multi-scale rotating feature pyramid network framework is proposed. A multi-scale pyramid is used to construct high-level semantic feature maps. For rotated and densely clustered targets, a rotation anchoring strategy is designed to predict the minimum circumscribed rectangle of the object, thus reducing the redundant detection area. To preserve accuracy on small targets and difficult targets, the input image is adaptively divided into patches. Experimental results on a standard benchmark show that the algorithm can complete detection on high-definition images that contain a large number of targets.

Finally, a multi-branch learning architecture with stochastic perturbation, based on a convolutional neural network, is proposed. Compared with previous visual tracking systems, the proposed architecture has three distinctive properties. First, it not only uses the multi-branch structure to learn the shared features of the video sequences, but also uses a measure function to enhance its ability to learn the differences between the branches. Second, it adopts a delayed update strategy with stochastic perturbation to avoid being trapped quickly in a local optimum. Third, during real-time tracking, the tracker searches for the sequence branch most similar to the current video, while a perturbation factor randomizes the search to enlarge the solution space. Experiments on standard benchmarks show promising performance, and the architecture shows good adaptability and robustness.

In summary, the work of this thesis rests on three components: target-level gradient back-propagation fusion; convolutional network multi-scale feature fusion with a rotating coordinate structure; and a multi-branch solution structure for target tracking. Target features are thereby explored at multiple granularities, scales, and domains, and the resulting feature representations achieve high accuracy on standard datasets.
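The abstract gives no implementation details, so the following is only a minimal sketch of one step in the weakly supervised localization pipeline: threshold clipping of a confidence intensity map, grouping the surviving pixels into connected segments, and returning the bounding box of the largest segment. It assumes the map is already normalized to [0, 1] and given as a list of rows; the function name `largest_region_box`, the 4-connectivity rule, and the 0.5 threshold are illustrative assumptions, not taken from the thesis.

```python
from collections import deque


def largest_region_box(heatmap, threshold=0.5):
    """Return (row_min, col_min, row_max, col_max) of the largest
    4-connected region with values >= threshold, or None if no pixel passes.

    Hypothetical helper: the thesis does not specify this procedure.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    seen = [[False] * cols for _ in range(rows)]
    best_size, best_box = 0, None

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or heatmap[r][c] < threshold:
                continue
            # Breadth-first flood fill over one connected pixel segment.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                size += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and heatmap[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep the segment with the most pixels.
            if size > best_size:
                best_size, best_box = size, (r0, c0, r1, c1)
    return best_box


# Toy activation map: one isolated hot pixel and one 2x2 hot block.
demo_map = [
    [0.1, 0.2, 0.1, 0.0, 0.9],
    [0.1, 0.8, 0.9, 0.0, 0.0],
    [0.0, 0.7, 0.6, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.0],
]
print(largest_region_box(demo_map))  # (1, 1, 2, 2)
```

In a full pipeline the input map would come from gradient-weighted class activation mapping on one or more convolutional layers; here it is a hand-written toy array so that the clipping and segment selection can be followed in isolation.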
Keywords/Search Tags: Multi-scale, Small target detection, Gradient pyramid, Weakly supervised localization, Rotation region, High-level semantic, Multi-branch, Stochastic perturbation