Research On Visual Object Detection And Tracking Technologies Based On Deep Learning

Posted on:2021-04-19

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Q Feng

Full Text:PDF

GTID:1488306548491764

Subject:Information and Communication Engineering

Abstract/Summary:

As a core component of many intelligent platforms, computer vision systems will play an extremely important role in Industry 4.0, smart cities, autonomous driving, and future unmanned combat. Visual object detection and tracking, as fundamental computer vision tasks, are prerequisites for many advanced vision applications. In recent years, with improvements in computing power and advances in big data and deep learning technology, visual object detection and tracking have made great progress. However, owing to many factors such as noise, existing visual object detection and tracking technology cannot yet meet the requirements of practical applications. Therefore, this dissertation studies visual object detection and tracking based on the theory and methods of deep learning. The main research work is summarized as follows.

(1) Conventional visual object detection models based on deep neural networks perform well in ordinary detection scenarios, but they often suffer from low recall and poor localization accuracy when detecting multi-scale objects. Traditional multi-scale processing methods can improve detection accuracy, but they usually introduce excessive computational redundancy, resulting in low detection efficiency. This dissertation proposes a multi-scale object detection method based on Faster-RCNN that effectively improves the proposal recall rate and the detection accuracy for small and multi-scale objects. First, through qualitative and quantitative analysis, the difficulties of multi-scale object detection are identified, and the impact of the pooling operation and of the objects' multi-scale characteristics on the region proposal stage of the two-stage detection algorithm is analyzed. This analysis leads to a modified region proposal scheme that improves the recall rate for multi-scale objects. Secondly, the strategy for generating training samples for multi-scale objects is improved, which reduces the number of invalid foreground samples during training and makes the training of the region proposal network more efficient. Thirdly, multi-level feature fusion is used to enhance the features in the proposal regions, which effectively improves the recognition and localization accuracy for small objects. Finally, the effectiveness of the improved method is verified on a real data set.

(2) The deep network-based tracking model CREST shows significant advantages in tracking accuracy and robustness, but it suffers from an overly long training period, low online update efficiency, and high hardware requirements. These disadvantages limit the application scenarios of the model. For these reasons, this dissertation improves the CREST tracking model in two respects. On the one hand, the computational efficiency of the CREST model is improved. The low efficiency of training and online updating is traced to the excessively large convolution kernel of the base layer, which causes excessive computation and seriously slows the forward and backward passes of the model. To address this issue, the base layer is redefined as an explicit correlation filter and described with frequency-domain parameters. During the initial training and online updating stages, the correlation filter algorithm is used to pre-train the base layer so that it efficiently learns the principal information about the object's appearance, and the back-propagation algorithm is then used to tune the entire model. Both the forward and backward computations in the improved base layer can thus be accelerated in the frequency domain, which improves the tracking speed, training speed, and online update efficiency. Extensive experiments on the evaluation data set verify the performance of the improved model. On the other hand, considering the large number of model parameters and the high demand on computing resources, a variety of lightweight techniques are applied to the CREST tracking model. This improvement greatly reduces the number of model parameters and the hardware requirements for running the model.

(3) In visual object tracking, and especially in long-term tracking, it is important to evaluate the confidence of the tracking results so that the current tracking status can be analyzed and different measures can be taken for different statuses. Existing confidence analysis methods are usually based on the heat map of the object position predicted by the tracker. This kind of evaluation has several shortcomings. First, the index value is generated entirely from the tracking result and is therefore strongly subjective. Second, when the object is slowly occluded, the index tends to be artificially high, which easily causes tracking drift. Third, when the appearance of the object changes rapidly, the index value tends to be artificially low and the model may lose the chance to update, thereby losing the object in subsequent frames. Fourth, the index value cannot be used directly as the criterion for the tracking status, because the tracking status of the object does not correspond directly to changes in the index. Using these indices to judge whether the object is occluded or lost therefore remains a complicated pattern recognition problem. This dissertation proposes a weakly supervised segmentation model for the object in the tracking results and, based on this model, an objective confidence evaluation method that effectively overcomes the above problems. Firstly, a method for training the semantic segmentation model with weakly annotated data (image-level labels) is studied, and the resulting model is used to segment the tracking results and obtain the object segmentation mask. Secondly, a model for analyzing the difference between the tracking results and the segmentation results is established, and the confidence of the tracking results is evaluated with this model. Finally, experiments are carried out in various tracking scenarios, and the results illustrate the effectiveness of the evaluation method.

(4) When visual object tracking is applied to practical scenes, tracking failures caused by occlusion or by the object moving out of sight often occur because of the complexity of the environment. Short-term tracking models usually assume that the object appears continuously in the field of view, so they have difficulty tracking in these scenarios. Long-term tracking models can overcome these challenges, but long-term visual tracking is still an open and challenging problem. Inspired by the TLD framework and emphasizing the role of confidence evaluation, this dissertation treats confidence assessment as an important part of long-term tracking and proposes a long-term tracking method based on a tracking-evaluation-learning-detection framework. Firstly, a re-detection algorithm based on a full correlation filter is provided. This algorithm makes full use of the outputs of the confidence evaluation model to reduce the amount of computation during re-detection and improve detection efficiency. Secondly, the long-term tracking framework is designed, including the interaction and update strategies for its short-term tracker, confidence evaluation model, and re-detector. The designed strategies effectively reduce the probability of "false object loss" and the number of re-detections. Thirdly, by incorporating the improved short-term tracking model from this dissertation, the efficiency of resetting the tracker after the object is re-detected is effectively improved. Finally, the effectiveness of the framework is tested on a long-term tracking evaluation data set. In addition, long-term object tracking software is developed in the VC++ programming language using the MatConvNet deep learning framework.

(5) Based on the preceding research on visual object detection and tracking technology, visual object detection and tracking software is designed. Firstly, the overall design of the software is introduced, including the design overview, requirement analysis, function design, and interface design. Secondly, the detailed design of the core modules of the software is introduced. Finally, the software is implemented in VC++ and MATLAB.
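The idea in (3) and (4) of scoring a tracking result against a segmentation mask, and then letting that score drive the tracking-evaluation-learning-detection loop, can be illustrated with a minimal sketch. The abstract does not describe the dissertation's actual difference analysis model, so this example substitutes a plain intersection-over-union between the tracked box and the bounding box of the mask. The function names, the box format, and the two thresholds are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels; assumed format


def mask_bounding_box(mask: List[List[int]]) -> Box:
    """Tight box around all nonzero mask pixels; a zero-size box if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return (0, 0, 0, 0)
    x0, x1 = min(cols), max(cols)
    y0, y1 = rows[0], rows[-1]
    return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def tracking_confidence(tracked: Box, mask: List[List[int]]) -> float:
    """Stand-in confidence: agreement between the tracker's box and the mask's box."""
    return iou(tracked, mask_bounding_box(mask))


def next_action(confidence: float, update_thr: float = 0.6, lost_thr: float = 0.2) -> str:
    """Map confidence to a loop action; both thresholds are illustrative assumptions."""
    if confidence >= update_thr:
        return "update"    # result is trusted: update the short-term tracker online
    if confidence >= lost_thr:
        return "hold"      # possible occlusion: keep tracking but skip the update
    return "redetect"      # object presumed lost: hand over to the re-detector


if __name__ == "__main__":
    # 6x6 frame whose segmented object occupies rows 1-4 and columns 1-4.
    mask = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
    for box in [(1, 1, 4, 4), (2, 2, 4, 4), (3, 3, 4, 4)]:
        score = tracking_confidence(box, mask)
        print(box, round(score, 3), next_action(score))
```

Because the score comes from a segmentation model rather than from the tracker's own heat map, it does not inherit the tracker's errors, which is the motivation stated in (3). A real system would likely compare the mask and the box at pixel level and tune the thresholds on validation sequences.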

Keywords/Search Tags:

deep learning, computer vision, object detection, object tracking, tracking confidence

Related items

1	A Study Of Object Tracking In Complex Scenes Based On Computer Vision
2	Research On Moving Object Tracking Algorithm Based On TLD Framework
3	Study On Object Tracking By Detection
4	Research On Vision Object Tracking Based On Deep Learning
5	Research On Single Object Tracking Based On Deep Learning
6	Tracking Based On Detection With Deep Learning And Kernelized Correlation Filters
7	Research On Deep Feature-based Visual Object Tracking
8	Research On Pedestrian Multiple Object Tracking Based On Deep Learning
9	Study On Methods And Implementation Of Fast Object Automatic Detection And Tracking
10	Weakly Supervised Object Detection For Specific Scene Images