Research Of Object Detection Based On Multi-modal Images

Posted on: 2020-10-30, Degree: Master, Type: Thesis
Country: China, Candidate: F Yang, Full Text: PDF
GTID: 2428330575958419, Subject: Circuits and Systems
Abstract/Summary:
Object detection is one of the most popular topics in computer vision, and its results have important application prospects in fields such as the military, agriculture, medicine, and security. In recent years, as deep learning has been studied in depth in computer vision, object detection has made great progress and detection accuracy has improved steadily. However, applying it in real-world settings still faces great challenges. In areas such as the military and security, conventional single-modal RGB images have severe limitations, which seriously restrict improvements in detection accuracy in these scenarios. In the past few years, more and more researchers have found that introducing multi-modal data helps to obtain high-performance detectors, and research on object detection based on multi-modal data is becoming increasingly popular. However, current research on multi-modal tasks does not discuss the characteristics of multi-modal data itself. This thesis focuses on two problems found in multi-modal data: mismatch between image pairs and missing information between modalities. The main work of this thesis is as follows:
1) The causes of the mismatch problem in multi-modal data are analyzed, showing that this problem arises easily and is difficult to avoid in multi-modal data. Experimental results verify that mismatch is an important factor affecting the fusion stage of multi-modal data.
2) The causes of missing information between modalities in multi-modal data are analyzed. This problem is likewise shown to arise easily and to be difficult to avoid, and its influence on the detection network is discussed.
3) Based on the discussion of the above two problems, the structure of a multi-modal object detection network is designed and a step-wise training method is proposed, which achieves good detection results.
4) An RGB and infrared dual-modal dataset is constructed. The image pairs in this dataset have higher resolution, and the dataset contains more image pairs taken in different kinds of scenes.
Keywords/Search Tags: Object Detection, Multi-modality, Deep Learning, Mismatch, Modal information missing