| Detecting concealed objects carried by people at airports, stations, and other crowded public places is an important means of ensuring the safety of people’s travel. Millimeter waves lie between microwaves and infrared in the electromagnetic spectrum; they can penetrate clothing to reveal concealed objects and, being non-ionizing, are harmless to the human body. Concealed object detection in millimeter-wave human images has therefore become an active research topic. In recent years, deep learning has achieved breakthrough results on computer vision tasks for natural images, such as image classification, image segmentation, and object detection. Driven by these achievements, deep learning-based concealed object detection methods for millimeter-wave human images have been explored. However, most of these methods simply apply existing object detection methods designed for natural images to learn semantic representations of the object and the background in millimeter-wave images, and they detect only a few common large concealed objects. Compared with natural images, which have high resolution, rich texture, strong contrast, and large objects, millimeter-wave human images have low resolution, strong background noise, low imaging quality, low contrast, and dim, small objects. Directly applying object detection methods designed for natural images therefore does not achieve good detection performance. This dissertation studies dim-small object detection and strong background noise suppression in concealed object detection for millimeter-wave human images. Building on existing object detection approaches, four methods are proposed to improve the performance of concealed object detection in millimeter-wave images. The main contributions and innovations are as follows: (1) For the detection of dim-small objects and concealed objects with missing information, a self-paced feature attention fusion network is proposed. This 
detection network integrates multi-scale features containing details and semantics in a top-down manner to reduce the information loss of small objects. Simultaneously, a hierarchical pyramid attention composed of channel attention and spatial attention is used to fuse the multi-scale features in a top-down manner. The attention mechanism focuses on extracting the object representation and highlighting the object. In addition, boosting self-paced learning is proposed to learn samples from easy to difficult. After learning all samples, it focuses on hard samples to improve the network’s ability to distinguish dim-small objects. To verify the effectiveness of the proposed method, experiments were conducted on two real-world millimeter-wave human concealed object detection datasets, AMMW-Hi SC and PMMW. The experimental results show that the proposed method enhances dim-small object features. (2) To address the problem of strong noise interference in millimeter-wave human images, a structural context-based concealed object detection network is proposed, which uses nonlocal information extracted from structural regions to learn localizable semantic features while suppressing the interference of background noise. The method uses local and nonlocal spatial relations to mine the difference between objects and noise and thereby learn localizable features of objects. The proposed detection method consists of two sub-networks: a multi-scale weakly supervised feature refinement network and a local context-based concealed object detection network. First, the multi-scale weakly supervised feature refinement network is constructed to perceive the localizable features of objects of different sizes and to suppress background noise using contexts in structural regions. Specifically, a multi-scale pooling module is introduced to capture the localizable features of objects of different sizes, and an object-activated region enhancement module is used to enhance the object semantic representation of the multi-scale 
pooling features and suppress background interference. Second, an adaptive local context aggregation module is used in the concealed object detection network to integrate the local context around the object and improve the model’s ability to distinguish dim-small objects. Experimental results on two real-world human concealed object detection datasets, AMMW-Hi SC and PMMW, show that the proposed method suppresses background noise and reduces false alarms. (3) To address the small sample size of millimeter-wave human datasets and the low distinguishability of their samples, a collaborative knowledge injection-based discriminative feature learning method is presented, which uses knowledge representations extracted from an external millimeter-wave object database to guide the discriminative representation learning of concealed objects in the human body images to be detected. The object features are captured by learning the semantic relationship between the prior object knowledge and the object to be detected. Specifically, an object image database is first built using the Cut-Scale-Paste method. Then, a dual-branch network with unshared weights is used to learn the object knowledge representation from the constructed object database and the representation of the object to be detected, respectively. Afterwards, a supervised correlation learning module is proposed to inject the knowledge into the detection network to guide the representation learning of the object to be detected, capturing discriminative representations of the object and the background. Experimental results on the AMMW-Hi SC and THz human datasets show that introducing external object knowledge improves concealed object detection performance in both single-class and multi-class settings. (4) To alleviate the performance degradation caused by the large differences in gray value and appearance in millimeter-wave images, a trusted depth knowledge distillation network is proposed, which uses the 
depth image to guide the feature learning and optimization of intensity images to obtain better detection results. A high-quality depth image can provide object shape, position, and other information. However, the depth image acquired by a millimeter-wave imager is of poor quality, with missing depth values. Therefore, a conditional diffusion model is introduced to complete the depth image with missing information, using the intensity image as a conditional constraint. Then, the trusted depth knowledge is distilled into the intensity image based on the classification confidence to improve object detection performance. The experimental results on the multi-modal millimeter-wave human dataset show that the trusted depth knowledge distillation method based on classification confidence filters out depth information with side effects and incorporates the trusted depth information into the detection network. |
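
The abstract does not specify how classification confidence gates the depth knowledge in contribution (4). The following is only a minimal sketch of one plausible reading, not the dissertation's method: the depth branch's softmax probability for the true class serves as a trust weight, and samples below a threshold contribute nothing to a feature-matching distillation loss. The function names (`trust_weights`, `gated_distillation_loss`), the threshold value, and the mean-squared-error feature loss are all assumptions introduced for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def trust_weights(depth_logits, labels, threshold=0.6):
    """Per-sample trust weight for the depth branch (hypothetical rule).

    The weight is the depth branch's probability for the true class,
    or 0.0 when that probability falls below `threshold`, so that
    unreliable depth information is filtered out.
    """
    weights = []
    for logits, label in zip(depth_logits, labels):
        confidence = softmax(logits)[label]
        weights.append(confidence if confidence >= threshold else 0.0)
    return weights


def gated_distillation_loss(intensity_feats, depth_feats, weights):
    """Trust-weighted mean squared error between the two branches' features.

    Assumption: distillation is done by matching intensity-branch
    features to depth-branch features; the abstract does not say which
    quantities are actually matched.
    """
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # no trusted depth sample in this batch
    loss = 0.0
    for student, teacher, w in zip(intensity_feats, depth_feats, weights):
        mse = sum((a - b) ** 2 for a, b in zip(student, teacher)) / len(student)
        loss += w * mse
    return loss / total_weight


# Toy batch: the first depth prediction is confident, the second is not.
depth_logits = [[4.0, 0.0], [0.0, 0.0]]
labels = [0, 0]
weights = trust_weights(depth_logits, labels)
loss = gated_distillation_loss(
    [[1.0, 2.0], [0.0, 0.0]],    # intensity-branch features
    [[1.0, 4.0], [10.0, 10.0]],  # depth-branch features
    weights,
)
```

In this toy batch the second sample has a depth confidence of 0.5, below the assumed threshold, so its large feature mismatch is ignored and only the first sample shapes the loss.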