
Research On Cross-Modality Person Re-Identification For Infrared And Visible

Posted on: 2023-01-04  Degree: Master  Type: Thesis
Country: China  Candidate: Q B He  Full Text: PDF
GTID: 2568306794455224  Subject: Computer technology
Abstract/Summary:
In recent years, with the continuous upgrading and wide deployment of intelligent surveillance systems, person re-identification has attracted the attention of many researchers. Benefiting from the massive deployment of visible-light and infrared cameras, research on cross-modality person re-identification between infrared and visible images has developed rapidly. Cross-modality person re-identification aims to match infrared and visible images of the same identity across different scenes. Since the gap between the infrared modality and the visible modality is large, reducing this cross-modality discrepancy is both the key challenge and the main research direction.

As the field has matured, methods based on modality translation and methods based on metric learning have shown excellent performance in reducing the gap between the two modalities. Most modality-translation methods use Generative Adversarial Networks to generate visible images corresponding to infrared images, or infrared images corresponding to visible images. However, because infrared and visible images are captured at different wavelengths, their features differ greatly, and directly translating images into the opposite modality inevitably loses much key information. To make better use of the feature information of both modalities, this thesis studies how to fuse the image information of the two modalities and thereby reduce the gap between them. The main contributions are as follows:

(1) To bridge the gap between the infrared and visible modalities, this thesis proposes a Dual-modality Feature Fusion Module (DFFM), which generates a fusion modality by extracting the key features of infrared and visible images. The module extracts the spatial-structure information of the infrared image and the color and spatial-structure information of the visible image, computes channel attention for the infrared and visible modalities separately, applies the attention to the channels of each modality, and finally fuses the two to produce a three-channel fusion-modality image. The fusion modality retains the color and spatial-structure information of both modalities, bridging the gap between them at the pixel level.

(2) To help the network learn discriminative modality-invariant features between the infrared and visible modalities, this thesis proposes a Multi-modality Center Aggregation loss (MCA), which replaces the distances between images of the three modalities with the distances between the centers of the infrared, visible, and fusion-modality feature distributions, and dynamically aggregates the feature centers of the three modalities. On the one hand, optimizing the center distances effectively reduces the discrepancy between modalities and encourages the network to learn discriminative features common to the infrared and visible modalities, improving the feature similarity between infrared and visible images. On the other hand, it guides the training of the dual-modality feature fusion module to generate fusion-modality images that are more beneficial to the network.

(3) To reduce intra-class differences and enlarge inter-class differences, this thesis proposes a Multi-modality Triplet loss (MMT). Taking the visible, fusion, and infrared images in turn as the anchor, positive and negative samples are selected from the other two modalities, constructing three triplets; the resulting inter-modality triplet loss optimizes the relationships among the modalities and reduces intra-class cross-modality variation. At the same time, an intra-modality triplet loss, computed separately for the infrared, visible, and fusion modalities, reduces intra-class differences and enlarges inter-class differences within each modality.

In summary, this thesis proposes three methods, DFFM, MCA, and MMT, for cross-modality person re-identification, and verifies the strong performance of the proposed approach and the effectiveness of each module on multiple datasets.
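The dual-modality fusion idea can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation: the simple pooled-sigmoid channel attention, the channel replication of the infrared image, and the weighted-average fusion rule are all assumptions made for the sketch, standing in for the learned attention and fusion in the actual DFFM.

```python
import numpy as np

def channel_attention(img):
    """Squeeze-style channel attention (simplified stand-in): global
    average pool per channel, then a sigmoid giving weights in (0, 1)."""
    pooled = img.mean(axis=(1, 2))            # (C,)
    return 1.0 / (1.0 + np.exp(-pooled))      # sigmoid

def fuse_dual_modality(ir, vis):
    """Sketch of the Dual-modality Feature Fusion Module (DFFM):
    weight each modality's channels by its channel attention, then
    combine them into a single three-channel fusion-modality image."""
    # replicate the single-channel infrared image to three channels
    ir3 = np.repeat(ir, 3, axis=0) if ir.shape[0] == 1 else ir
    w_ir = channel_attention(ir3)[:, None, None]
    w_vis = channel_attention(vis)[:, None, None]
    # attention-weighted average of the two modalities, per channel
    return (w_ir * ir3 + w_vis * vis) / (w_ir + w_vis + 1e-8)

ir = np.random.rand(1, 8, 8).astype(np.float32)   # single-channel infrared
vis = np.random.rand(3, 8, 8).astype(np.float32)  # RGB visible
fused = fuse_dual_modality(ir, vis)
print(fused.shape)  # (3, 8, 8)
```

The fused image keeps a contribution from both modalities in every channel, which is the pixel-level bridging the abstract describes.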
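The two loss terms can likewise be sketched in a few lines. This is a hedged NumPy sketch under assumptions: the plain Euclidean center distance, the batch-mean centers, and the margin value 0.3 are illustrative choices, not values taken from the thesis.

```python
import numpy as np

def mca_loss(f_ir, f_vis, f_fuse):
    """Multi-modality Center Aggregation (MCA) sketch: instead of
    distances between individual images, sum the pairwise distances
    between the infrared, visible, and fusion feature centers,
    pulling the three modality distributions together."""
    centers = [f.mean(axis=0) for f in (f_ir, f_vis, f_fuse)]
    return sum(np.linalg.norm(centers[i] - centers[j])
               for i in range(3) for j in range(i + 1, 3))

def triplet(anchor, positive, negative, margin=0.3):
    """Hinge triplet loss on feature vectors. In the MMT loss, the
    anchor comes from one modality and the positive/negative from the
    other two (inter-modality), or all three samples come from the
    same modality (intra-modality)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)
```

Here `mca_loss` takes per-modality feature batches of shape (N, D); the MMT described in the abstract would call `triplet` once with each of the visible, fusion, and infrared features as anchor, plus once per modality for the intra-modality terms.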
Keywords/Search Tags:deep learning, cross-modality person re-identification, fusion modality, triplet loss