
Research on Cross-Modal Person Re-Identification Methods Based on Multi-Granularity Joint Learning

Posted on: 2024-08-26    Degree: Master    Type: Thesis
Country: China    Candidate: T Yan    Full Text: PDF
GTID: 2568307115478764    Subject: Electronic information
Abstract/Summary:
With the growing prevalence of surveillance cameras in recent years, society-wide monitoring systems have been steadily improving, and the demand for intelligent monitoring has risen accordingly. Person re-identification technology has significant application value and prospects for realizing intelligent surveillance. Person re-identification aims to match pedestrians of the same identity across camera views. Under insufficient illumination, the infrared images captured by surveillance cameras differ greatly from visible-light images captured under adequate lighting; this modality gap makes it difficult to match pedestrians of the same identity and increases the difficulty of pedestrian tracking. Cross-modal person re-identification aims to match pedestrian identities between visible-light and infrared images. At present it faces three main difficulties: first, images captured under the two modalities differ greatly; second, cross-modal datasets are scarce and small; third, the intra-modality variation familiar from conventional person re-identification remains. This thesis focuses on person re-identification based on deep features of cross-modal images and proposes improved cross-modal person re-identification methods based on multi-granularity joint learning. The main research contents and contributions are as follows:

(1) To address the weak pedestrian diversity of cross-modal datasets, in which inconspicuous or uncommon pedestrian features are easily overlooked, an improved multi-scale cross-modal person re-identification method based on subspace-shared features is proposed. The method combines global and local features to learn representations of the two modalities at different granularities, extracting multi-scale, multi-level features from the backbone network. The coarse-grained information carried by global features and the fine-grained information carried by local features complement each other to form a more discriminative feature descriptor. In addition, to enable the network to extract more effective shared features, a subspace-shared feature module is proposed for the two modality-specific embedding branches, replacing the traditional modality-specific feature-weight embedding. The module is placed early in the backbone so that the features of the two modalities are mapped into the same subspace and richer shared weights are generated through the backbone. Experimental results on two public datasets demonstrate the effectiveness of the method: on the SYSU-MM01 dataset in all-search single-shot mode, the mean average precision (mAP) reaches 60.62%.

(2) To address the large difference between visible-light and infrared images, an improved cross-modal person re-identification method based on channel exchange and a multi-loss hybrid learning strategy is proposed. First, a hybrid learning strategy using cross-entropy loss and a weighted-square triplet loss as the identity (ID) loss handles pedestrian identity classification both within and across modalities, while supervising the network to extract more effective modality-shared features and form specific feature descriptors. In addition, exploiting the attributes of cross-modal pedestrian images, a channel-exchange method is used to improve the model's robustness to color changes. Experimental results on the public SYSU-MM01 dataset demonstrate the effectiveness of the method, with mAP reaching 58.68% in all-search single-shot mode.

(3) To address the complexity of real cross-modal scenes, which are prone to occlusion, and the question of how to extract more effective shared pedestrian features, an improved cross-modal person re-identification method based on a heterogeneous-center triplet loss is proposed. First, a rectangular region is randomly selected from the channel-exchanged image and randomly erased; this generates augmented images with different occlusion levels and enriches the training samples. Then, to enlarge the inter-class distance and improve intra-class similarity, a heterogeneous-center triplet loss is proposed to supervise and constrain network learning. By replacing comparisons between an anchor sample and all other samples with Euclidean distances between the anchor's class center and all other class centers, the computational cost is greatly reduced. Experimental results on the public SYSU-MM01 dataset demonstrate the effectiveness of the method, with mAP reaching 64.85% in all-search single-shot mode.
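The thesis itself does not include code; as a rough illustrative sketch of the multi-granularity idea in (1) — a global descriptor combined with horizontal local strips — the following assumes a backbone feature map laid out as plain nested lists (channel × height × width). The function name and layout are illustrative assumptions, not the thesis's actual implementation.

```python
def multi_granularity_descriptor(fmap, num_strips=4):
    """Concatenate one channel-wise global average (coarse-grained) with
    per-strip averages over horizontal bands (fine-grained).
    `fmap` is a nested list of shape [channels][height][width]."""
    h = len(fmap[0])

    def avg(plane, r0, r1):
        # Average all cells of one channel plane between rows r0 and r1.
        cells = [v for row in plane[r0:r1] for v in row]
        return sum(cells) / len(cells)

    # Coarse-grained: one average per channel over the whole map.
    global_feat = [avg(plane, 0, h) for plane in fmap]
    # Fine-grained: one average per channel per horizontal strip.
    step = h // num_strips
    local_feat = [avg(plane, s * step, (s + 1) * step)
                  for s in range(num_strips) for plane in fmap]
    return global_feat + local_feat
```

In a real network the pooling would operate on backbone tensors and each strip would typically feed its own embedding layer; this sketch only shows how coarse and fine granularities are joined into one descriptor.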
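The channel-exchange augmentation in (2) can be sketched as follows, under the assumption (not stated in the abstract) that one randomly chosen color channel of a visible-light image is copied into all three channels, so the network cannot rely on color cues that have no counterpart in the infrared modality. The image is represented here as plain nested lists of channel planes.

```python
import random

def random_channel_exchange(img):
    """Given a visible-light image as [R, G, B] channel planes, pick one
    channel at random and copy it into all three channel slots, removing
    color information while preserving structure and texture."""
    c = random.choice([0, 1, 2])
    # Deep-copy the chosen plane into each output channel.
    return [[row[:] for row in img[c]] for _ in range(3)]
```

Applied stochastically during training (e.g. with some probability per visible image), this forces the shared features to be color-invariant, which is what improves robustness against the visible/infrared color gap.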
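The heterogeneous-center triplet loss in (3) replaces sample-to-sample comparisons with center-to-center ones. A minimal sketch, assuming per-identity centers are the element-wise means of that identity's features in each modality and the hinge uses the hardest (nearest) negative center — details the abstract does not fully specify:

```python
import math

def _center(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def _euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hetero_center_triplet_loss(feats_vis, feats_ir, labels, margin=0.3):
    """Triplet loss between per-identity feature centers: the positive term
    pulls together the visible and infrared centers of the same identity;
    the negative term pushes the visible center away from the nearest
    center of any other identity (either modality)."""
    ids = sorted(set(labels))
    cv = {i: _center([f for f, l in zip(feats_vis, labels) if l == i]) for i in ids}
    ci = {i: _center([f for f, l in zip(feats_ir, labels) if l == i]) for i in ids}
    loss = 0.0
    for i in ids:
        pos = _euclidean(cv[i], ci[i])            # same identity, cross-modal centers
        negs = [_euclidean(cv[i], c)              # centers of all other identities
                for j in ids if j != i
                for c in (cv[j], ci[j])]
        loss += max(0.0, pos - min(negs) + margin)  # hinge on hardest negative
    return loss / len(ids)
```

Because the loss compares one center per identity rather than every anchor against every sample, the number of distance computations per batch drops from quadratic in the number of samples to quadratic in the (much smaller) number of identities, which is the computational saving the abstract describes.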
Keywords/Search Tags: person re-identification, cross-modal, multi-scale, global and local features, joint learning strategy