Person re-identification uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. In recent years, with the continuous development of deep learning and the growing scale of person re-identification datasets, person re-identification under ideal conditions has advanced considerably. Many challenges remain, however, under non-ideal conditions. Cross-modal and cross-domain person re-identification are two examples; together they can be summarized as person re-identification under different imaging conditions and in different scenarios. The imaging conditions and hardware used to capture pedestrians differ between day and night: visible-light cameras are used during the day, whereas near-infrared cameras can capture the human body clearly at night. Images from the two kinds of camera have very different characteristics. Images taken by a visible-light camera must be used to retrieve images taken by a near-infrared camera, and vice versa. This is the cross-modal person re-identification problem. In addition, a pretrained model suffers severe performance degradation on an unseen target domain, so improving the cross-domain generalization of person re-identification models is particularly important. This research direction is called cross-domain person re-identification.

For cross-modal person re-identification, we design an embedding enhancement model based on a dual-path heterogeneous graph. Taking an existing deep learning cross-modal person re-identification model as the baseline, we use the features of the different modalities extracted by the baseline to construct a heterogeneous graph. A neighbor tree search method is designed to remove noise from the heterogeneous graph, and a dual-path aggregation and fusion method is designed to enhance the baseline features. In this way, identity semantic information is strengthened in both the near-infrared and visible-light features, while redundant and interfering information is suppressed. Experiments show that the method effectively improves cross-modal person re-identification performance. On the SYSU-MM01 dataset, mAP increases by 12.9% over the baseline; on the RegDB dataset, mAP increases by 16.3% and 17.0% in the visible-to-thermal and thermal-to-visible modes, respectively.

For cross-domain person re-identification, we collect pedestrian datasets from real surveillance scenarios and construct a person re-identification benchmark that reflects cross-domain generalization ability. We also produce various simulated pedestrian data and explore the value of simulated datasets for improving cross-domain person re-identification. Experiments show that simulated data effectively improves the generalization ability of person re-identification under cross-domain conditions. A multi-task collaborative training method using both real and simulated data improves mAP by 2.2% in cross-domain scenarios.
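
The mAP figures reported above are obtained by ranking gallery images by their similarity to each query image. The following is a minimal sketch of such an evaluation, not the protocol used in this work: it assumes cosine similarity on already-extracted feature vectors and omits dataset-specific rules such as camera filtering. The function names and the toy data are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches is a boolean array over the gallery, in ranked order,
    that is True where the gallery identity equals the query identity.
    Returns None when the gallery holds no correct match.
    """
    hits = np.flatnonzero(ranked_matches)  # 0-based ranks of correct matches
    if hits.size == 0:
        return None
    # Precision at the rank of each correct match: (k-th hit) / (its 1-based rank).
    precisions = np.arange(1, hits.size + 1) / (hits + 1)
    return float(precisions.mean())


def mean_average_precision(query_feats, query_ids, gallery_feats, gallery_ids):
    """mAP of retrieving gallery items (one modality) with queries (another)."""
    sims = l2_normalize(query_feats) @ l2_normalize(gallery_feats).T
    order = np.argsort(-sims, axis=1)  # most similar gallery item first
    aps = []
    for q in range(len(query_ids)):
        matches = gallery_ids[order[q]] == query_ids[q]
        ap = average_precision(matches)
        if ap is not None:  # skip queries with no correct gallery match
            aps.append(ap)
    return float(np.mean(aps))


# Toy example (assumed data): two "visible" queries, three "infrared" gallery items.
query_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
query_ids = np.array([0, 1])
gallery_feats = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]])
gallery_ids = np.array([1, 0, 0])

toy_map = mean_average_precision(query_feats, query_ids, gallery_feats, gallery_ids)
print(f"mAP = {toy_map:.4f}")  # mAP = 0.4583
```

Swapping the roles of the query and gallery arrays gives the reverse retrieval direction, which is how the visible-to-thermal and thermal-to-visible settings differ in this sketch.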