In surveillance video, factors such as viewing angle, pose changes, and lighting conditions cause large appearance differences for the same target at different moments under one camera or across different cameras. In particular, under the differing lighting conditions of day and night, cameras switch between visible-light and infrared modes, which poses a great challenge for pedestrian re-identification. This paper studies the challenging problem of cross-modal pedestrian re-identification under day and night lighting conditions, and analyzes and designs algorithms from the perspectives of network structure and loss function. Compared with existing methods, the methods designed in this paper achieve leading accuracy on both public datasets in the cross-modal person re-identification field, RegDB [1] and SYSU-MM01 [2]. The main work of this paper includes the following three aspects:

(1) Algorithms based on cosine distance are studied and improved, and a cross-modal person re-identification algorithm based on cosine distance and knowledge distillation loss (LCCRF) is proposed. The LCCRF algorithm overcomes a shortcoming of cosine-distance-based cross-modal pedestrian re-identification algorithms, namely that only the modality gap between visible-light and infrared samples is considered; LCCRF also considers the intra-modality gaps within the visible-light and infrared samples. In addition, the LCCRF algorithm is the first to design a loss function at the feature extraction stage to shorten the distance between the two modalities, reducing the burden on the feature mapping stage. Experiments on the two public datasets SYSU-MM01 and RegDB verify the effectiveness of the LCCRF algorithm.

(2) To address the assumption made by traditional cross-modal person re-identification algorithms that images are already aligned, this paper proposes an alignment-based cross-modal person re-identification algorithm (AlignF) built on the previous algorithm. The AlignF algorithm first introduces a new feature Transformer module to strengthen the model's ability to extract features, helping it capture richer global and local information. It then introduces an alignment module that enables the model to learn more discriminative feature representations, further facilitating person representation learning. Experiments on the two public datasets RegDB and SYSU-MM01 fully verify the robustness and stability of the AlignF algorithm.

(3) To address the fact that most existing algorithms focus only on a single coarse-grained (or fine-grained) feature, this paper proposes a graph-convolution-based cross-modal person re-identification algorithm (GraphF). The model adopts the overall structure of the first work and the feature Transformer module of the second work. On this basis, the GraphF algorithm introduces a new graph convolution module that makes the model attend to both coarse-grained and fine-grained features, so that the extracted features are more discriminative. The robustness and effectiveness of the GraphF algorithm are verified on the two public datasets RegDB and SYSU-MM01.
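The following is a minimal, hedged sketch of an LCCRF-style objective as described in contribution (1): cosine-distance terms that penalize both the cross-modality gap (visible vs. infrared) and the intra-modality gaps, plus a knowledge-distillation term. All function names, the pairing scheme, and the weighting are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of an LCCRF-style loss; names and weights are assumptions.
import torch
import torch.nn.functional as F

def cosine_gap(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Mean (1 - cosine similarity) between matched feature pairs."""
    return (1.0 - F.cosine_similarity(a, b, dim=1)).mean()

def lccrf_style_loss(vis_a, vis_b, ir_a, ir_b, logits_student, logits_teacher,
                     lambda_intra=0.5, lambda_kd=1.0, temperature=4.0):
    # Cross-modality gap: visible and infrared features of the same identity.
    cross = cosine_gap(vis_a, ir_a)

    # Intra-modality gaps: two samples of the same identity within each modality,
    # so the loss does not only look at the visible/infrared discrepancy.
    intra = cosine_gap(vis_a, vis_b) + cosine_gap(ir_a, ir_b)

    # Knowledge-distillation term: match temperature-softened logit distributions.
    kd = F.kl_div(
        F.log_softmax(logits_student / temperature, dim=1),
        F.softmax(logits_teacher / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    return cross + lambda_intra * intra + lambda_kd * kd
```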
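For contribution (2), the sketch below shows one plausible reading of a feature Transformer module combined with part-level alignment: a backbone feature map is split into horizontal part tokens, a small Transformer encoder mixes global and local information, and part features are aligned across modalities with a simple part-wise distance. Module names, sizes, and the alignment measure are assumptions for illustration only.

```python
# Hypothetical AlignF-style feature Transformer and alignment sketch (assumptions throughout).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureTransformer(nn.Module):
    def __init__(self, in_channels=2048, dim=256, num_parts=6, num_layers=2, num_heads=4):
        super().__init__()
        self.num_parts = num_parts
        self.proj = nn.Conv2d(in_channels, dim, kernel_size=1)   # channel reduction
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))          # global token
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feat_map):                                 # feat_map: (B, C, H, W)
        x = self.proj(feat_map)                                  # (B, dim, H, W)
        # Pool the map into horizontal stripes to obtain local part tokens.
        parts = F.adaptive_avg_pool2d(x, (self.num_parts, 1)).flatten(2).transpose(1, 2)
        tokens = torch.cat([self.cls.expand(x.size(0), -1, -1), parts], dim=1)
        tokens = self.encoder(tokens)                            # mix global and local info
        global_feat, part_feats = tokens[:, 0], tokens[:, 1:]
        return global_feat, part_feats

def part_alignment_distance(parts_vis, parts_ir):
    """Average cosine distance between corresponding parts of the two modalities."""
    return (1.0 - F.cosine_similarity(parts_vis, parts_ir, dim=-1)).mean()
```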
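For contribution (3), the sketch below illustrates one way a graph convolution module could combine coarse-grained and fine-grained features: the global feature and the part features form the nodes of a small graph, and a single graph-convolution step lets them exchange information before the final descriptor is built. The adjacency pattern, dimensions, and readout are illustrative assumptions rather than the paper's design.

```python
# Hypothetical GraphF-style coarse/fine graph convolution sketch (assumptions throughout).
import torch
import torch.nn as nn

class CoarseFineGraphConv(nn.Module):
    def __init__(self, dim=256, num_parts=6):
        super().__init__()
        n = num_parts + 1                               # node 0 = global, nodes 1..P = parts
        adj = torch.eye(n)
        adj[0, 1:] = 1.0                                # global node connects to every part
        adj[1:, 0] = 1.0
        for i in range(1, n - 1):                       # neighboring parts connect to each other
            adj[i, i + 1] = adj[i + 1, i] = 1.0
        deg = adj.sum(dim=1)
        self.register_buffer("adj_norm", adj / deg.unsqueeze(1))   # row-normalized adjacency
        self.weight = nn.Linear(dim, dim, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, global_feat, part_feats):          # (B, dim), (B, P, dim)
        nodes = torch.cat([global_feat.unsqueeze(1), part_feats], dim=1)   # (B, P+1, dim)
        nodes = self.relu(self.adj_norm @ self.weight(nodes))              # one GCN step
        # Final descriptor keeps both granularities: refined global + averaged parts.
        return torch.cat([nodes[:, 0], nodes[:, 1:].mean(dim=1)], dim=-1)  # (B, 2*dim)
```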