
Research On Cross-Modality Person Re-Identification Algorithm

Posted on: 2024-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Zou
Full Text: PDF
GTID: 2568307127953849
Subject: Software engineering
Abstract/Summary:
Cross-modality person re-identification is the task of correctly matching pedestrian images captured in the visible-light and infrared modalities by disjoint cameras using computer-vision methods, and is usually regarded as a branch of the image-retrieval problem. Because of the huge discrepancy between pedestrian images of the two modalities, correctly matching images of the same pedestrian across modalities is extremely difficult. In recent years, with the flourishing of cross-modality person re-identification research, methods using the triplet loss function and methods based on multi-branch network structures have shown excellent performance gains. A multi-branch network applies different operations to the extracted features in each branch, so that the features of different branches are complementary to one another. However, most methods based on multi-branch network structures do not use this complementarity effectively. At the same time, the traditional triplet loss based on Euclidean distance has high computational complexity and cannot effectively separate features by angle in the common space. To make better use of the complementarity between different features, this thesis studies the multi-branch network structure and improves the traditional triplet loss. The main contributions are as follows:

(1) To fully exploit the complementarity between local features of different granularities and global features of pedestrian images in the visible and infrared modalities, a Multi-granularity Cross-modality person re-identification Network (CM-MGN) is proposed. The network adopts a dual-stream structure with partial parameter sharing: the first two residual blocks of the backbone share parameters while the last three do not, so the model can extract both modality-specific and modality-shared features of pedestrian images, while also reducing the model's parameter count and shortening training time. The multi-branch design combines the global features with local features of different granularities, making fuller use of pedestrian feature information; this effectively shortens the distance between features of the same pedestrian from different modalities in the common space and improves the network's ability to extract discriminative features.

(2) To reduce the computational complexity of the traditional triplet loss and effectively separate features by angle in the common space, a Heterogeneous-Center Triplet loss with Angular-distance constraints (HCAT) is proposed. The traditional triplet loss cannot effectively constrain the angle between feature vectors, which makes it difficult to separate their directions in the common space; abnormal samples encountered during sampling can destroy other well-learned pairwise distances; and its sampling strategy requires a large number of pairwise distances to be computed. The proposed loss measures the angle between feature vectors so that the model can correctly separate them in the common space, and the heterogeneous-center-based measurement both avoids the abnormal-sample-selection problem of the traditional triplet loss and reduces the model's computational complexity.

(3) To fully enhance the information interaction between the local-feature branch and the global-feature branch under cross-modality conditions, this thesis proposes a module based on 
Local-Global Feature Interaction (LGFI). Most cross-modality person re-identification methods fail to make good use of the relationship between global and local features, and ignore the role of modality-specific information in the feature-fusion stage. The proposed module enhances the representation of local features by embedding global information into them; at the same time, the global feature fully absorbs the information-enhanced local features, strengthening its own expressive ability. The module also gives full play to modality-specific information during feature fusion. Moreover, it requires no additional annotations and generates no third-modality images for assistance, so it consumes fewer resources.

In summary, this thesis proposes three methods, CM-MGN, HCAT, and LGFI, to improve the accuracy of cross-modality person re-identification. Extensive experiments on two commonly used cross-modality person re-identification datasets, RegDB and SYSU-MM01, confirm the competitive performance of the proposed algorithm.
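The partial parameter sharing described in contribution (1) can be sketched abstractly. The abstract only states which blocks are shared (the first two residual blocks) and which are modality-specific (the last three); everything else here, including the stub block and the function names `make_dual_stream` and `forward`, is an illustrative assumption, not the thesis's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def block(dim):
    # Stub for one residual block: a single weight matrix.
    return rng.standard_normal((dim, dim)) * 0.1

def make_dual_stream(dim=8, shared_blocks=2, total_blocks=5):
    """Illustrative CM-MGN-style partial sharing: the first two blocks
    are the SAME objects in both streams (shared parameters), the last
    three are per-modality copies (modality-specific parameters)."""
    shared = [block(dim) for _ in range(shared_blocks)]
    vis_specific = [block(dim) for _ in range(total_blocks - shared_blocks)]
    ir_specific = [block(dim) for _ in range(total_blocks - shared_blocks)]
    vis_stream = shared + vis_specific  # same list entries -> shared weights
    ir_stream = shared + ir_specific
    return vis_stream, ir_stream

def forward(stream, x):
    # Toy residual forward pass: x + x @ W, followed by ReLU.
    for w in stream:
        x = np.maximum(x + x @ w, 0.0)
    return x
```

Because the shared blocks are literally the same arrays, any gradient update to them would affect both streams, which is how sharing reduces the parameter count relative to two fully separate backbones.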
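The idea behind HCAT in contribution (2) — one center per (identity, modality) pair, compared under an angular rather than Euclidean metric — can be sketched as follows. The abstract gives no formula, so the exact form (cosine distance, hardest-negative mining over centers, the `margin` value, and the name `hcat_loss`) is an assumption for illustration only:

```python
import numpy as np

def cos_dist(a, b):
    # Angular (cosine) distance between two feature vectors.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def hcat_loss(vis_feats, ir_feats, labels, margin=0.3):
    """Sketch of a heterogeneous-center triplet loss with an angular
    constraint: compute one center per identity in each modality, pull
    the cross-modality centers of the same identity together, and push
    away the hardest negative-identity center. Comparing centers instead
    of all sample pairs reduces the number of distances computed and
    dampens the effect of abnormal samples."""
    ids = np.unique(labels)
    cv = {p: vis_feats[labels == p].mean(axis=0) for p in ids}
    ci = {p: ir_feats[labels == p].mean(axis=0) for p in ids}
    loss = 0.0
    for p in ids:
        pos = cos_dist(cv[p], ci[p])  # same identity, cross-modality
        neg = min(cos_dist(cv[p], ci[n]) for n in ids if n != p)
        loss += max(0.0, margin + pos - neg)
    return loss / len(ids)
```

With B samples per identity, a sample-level triplet loss compares O(B²) pairs per identity pair, while a center-level loss compares one pair, which is the complexity reduction the abstract refers to.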
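The two-way exchange in the LGFI module of contribution (3) — global information embedded into local features, then the global feature absorbing the enhanced locals — can be expressed minimally. The additive fusion below is a placeholder assumption (the thesis abstract does not specify the fusion operator), and the function name `lgfi` is hypothetical:

```python
import numpy as np

def lgfi(global_feat, local_feats):
    """Minimal sketch of one local-global interaction step:
    1) embed the global feature into each local feature;
    2) let the global feature absorb the enhanced local features."""
    enhanced_locals = [loc + global_feat for loc in local_feats]
    enhanced_global = global_feat + np.mean(enhanced_locals, axis=0)
    return enhanced_global, enhanced_locals
```

In a real module the additions would typically be replaced by learned projections or attention, but the data flow — locals enriched first, global updated second — matches the order described in the abstract.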
Keywords/Search Tags:Heterogeneous Center, Person Re-identification, Angular Distance, Cross Modality, Multi Granularity, Deep learning