Multi-modal Person Re-Identification Based On Fine-grained Feature Fusion

Posted on: 2022-11-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G S Xie
GTID: 1488306746976809
Subject: Computer Science and Technology
Abstract/Summary:
Person re-identification (re-id) aims to match person images in real-time or non-real-time surveillance systems using computer vision technology. In terms of implementation, it is widely regarded as a sub-problem of image classification or retrieval. As a core component of most machine vision solutions, such as cross-camera pedestrian trajectory tracking and pedestrian behavior recognition, deep-learning-based person re-id has shown broad application prospects in community security, criminal investigation, traffic violation correction, public safety, and related areas. In actual deployment, external factors such as occlusion, background clutter, pose variation, and illumination variation keep person re-id a challenging topic in computer vision. Most existing person re-id methods rely on classifying visual image information and do not fully mine or exploit semantic information. In addition, the interaction of multi-modal information is easily overlooked or underestimated. This thesis focuses on several critical issues in person re-id and proposes corresponding algorithms and network frameworks that explore relationship information, color-robust deep learning, and a fine-grained feature fusion model. The main contributions of this thesis are as follows:

1. The thesis proposes a relationship information mining method based on local-global association. Rather than only extracting part-level image details and then retrieving, it infers relationship information by mining the hidden associations between different semantic partitions. To jointly exploit the potential relationship information among multi-granularity features from semantic regions, the global feature, which gives a holistic view, is stacked with the part-level features, which carry physiological representation. Pair-wise association features across granularities are constructed manually so that relationship features can be mined further from the pairs. The method drives the model with local image context information and "local-global" features, infers the correlation between features of different granularity, and further uncovers the deep-seated relationship information that matters for classification. It overcomes shortcomings of traditional re-id methods based on feature building. Supplemented by max-pooling to enhance the detail of the representation, the deep learning model can, to a certain extent, alleviate the misalignment caused by human pose variation and other factors.

2. To address the over-reliance of deep-learning-based re-id models on color representation, the thesis proposes a color-robust feature fusion network. It strengthens the ability of the multi-branch network framework to learn color-robust features based on relationship information perception. First, a channel-controlled non-local attention network framework is introduced. Paired training data, composed of original image samples and copies processed through channel adjustment, are constructed as input. Color disturbance is thus added artificially to the input to improve the learning of color-robust features in a supervised mode. The method can also be regarded as a form of training data augmentation that expands the data through random adjustment of color channels or channel transfer. It can partially simulate the color differences that pedestrians exhibit under changes in illumination and viewpoint across cameras, which increases the robustness of the final representations to color characteristics and improves the generalization ability of the deep-learning-based person re-id model.

3. For occluded person re-id, the thesis proposes a pose-guided feature region-based fusion network (PFRFN) that obtains a better image representation through multi-granularity feature fusion. By fusing features that carry pose estimation information with multi-granularity local features from the original image, the model focuses more on recovering image regions that carry deep-seated human structure information. By introducing an external supervision signal, PFRFN jointly considers the explicit association of the two network branches in the shared image region, and further performs relationship information reasoning and feature learning on human body image regions. Experiments show that the method achieves competitive performance on both occluded and non-occluded person re-id datasets.

4. Considering the difficulty of training attention-based person re-id models and their lack of structural constraints, the thesis proposes a pose-guided semi-supervised learning framework. Highly responsive human body regions are labeled by features that carry pose estimation information, and the collaborative learning and expression of deep semantic information and attention features in a unified deep learning model are realized through a teacher-student dual-network supervision mode. Through attention clustering driven by supervised information, the learned model can infer salient image representation patterns from the existing semantic features, which improves the attention-based model's ability to learn discriminative representations.
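
The paired-input idea in the second contribution (an original sample together with a channel-adjusted copy that keeps the same identity label) can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the 50/50 choice between channel permutation and per-channel scaling, and the gain range of 0.5 to 1.5 are all assumptions made here for illustration, and the image is modeled as a plain nested list of RGB tuples so that the example has no dependencies.

```python
import random


def permute_channels(image, order):
    """Reorder the color channels of every pixel.

    `image` is a list of rows, each row a list of (R, G, B) tuples;
    `order` is a permutation of (0, 1, 2).
    """
    return [[tuple(pixel[c] for c in order) for pixel in row] for row in image]


def scale_channels(image, gains):
    """Multiply each channel by its gain and clip to the 8-bit range [0, 255]."""
    return [
        [
            tuple(min(255, max(0, round(v * g))) for v, g in zip(pixel, gains))
            for pixel in row
        ]
        for row in image
    ]


def make_training_pair(image, label, rng):
    """Return (original, perturbed, label).

    The perturbed copy keeps the identity label, so a model trained on
    both views is pushed to rely less on raw color. The perturbation
    choices below are hypothetical, not taken from the thesis.
    """
    if rng.random() < 0.5:
        order = [0, 1, 2]
        rng.shuffle(order)  # random channel permutation
        perturbed = permute_channels(image, order)
    else:
        gains = [rng.uniform(0.5, 1.5) for _ in range(3)]  # assumed gain range
        perturbed = scale_channels(image, gains)
    return image, perturbed, label
```

A call such as `make_training_pair(img, person_id, random.Random(0))` yields one original/perturbed pair; in a real pipeline the same step would typically run per sample inside the data loader.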
Keywords/Search Tags:Person re-identification, Fine granularity, Multi-modal, Feature fusion, Color robust