Research On Convolutional Neural Network Based Algorithm For Person Re-identification

Posted on:2019-09-19

Degree:Doctor

Type:Dissertation

Country:China

Candidate:C Shen

GTID:1368330572488003

Subject:Electronic information technology and instrumentation

Abstract/Summary:

Person re-identification (re-ID) is an emerging research topic in computer vision. It aims to accurately match images of a person of interest across multiple disjoint camera views, based on the appearance characteristics of pedestrians. Person re-ID technology can be widely used in video surveillance scenarios such as intelligent security, intelligent transportation, and smart shopping. Because of its scientific and practical significance, it has received increasing attention from the computer vision community. In recent years, many researchers have applied convolutional neural network (CNN)-based deep learning algorithms to the person re-ID problem. Such methods learn more robust and more discriminative feature embeddings through a "feature extraction + loss function optimization" procedure in an end-to-end fashion. In this way, they partially solve problems that traditional methods handle poorly and achieve a large improvement in person re-ID performance. However, person re-ID algorithms built on a general CNN framework still face difficulties, such as being insufficiently sensitive to subtle local regions that carry strongly discriminative features. This dissertation therefore proposes better feature embedding learning algorithms for person re-ID, built on CNNs, from three perspectives: employing multi-level similarity perception constraints, utilizing strong neural activations on high-level convolutional feature maps, and constructing a sampling-based sharp attention mechanism. All three algorithms share the same core theme: "to make CNNs focus more strongly on highly discriminative local detail features". The main contents and contributions of the three research works are as follows.

Firstly, this dissertation presents a novel person re-ID algorithm based on a deep Siamese network architecture and multi-level similarity perception. According to the distinct characteristics of different feature maps, different similarity constraints are applied to both low-level and high-level feature maps during the training stage. Because appropriate similarity comparison mechanisms are introduced at different levels, the proposed approach can adaptively learn discriminative local and global feature representations, the former being more sensitive in localizing part-level prominent patterns relevant to re-identifying people across cameras. The approach has two further benefits. First, a multi-task learning architecture is employed to simultaneously optimize classification and similarity constraints. A multi-task learning framework can impose knowledge sharing while solving multiple correlated tasks, incorporating the merits of both. Second, because the similarity comparison information has been encoded in the learnable parameters of the network, the algorithm does not require the time-consuming procedure of pairwise input at test time. Therefore, compared with traditional Siamese network-based methods, the algorithm is more efficient and can extract image features to build an index in advance, which is essential for large-scale real-world application scenarios. Experimental results on multiple challenging benchmarks show that the method achieves better performance than other state-of-the-art methods at the time.

Secondly, this dissertation proposes a person re-ID algorithm that extracts and utilizes, without supervision, strong neural activations on the feature maps of the highest-level convolutional layer. Careful observation and experimental verification show that the strong neural activation regions extracted by the algorithm can represent subtle local features with abstract semantic information, and that the extraction requires no additional supervision. Furthermore, a deep feature embedding model is proposed that simultaneously encodes the original global information and the discriminative local features. This feature embedding can effectively enlarge the gap between the inter-class variance and the intra-class variance, thus significantly improving retrieval performance. The method is suitable not only for person re-ID but also for a wider range of fine-grained retrieval problems. Experimental results demonstrate that the proposed method is superior to other state-of-the-art methods at the time in both fine-grained retrieval tasks and person re-ID tasks.

Finally, this dissertation presents a person re-ID algorithm based on a sharp attention mechanism. The sharp attention mechanism obtains attention masks by adaptively sampling feature maps from CNNs. Because it uses sampling-based attention models, the proposed approach can adaptively generate sharper attention-aware feature masks. This differs greatly from gating-based attention mechanisms, which rely on soft gating functions to select the relevant features for person re-ID. Soft attention networks usually use the sigmoid function to squash the mask values into [0, 1], and the resulting soft attention masks have large semantic uncertainty. In contrast, the proposed sampling-based attention mechanism effectively trims irrelevant features by forcing the resulting feature masks to focus on the most discriminative features (i.e., each attention mask value is close to either 0 or 1). It can produce sharper attention that is more assertive in localizing subtle features relevant to re-identifying people across cameras, without attention ambiguity. For this purpose, a differentiable Gumbel-Softmax sampler is employed to approximate Bernoulli sampling, so that the sharp attention networks can be trained end-to-end through backpropagation. Extensive experimental evaluations demonstrate the superiority of this new sharp attention model for person re-ID over the baseline and other related methods on several challenging large-scale person re-ID datasets.
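
The contrast between a soft sigmoid gate and a sampled, near-binary mask can be illustrated with a short sketch. It is not the dissertation's implementation: the function names, the temperature value, and the toy logits are all assumptions made for illustration. It uses the two-class case of the Gumbel-Softmax relaxation, in which the difference of two Gumbel noise samples is a single logistic noise sample, so a relaxed Bernoulli draw is sigmoid((logit + logistic noise) / temperature).

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_gate(logits):
    # Soft attention: every mask value is a plain sigmoid of its logit.
    return [sigmoid(l) for l in logits]


def relaxed_bernoulli_mask(logits, temperature, rng):
    # Two-class Gumbel-Softmax (relaxed Bernoulli) sample per logit.
    # `temperature` is a hypothetical hyperparameter; lower values push
    # the mask values closer to 0 or 1.
    mask = []
    for l in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise = math.log(u) - math.log(1.0 - u)  # logistic noise
        mask.append(sigmoid((l + noise) / temperature))
    return mask


def sharpness(mask):
    # Mean distance from 0.5, rescaled so that 1.0 means fully binary.
    return 2.0 * sum(abs(m - 0.5) for m in mask) / len(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [-1.0 + 2.0 * i / 199 for i in range(200)]  # toy attention logits
    print("soft gate sharpness:", sharpness(soft_gate(logits)))
    print("sampled mask sharpness:",
          sharpness(relaxed_bernoulli_mask(logits, 0.1, rng)))
```

Because the sampled mask is a smooth function of the logits for any fixed noise draw, gradients can flow through it during training, which is the property that makes end-to-end backpropagation possible with this kind of sampler.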

Keywords/Search Tags:

person re-identification, fine-grained retrieval, convolutional neural network, multi-level similarity perception, strong neural activations, adaptive sampling, sharp attention

Related items

1	Analysis And Research Of Key Technologies For Fine-grained Image Recognition Based On Convolutional Neural Networks
2	Research On Depth Visual Attention Method For Multi Class Target Fine-Grained Recognition
3	Research On Video Person Re-identification Algorithm Based On Convolutional Neural Network
4	Research On Video Person Re-identification Based On Two-stream Multi-level Attentive Promotion
5	Research On Multi-attention Mechanism Fusion For Fine-grained Image Classification
6	Research On Fine-grained Image Classification Based On Deep Convolutional Neural Network
7	Research On Pedestrian Fine-grained Recognition And Re-identification Technology
8	The Research Of Fine-grained Sentiment Analysis Of User Reviews Based On Neural Network
9	Research On Fine-grained Image Classification Based On Deep Convolutional Neural Network And Dual-domain Attention Mechanism
10	Fine-grained Image Classification Based On Convolutional Neural Network