
Saliency Learning and Person Re-Identification

Posted on: 2016-03-10  Degree: Ph.D  Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)  Candidate: Zhao, Rui  Full Text: PDF
GTID: 2478390017481269  Subject: Electrical engineering
Abstract/Summary:
Saliency estimation, which aims to highlight visually salient regions, has become a useful tool for various vision tasks such as context-aware image resizing and object detection and classification. Saliency information is also present, though in a different form, in person re-identification, the task of matching pedestrians across non-overlapping camera views based mainly on their appearance. In this thesis, we exploit saliency information for matching pedestrian images and focus mainly on the person re-identification problem. We first give an overview of existing person re-identification systems and their evaluation on benchmark datasets. We then explore saliency detection for general images with a multi-context deep learning algorithm, and further design human saliency measures specifically for person re-identification. We propose unsupervised learning algorithms to estimate human saliency and apply this information to matching and identifying pedestrians.

In saliency detection for general images, low-level saliency cues or priors do not produce sufficiently good results, especially when the salient object appears against a low-contrast background with confusing visual appearance. This poses a serious problem for conventional approaches. We tackle it by proposing a multi-context deep learning framework for salient object detection. Deep convolutional neural networks are employed to model contrast between objects in images. Global context and local context are both taken into account and are jointly modeled in a unified multi-context framework (a schematic sketch follows the abstract). To provide a better initialization for training the deep networks, we investigate different pre-training strategies and design a task-guided pre-training scheme suited to multi-context saliency modeling. Furthermore, generic deep models recently proposed in the ImageNet image classification challenge are tested, and their effectiveness for saliency detection is investigated. Our approach is extensively evaluated on five public datasets, and experimental results show significant and consistent improvements over state-of-the-art methods.

In person re-identification, a person observed in different camera views often undergoes significant variations in viewpoint, pose, appearance, and illumination, which usually makes intra-personal variation even larger than inter-personal variation. Background clutter and occlusion add further difficulty. Human eyes can recognize identities from small salient regions; that is, human saliency is distinctive and reliable for matching pedestrians across disjoint camera views. However, this valuable information is often lost when existing approaches compute similarities between pedestrian images. Inspired by the results of our user study on human perception of saliency, we propose a novel perspective on person re-identification based on learning human saliency and matching saliency distributions. The proposed saliency learning and matching framework consists of four steps: (1) to handle misalignment caused by drastic viewpoint changes and pose variations, we apply adjacency-constrained patch matching to build dense correspondences between image pairs; (2) we propose two alternative methods, K-Nearest Neighbors and One-Class SVM, to estimate a saliency score for each image patch, so that distinctive features stand out without using identity labels during training (a KNN-based sketch is given after the abstract); (3) saliency matching is built on top of patch matching: matching patches with inconsistent saliency incurs a penalty, and images of the same identity are recognized by minimizing the saliency matching cost (also sketched below); (4) saliency matching is tightly integrated with patch matching in a unified structural RankSVM learning framework.

Building on this view of human saliency, we also design a scheme to discover discriminative patches. Specifically, we propose a novel approach that learns mid-level filters from automatically discovered patch clusters for person re-identification, motivated by our study of what makes a good filter for this task. The mid-level filters are discriminatively learned to identify specific visual patterns and distinguish persons, and they exhibit good cross-view invariance. First, local patches are qualitatively measured and classified according to their discriminative power, and discriminative, representative patches are collected for filter learning. Second, patch clusters with coherent appearance are obtained by pruning hierarchical clustering trees, and a simple but effective cross-view training strategy is proposed to learn filters that are view-invariant and discriminative. Third, filter responses are integrated with patch matching scores in RankSVM training.

The effectiveness of our person re-identification approaches is validated on the VIPeR and CUHK01 datasets. Our saliency learning and matching approach outperforms state-of-the-art person re-identification methods on both datasets. The learned mid-level features are complementary to existing handcrafted low-level features and improve the best Rank-1 matching rate on the VIPeR dataset by 14%.
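As a rough illustration of the multi-context idea, and not the network actually trained in the thesis, the sketch below pairs a global-context stream over the whole image with a local-context stream over a window around a candidate region and fuses the two to score that region's saliency. All layer widths, kernel sizes, and module names are placeholders.

```python
import torch
import torch.nn as nn

class MultiContextSaliency(nn.Module):
    """Sketch of a two-stream multi-context model for salient-object detection.

    One stream sees the whole image (global context), the other a window
    centred on the candidate region (local context); their features are
    fused to predict a per-region saliency score.  Layer sizes here are
    placeholders, not the architecture used in the thesis.
    """
    def __init__(self):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.global_stream = stream()   # whole image, resized
        self.local_stream = stream()    # window around the candidate region
        self.classifier = nn.Sequential(
            nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1),
        )

    def forward(self, global_view, local_view):
        g = self.global_stream(global_view)
        l = self.local_stream(local_view)
        return self.classifier(torch.cat([g, l], dim=1))  # saliency logit
```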
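To make step (2) concrete, here is a minimal sketch of a KNN-style saliency estimator. It assumes each image has been divided into local patches with dense features, and that adjacency-constrained matching has already produced, for every probe patch, a set of candidate patches in each image of an unlabeled reference set. The function name, array shapes, and the 0.5 default for `alpha` are illustrative, not the thesis's exact implementation.

```python
import numpy as np

def knn_saliency_scores(probe_patch_feats, reference_patch_feats, alpha=0.5):
    """Unsupervised KNN saliency sketch.

    probe_patch_feats:     (P, D) features of patches from one probe image
    reference_patch_feats: (N, R, D) candidate patch features, R candidates
                           per reference image, for each of N reference images
    A patch is deemed salient when even its nearest matches in the reference
    set are far away, i.e. few other pedestrians share its appearance.
    """
    N = reference_patch_feats.shape[0]
    k = min(int(alpha * N), N - 1)          # index of the k-th nearest neighbor
    scores = np.empty(len(probe_patch_feats))
    for i, x in enumerate(probe_patch_feats):
        # best match within each reference image (the adjacency-constrained
        # search is abstracted to a plain min over the R candidates here)
        d_best = np.min(np.linalg.norm(reference_patch_feats - x, axis=2), axis=1)
        # saliency = distance to the k-th nearest of those best matches
        scores[i] = np.sort(d_best)[k]
    return scores
```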
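Steps (3) and (4) combine patch distances with a saliency-consistency penalty. The snippet below shows only a hand-set form of that cost for one image pair; the penalty term and the fixed weight `lam` are simplifications of the model that the thesis learns with structural RankSVM.

```python
import numpy as np

def saliency_matching_cost(patch_dists, probe_saliency, gallery_saliency, lam=1.0):
    """Illustrative saliency-matching cost for one probe/gallery image pair.

    patch_dists:      (P,) distances of the matched patch pairs found by
                      adjacency-constrained patch matching
    probe_saliency:   (P,) saliency scores of the probe patches
    gallery_saliency: (P,) saliency scores of their matched gallery patches

    Matching patches whose saliency disagrees is penalized, so the gallery
    image of the same identity, whose salient regions tend to agree with the
    probe, yields the minimum cost.
    """
    penalty = np.abs(probe_saliency - gallery_saliency)
    return float(np.sum(patch_dists + lam * penalty))

# Gallery images are ranked by total cost; the identity with the smallest
# cost is returned as the match.
```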
Keywords/Search Tags: Saliency, Person re-identification, Matching, Multi-context deep learning, Salient