Font Size: a A A

Video Object Matching Across Multiple Non-Overlapping Camera Views Based On Deep Learning And Their System Implementation

Posted on:2016-05-14Degree:MasterType:Thesis
Country:ChinaCandidate:X S HeFull Text:PDF
GTID:2308330482467308Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Matching objects across multiple cameras with non-overlapping views is a necessary but difficult task in the wide area video surveillance. The widely used matching method is mainly based on appearance feature of the object. The effectiveness of the feature directly affects the accuracy of the matching results. The success of CNNs in feature extracting is attributed to their ability to learn rich midlevel image representations as opposed to hand-designed low-level features used in other image classification methods. The learning of CNNs requires to estimate millions of parameters and a very large number of annotated image samples for training, which prevents the applications of CNNs to deal with problems only having limited training data. Moreover, the initialization of the network weights can be capricious and may not represent the characteristics of the training samples which will impact the network structure to some extent. Therefore, this paper proposes two novel frameworks of deep neural networks. In the first framework, we efficiently transfer the model learned with CNNs on large-scale annotated datasets to other visual recognition tasks with limited amount of training data. We design a method to reuse layers pre-trained on the ImageNet dataset to compute mid-level image representation for recognition images in the CAVIAR dataset using the Caffe framework. The experiments show that although the two datasets are totally different in image statistics and tasks, the transferred model significantly improves the performance of object matching. Second, we propose a new deep network structure named LPP-DL. Instead of randomly initializing convolution kernels for extracting feature mapping, we use local projection algorithm for extracting the filters from training samples. The experimental results on standard dataset and our own pedestrian ZJGSU01 dataset demonstrate that the advantages of the proposed model in terms of computational efficiency, computation storage, and matching accuracy over that of other state-of-the-art classification-based matching approaches...
Keywords/Search Tags:video object matching, multi-camera views monitoring system, deep learning, deep convolutional neural networks, transfer learning
PDF Full Text Request
Related items