| Since the launch of the Safe City plan in China, people have become increasingly concerned about their personal and property security in public places. Using large numbers of video surveillance cameras to obtain portrait information in public places quickly and accurately is of great significance to urban public order management and criminal investigation. As camera networks grow in scale and number, manually retrieving a specific person from different surveillance videos is increasingly unable to meet the needs of urban security managers and public security criminal investigators because of its inefficiency and high cost. Relying on computer vision technology to retrieve a specific person from the surveillance camera network automatically and accurately has therefore become an urgent need. This paper studies the person detection and person re-identification methods used in cross-camera person retrieval, and designs and implements a cross-camera person retrieval system based on this research. Specifically, this paper makes the following contributions. First, in the study of person detection methods, the characteristics and shortcomings of the mainstream person detection methods used in pedestrian search are analyzed with respect to the multi-scale pedestrian problem in actual surveillance scenarios. A person detection model with a feature space pyramid structure is proposed to capture pedestrians of different scales, and the effectiveness and advantages of the structure are verified by experiments. To address the lack of practical guidance on optimization strategies for person detection and re-identification in actual retrieval scenarios, joint and phased optimization methods are compared based on a unified model and dataset. The combination that performs better in the actual scenario is identified and applied in the subsequent system implementation. Second, in the study of person re-identification methods, based on the characteristics of human recognition of
pedestrians across different cameras, we propose a coarse- and fine-grained neural network model, a multi-branch network structure with two sub-branches, one coarse-grained and one fine-grained. The branches are guided by different levels of human semantic information to extract features of different granularity. In addition, after analyzing the shortcomings of the classification loss function used in current re-identification methods, this paper draws on the idea of knowledge distillation and introduces a knowledge distillation loss function to optimize the training and feature extraction of the network. The validity and superiority of the proposed methods are verified on public benchmark datasets. Finally, based on the results of the two studies above, the person detection and re-identification methods are integrated to design and implement a person retrieval system better suited to practical application scenarios. Given an image of a person of interest captured by one camera, the system lets the user quickly retrieve that person's appearances in the video data from a set of surveillance cameras over a specified time period. Compared with traditional manual retrieval, it saves considerable time and manpower. |
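
The retrieval step described at the end of the abstract can be illustrated with a minimal sketch. It assumes that a detector and a re-identification network have already turned each detected pedestrian into a feature vector, and that candidates are ranked by cosine similarity to the query feature. The names `Detection`, `cosine_similarity`, and `retrieve`, the timestamp unit, and the choice of cosine similarity are all illustrative assumptions, not details taken from the thesis.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected pedestrian crop (hypothetical record layout)."""
    camera_id: str
    timestamp: float        # assumed unit: seconds since the start of recording
    feature: List[float]    # re-identification embedding of the crop


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_feature: List[float],
             gallery: List[Detection],
             start: float,
             end: float,
             top_k: int = 5) -> List[Tuple[float, Detection]]:
    """Rank gallery detections inside [start, end] by similarity to the query."""
    # Keep only detections recorded within the requested time window.
    candidates = [d for d in gallery if start <= d.timestamp <= end]
    # Score every remaining detection against the query feature.
    scored = [(cosine_similarity(query_feature, d.feature), d) for d in candidates]
    # Highest similarity first; the sort is stable, so ties keep gallery order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A caller would pass the feature of the query image together with the time window of interest, for example `retrieve(query_feature, gallery, start=0.0, end=3600.0)`, and read the camera and timestamp of each returned detection. A real system would likely replace the linear scan with an indexed nearest-neighbor search.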