| Person search aims to simultaneously locate and identify a queried person in real, uncropped images, which is of great significance for video surveillance and security. Person search differs from person re-identification: re-identification requires a cropped foreground image as input, whereas person search also includes person detection, so the entire surveillance image can be used as input. This makes it closer to practical applications. At present, there are two ways to implement person search: end-to-end methods and separate methods. (1) End-to-end networks are becoming increasingly complex, and most of them add a re-identification module to a two-stage object detection network. Because the bounding boxes predicted by the region proposal network are relatively coarse, they are poorly suited to the re-identification task, which lowers the accuracy of this approach. To address these deficiencies, this paper builds on an anchor-free detection network and first adds an attention mechanism and deformable convolution to the backbone, strengthening its feature extraction ability and improving accuracy. Second, this paper improves the loss function so that it matches the anchor-free network, further improving accuracy. In addition, because the public PRW dataset contains a large number of fuzzy samples, the evaluation metrics are not accurate enough, so this paper ignores fuzzy samples during testing. Compared with the baseline, mAP on the PRW and CUHK-SYSU datasets increases by 2.41 and 0.87 percentage points, respectively, and rank-1 increases by 1.60 and 0.41 percentage points, respectively. (2) Embedded devices impose stricter requirements on model size and network speed, so existing networks need lightweight improvements. This paper proposes a lightweight detector and a
lightweight re-identification network, and cascades the two to form a separate person search network. For person detection, this paper first makes the backbone network lightweight and then applies channel pruning to the network head to further reduce parameters. With both improvements combined, the accuracy of the detection network drops slightly, but the model size falls to 15.97% of the original and the inference time to 56.5% of the original. For person re-identification, this paper adds an improved attention mechanism, depthwise separable convolution, and an auxiliary prediction head to the backbone network. The space and time complexity of the improved network are significantly reduced: the model size shrinks by about 41% and the inference time by 18%. After the two parts are combined, the mAP and rank-1 of the person search network improve by 3.3 and 1.0, respectively. (3) To meet the practical requirements of AI embedded platforms, this paper designs a person search system based on the HiSilicon AI solution, covering model conversion, model quantization, algorithm simulation, and on-board testing. When searching for a person, the system first detects persons in the video frame, crops the frame according to the detection results, extracts features from the cropped foreground images, and finally computes their similarity to the query features. If the similarity exceeds the threshold, the target is considered found. The detection time of the system is reduced from 31.55 ms to 7.325 ms, and the model size from 7827 KB to 1848 KB. The inference time of the re-identification network decreases from 3.3 ms to 2.83 ms, and its size from 11297 KB to 7882 KB. In testing, the overall speed of the system reaches 30 FPS, which gives it high practical value. The system provides a new solution for intelligent monitoring applications. |
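
The final matching step of the search procedure described above (comparing features of the cropped detections with the query features and applying a threshold) can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold value is used, so cosine similarity, the default threshold of 0.5, and the names `cosine_similarity` and `search_frame` are all assumptions made for illustration. Detection, cropping, and feature extraction are assumed to have already produced one feature vector per detected person.

```python
import math
from typing import Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (assumed measure, not from the source)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search_frame(
    query_feature: Sequence[float],
    detection_features: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical value; the source gives no number
) -> Optional[Tuple[int, float]]:
    """Return (index, similarity) of the best detection whose similarity
    to the query exceeds the threshold, or None if no detection does."""
    best: Optional[Tuple[int, float]] = None
    for index, feature in enumerate(detection_features):
        similarity = cosine_similarity(query_feature, feature)
        if similarity > threshold and (best is None or similarity > best[1]):
            best = (index, similarity)
    return best


if __name__ == "__main__":
    # Toy 2-D features standing in for re-identification embeddings.
    query = [1.0, 0.0]
    detections = [[0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
    print(search_frame(query, detections))
```

Returning only the single best match above the threshold is a design choice of this sketch. A system that must report every candidate exceeding the threshold would collect all qualifying indices instead.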