With the number of surveillance videos growing daily, locating a target by manually reviewing large volumes of footage is impractical, and intelligent target detection and recognition in video images has become a key research direction. Recognizing a specific pedestrian across multiple surveillance video feeds simultaneously is another difficult problem in urban safety control, and relying on manual recognition is time-consuming and laborious. This thesis therefore addresses two tasks, detecting targets and recognizing different pedestrians in video images, by designing a target detection network model based on YOLOv5 and a pedestrian re-identification method based on feature fusion combined with metric learning. Experiments on pedestrian images taken in real life show that the accuracy of target detection and the speed and accuracy of pedestrian re-identification are effectively improved. The main research contents of this thesis are as follows:

1. The YOLOv5 target detection network model is improved to address its low accuracy and its poor detection of distant targets in video images. First, the convolutional block attention module (CBAM) is added to the YOLOv5 backbone network to form the CBAM_YOLOv5 model, which enhances the network's ability to extract useful image features without increasing model parameters. Second, the CIoU loss function, which has a better fitting effect, replaces the GIoU loss function of the original YOLOv5 network at the detection head, improving the accuracy of the YOLOv5 target detection algorithm. Finally, the improved YOLOv5 network model is trained and tested on public and self-made datasets. The results show that the CBAM_YOLOv5 network model effectively improves detection accuracy.

2. Pedestrian re-identification is achieved by combining feature fusion with metric learning. First, the pedestrian images detected by the improved YOLOv5 network model are added to the pedestrian re-identification dataset, and multiple features are extracted from the pedestrian images. Second, the extracted HOG and LBP features are improved and optimized without reducing their expressive ability and are then fused with HSV features, which reduces the dimension of the fused features. Finally, the fused features are combined with the XQDA metric learning method to calculate the similarity between pedestrians. The proposed method is validated on public and self-made datasets, and the results show that it accelerates recognition without reducing recognition accuracy.
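
To illustrate why replacing GIoU with CIoU can help box regression, the following is a minimal sketch of both losses using their standard published definitions. It is not the thesis's implementation: the function names, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration, and boxes are assumed to have positive width and height.

```python
import math


def _iou_and_union(a, b):
    """Return (IoU, union area) for two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union, union


def _enclosing_size(a, b):
    """Width and height of the smallest box enclosing both a and b."""
    return max(a[2], b[2]) - min(a[0], b[0]), max(a[3], b[3]) - min(a[1], b[1])


def giou_loss(pred, target):
    """GIoU loss: 1 - (IoU - empty fraction of the enclosing box)."""
    iou, union = _iou_and_union(pred, target)
    cw, ch = _enclosing_size(pred, target)
    hull = cw * ch
    return 1.0 - (iou - (hull - union) / hull)


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss: 1 - IoU + center-distance term + aspect-ratio term."""
    iou, _ = _iou_and_union(pred, target)
    cw, ch = _enclosing_size(pred, target)
    # Squared diagonal of the enclosing box normalizes the center distance.
    diag2 = cw ** 2 + ch ** 2 + eps
    # Squared distance between the two box centers.
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) ** 2
            + ((pred[1] + pred[3]) - (target[1] + target[3])) ** 2) / 4.0
    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan((target[2] - target[0]) / (target[3] - target[1]))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / diag2 + alpha * v


target = (0.0, 0.0, 4.0, 4.0)
centered = (1.0, 1.0, 3.0, 3.0)  # prediction inside the target, same center
corner = (0.0, 0.0, 2.0, 2.0)    # same size, pushed into a corner

print(giou_loss(centered, target), giou_loss(corner, target))
print(ciou_loss(centered, target), ciou_loss(corner, target))
```

When the predicted box lies entirely inside the target, the enclosing box equals the target, so the GIoU penalty vanishes and both predictions receive the same loss. CIoU still penalizes the off-center prediction through its center-distance term, which gives the regression a useful gradient in this case.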