
Research On Crowded Pedestrian Detection Based On Positive Sample Allocation And Lightweight Design

Posted on: 2024-09-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Q Shuai
GTID: 2568307133456594 | Subject: Master of Mechanical Engineering (Professional Degree)
Abstract/Summary:
Pedestrian detection is a fundamental and critical research task in computer vision. Its goal is to process images so that the pedestrians in them are accurately located and recognized. As deep learning technology and computer hardware have matured in China, pedestrian detection platforms have been deployed in many industrial fields, such as traffic flow statistics, intelligent security at ports, and perception for unmanned driving. However, practical applications of pedestrian detection still face many challenges. Uneven lighting and interference from complex backgrounds can degrade detector performance. When the crowd density in an image is too high, targets severely occlude one another, which leads to missed detections. In addition, depending on the camera's placement, pedestrians may occupy only a few pixels in the image, making it difficult for the detector to extract features. Conventional object detectors can handle general localization and recognition tasks, but their robustness is low in crowded pedestrian scenes. Moreover, because of production cost and detection time constraints, lightweight model design is also a major concern. This thesis focuses on the difficulty of extracting features from small targets, on occlusion between individual targets, and on improving the model's inference speed without compromising its accuracy. The main research work covers the following three aspects:

(1) To address the large number of small targets in crowded pedestrian scenes, this thesis improves the Mosaic data augmentation method to increase the diversity of training images. An efficient convolutional block attention module is designed to quickly search local channel information and strengthen the model's attention to local features. The decoupled prediction head of the model is also redesigned. Conventional pedestrian detectors use a 1×1 convolution kernel as the prediction head. This thesis instead uses a hierarchical structure that separates the prediction of classification confidence from bounding-box regression. It decodes the box offsets predicted by the regression branch, maps and extracts high-confidence predicted boxes, and fuses them with the feature map of the classification branch to raise the confidence of locally effective features.

(2) When closely packed individuals are partially or almost completely occluded, the training procedure of a conventional detector cannot accurately match high-quality positive samples for loss calculation. This thesis redesigns the training method so that high-quality positive samples are matched dynamically for small and occluded targets. The model thus learns from an expanded set of high-quality samples, which improves its ability to detect small and occluded targets. In post-processing, the Soft-NMS algorithm is used to attenuate the confidence of predicted boxes rather than suppress them outright. Traditional non-maximum suppression cannot handle occlusion between targets accurately: if the overlap between two targets is too large, it suppresses one of the predicted boxes, which leads to missed detections, especially in crowded pedestrian scenes.

(3) This thesis designs two models of different parameter scales, based on YOLOv5m and YOLOv5s. Deep neural networks have a clear advantage in detection accuracy, but on some lightweight devices their inference speed cannot guarantee real-time operation, and their large parameter count also limits deployment. This thesis uses knowledge distillation with purpose-designed distillation strategies so that a shallow network can approach the detection capability of a deep network and replace it, achieving the lightweight goal. Furthermore, structural reparameterization compresses the model's parameters and network layers at inference time to accelerate inference. The ONNX Runtime acceleration library is used to build a C++ inference platform that implements functions such as pedestrian counting and traffic congestion detection, and real-scene data is collected for detection and analysis.

The three lines of work above were extensively validated on the CrowdHuman and VisDrone2019 datasets. The model designed in this thesis achieves an accuracy of 53.8% on the CrowdHuman test set, with an inference time as low as 7.1 ms. Ablation experiments and comparisons with other models demonstrate that the designed algorithm achieves a balance between detection speed and accuracy.
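The Soft-NMS post-processing step mentioned in aspect (2) can be illustrated with a minimal sketch. The abstract does not state which Soft-NMS variant or which hyperparameters the thesis uses, so the Gaussian decay, the `sigma` value, the `score_thresh` cut-off, and the helper names `iou` and `soft_nms` below are all assumptions chosen for illustration, not the author's implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (sigma and score_thresh are assumed values).

    Returns a list of (box_index, decayed_score) in selection order.
    Overlapping boxes have their scores decayed instead of being removed.
    """
    remaining = [[i, float(s)] for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Pick the box with the highest current score.
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        idx, score = remaining.pop(best)
        if score < score_thresh:
            break  # every box still remaining scores even lower
        kept.append((idx, score))
        # Decay the scores of the other boxes according to their overlap.
        for item in remaining:
            overlap = iou(boxes[idx], boxes[item[0]])
            item[1] *= math.exp(-(overlap ** 2) / sigma)
    return kept


# Two heavily overlapping pedestrians plus one isolated pedestrian.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this toy case the second box overlaps the first heavily. Hard NMS with a typical IoU threshold would discard it, whereas here it survives with a reduced score, which is the behaviour the abstract relies on for crowded scenes.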
Keywords/Search Tags: crowded pedestrians, small target detection, positive sample allocation, knowledge distillation