| China's increasing car ownership is putting pressure on intelligent city systems, causing traffic congestion and creating travel hazards. Video and image information is an important data source for today's intelligent city systems, but most of it comes from fixed road cameras, which are not flexible enough, especially when responding to unforeseen situations. As small rotary-wing drones become more affordable, using drones to collect road information is now feasible. Camera-equipped drones are already used in traffic control because of their high flexibility, wide viewing angles and easily met take-off conditions. For example, traffic police use drones to monitor violations in high-traffic areas, and drones have been used for patrols and broadcasts during epidemics. Traditional object detection algorithms rely on features designed manually from experience, and results improve as the features better match the data. Such hand-crafted features transfer poorly, and designing robust features becomes harder as data volumes grow. Deep learning-based object detection algorithms are broadly divided into two types: one-stage and two-stage. Two-stage algorithms are highly accurate, but their detection speed is too slow for real-time use; one-stage algorithms are fast, but they generally perform poorly on small targets and their overall detection accuracy is insufficient. In response to these problems, this paper takes the one-stage YOLOv4 as the base network and improves it. The YOLO series of algorithms is anchor-based and requires preset prior (anchor) boxes; if the preset values do not match the characteristics of the data, they directly degrade the effectiveness of the model. Data is the basis of deep learning research, so this paper builds a dataset around its main object of study, the car, in order to better match the actual application, and improves the algorithm in terms of both speed and accuracy. We selected six types of car targets in the VEDAI dataset for re-labelling and then clustered the new dataset using the k-means++ algorithm, so that the network obtains anchor boxes that are more relevant to the real-world application, which speeds up the model and helps the candidate boxes converge. The multi-scale detection network structure of YOLOv4 is improved to address two weaknesses of one-stage algorithms: unsatisfactory detection of small targets and inaccurate category recognition. Specifically, an improved DenseNet-121 is used as the network backbone and h-swish as the activation function to strengthen feature extraction and recognition on the dataset, and the adaptive NMS algorithm is adopted as the model's bounding-box filtering algorithm. The experiments show that modifying the network structure, re-clustering the anchor boxes on the dataset and improving the bounding-box filtering algorithm increase accuracy, and that the overall recognition effect is good while real-time performance is maintained. |
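
The anchor re-clustering step described in the abstract can be illustrated with a minimal sketch. It assumes, as is common for YOLO-family anchors but not stated in the abstract, that boxes are compared by width and height only, that the distance is d = 1 - IoU, and that centroids are updated with the cluster mean. The function names and the toy box sizes are hypothetical and are not taken from the paper or from VEDAI.

```python
import random


def iou_wh(a, b):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeanspp_seed(boxes, k, rng):
    """k-means++ seeding with the assumed distance d = 1 - IoU."""
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        # Squared distance from every box to its nearest chosen centroid.
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0.0:
            # All boxes coincide with existing centroids; pick uniformly.
            centroids.append(rng.choice(boxes))
        else:
            # Sample the next seed with probability proportional to d^2.
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centroids


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centroids = kmeanspp_seed(boxes, k, rng)
    for _ in range(iters):
        # Assignment step: each box joins the centroid with the highest IoU.
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        # Update step: mean width and height (an assumption; medians also work).
        new = []
        for c, members in zip(centroids, clusters):
            if members:
                n = len(members)
                new.append((sum(w for w, _ in members) / n,
                            sum(h for _, h in members) / n))
            else:
                new.append(c)  # keep an empty cluster's centroid unchanged
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda c: c[0] * c[1])


def mean_best_iou(boxes, anchors):
    """Average IoU between each box and its best-matching anchor."""
    return sum(max(iou_wh(b, a) for a in anchors) for b in boxes) / len(boxes)


if __name__ == "__main__":
    # Hypothetical box sizes in pixels, for illustration only.
    toy_boxes = [(10, 10), (12, 11), (11, 12), (50, 30), (52, 28),
                 (48, 32), (200, 100), (210, 95), (190, 105)]
    anchors = kmeans_anchors(toy_boxes, k=3)
    print(anchors, mean_best_iou(toy_boxes, anchors))
```

In practice the input would be the width and height of every re-labelled box, and `mean_best_iou` gives a quick check of how well a candidate anchor set fits the data before training.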