| In recent years, provinces and cities have been vigorously promoting the construction of new smart cities. The law enforcement recorder is a piece of equipment commonly used by government officials, and it is one of the important data sources for building intelligent public security systems. Studying computer vision techniques for processing this kind of data is a necessary step in the construction of new smart cities. Object detection is one of the basic tasks in computer vision. With the rapid development of deep learning, research on deep-learning-based object detection algorithms has become a hotspot. Therefore, this paper studies deep-learning-based object detection in the law enforcement recorder scenario. Existing deep-learning-based object detection algorithms can be divided into one-stage and two-stage algorithms. RefineDet, proposed by Zhang et al., relies on two-step cascaded regression; it builds on the one-stage Single Shot MultiBox Detector (SSD) while incorporating the advantages of two-stage algorithms, and it performs well on common datasets. Therefore, this paper conducts further research on the basis of RefineDet. To support the study, this paper builds an object detection dataset, with persons as the target class, from video shot by law enforcement recorders. The dataset contains a training set of 10671 images and a test set of 2633 images, and is used to verify the effects of the methods studied in this paper. Owing to the particular shooting equipment, shooting style and scenes, it is difficult to design a high-performance object detector for this dataset. Multi-scale feature fusion is a structure commonly used in object detection algorithms. It strengthens the representation ability of shallow features by integrating the semantic information of deep features, thereby improving detection accuracy. RefineDet uses the TCB, which adds to the parameter count and slows detection, to
achieve multi-scale feature fusion. To optimize the multi-scale feature fusion module, this paper designs three different multi-scale feature fusion modules by combining the channel attention mechanism, depthwise separable convolution and the residual block from ResNet. These modules are applied in RefineDet in a comparison experiment on the self-built dataset. The experimental results show that all three modules improve the algorithm: they reduce the model size and increase detection speed while also raising detection accuracy. Anchor boxes are a series of candidate boxes with different sizes and aspect ratios generated at each pixel of the detection feature map; they serve as the initial boxes from which the prediction boxes are regressed. A good anchor box design strategy can accelerate the convergence of the algorithm. The anchor boxes in RefineDet were originally set for face detection and are not fully applicable to all scenarios and targets. To better adapt the algorithm to the scene and target of this study, this paper analyzes the sizes of the ground truth boxes in the self-built dataset using the K-means algorithm and sets a new anchor box design strategy accordingly. An experiment on the self-built dataset shows that this strategy helps accelerate convergence and improves mAP by 0.14%. |
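
The K-means analysis of ground truth box sizes mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the Euclidean distance on (width, height) pairs, the function name `kmeans_box_sizes`, the choice of `k`, and the sample box sizes are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def kmeans_box_sizes(sizes, k, iters=50, seed=0):
    """Cluster (width, height) pairs with plain K-means (Euclidean distance)."""
    rng = random.Random(seed)
    centers = rng.sample(sizes, k)  # initialise centers from the data
    for _ in range(iters):
        # Assignment step: attach each box to its nearest center.
        clusters = [[] for _ in range(k)]
        for w, h in sizes:
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(center)  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments are stable, so the centers no longer move
        centers = new_centers
    # Sort by area so the smallest anchor comes first.
    return sorted(centers, key=lambda c: c[0] * c[1])


# Hypothetical ground truth box sizes in pixels (illustrative, not from the dataset).
boxes = [(20, 60), (22, 64), (18, 55), (60, 180), (64, 170), (58, 190)]
anchors = kmeans_box_sizes(boxes, k=2)
aspect_ratios = [h / w for w, h in anchors]  # height-to-width ratio per anchor
```

The resulting cluster centers suggest anchor sizes and aspect ratios that match the targets in the data; for person detection one would expect tall, narrow anchors. An IoU-based distance is a common alternative to the Euclidean distance assumed here.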