| Today, video surveillance systems have become an important part of security infrastructure. Information analysis in traditional video surveillance depends mainly on human operators, which is not only costly and inefficient but also prone to omissions and misjudgments. In recent years, object detection technology based on deep learning has developed rapidly and has approached or even exceeded human performance in many visual detection tasks. An intelligent video surveillance system that incorporates deep-learning-based object detection can free people from the tedious work of manual video analysis and improve monitoring effectiveness, thereby helping to ensure public safety. Research on intelligent video surveillance technology combined with deep learning is therefore of practical significance. Although object detection has made great progress in recent years, many problems remain unresolved. Object detectors are typically trained on fully labeled training data. Acquiring training datasets is a major difficulty, and manually labeling them is very time-consuming and labor-intensive. Public datasets can be used instead. However, the labels of any single existing public dataset may not cover all the target classes required by a detection scenario, so multiple datasets may need to be combined to train the deep learning model. Plain joint training leads to heavy bias and poor performance because each dataset has a different label space. In addition, small objects are detected poorly because of their low resolution and limited feature information. To address these problems, the main work and contributions of this thesis are as follows: 1. This thesis uses the pseudo-label method to train object detection models with a unified label space. A multi-model pseudo-label method is proposed to mine diverse pseudo-labels that fill in the missing label information
of multiple datasets. To address the poor quality of pseudo-labels, this thesis proposes a re-labeling method based on influence functions to identify and re-label harmful labels among the pseudo-labels. Experiments show that, compared with object detectors trained on pseudo-labels generated by the algorithms of previous studies, the object detector trained with the proposed algorithm achieves higher accuracy and reaches detection accuracy similar to that of an object detector trained on manually labeled data. These results demonstrate the effectiveness and superiority of the proposed algorithm. 2. Small object detection is optimized on the baseline network YOLOv5. To address the limited feature information of small objects, this thesis adds a higher-resolution feature layer as a prediction branch for small objects. The BiFPN structure is used for adaptive feature fusion to strengthen the fusion of features. In addition, for the video surveillance application scenario, the difference information between video frames is used as spatial attention to improve the localization ability of the network. Experiments show that the proposed improvements increase the model's ability to detect small objects. 3. This thesis analyzes the requirements of a power room security monitoring system and trains an object detection model using the proposed algorithms. The object detector is then optimized and deployed to a Jetson Xavier NX, a visual interface is developed, and a deep-learning-based intelligent security monitoring system for the power room is completed. |
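
The idea behind contribution 1 (completing the labels of one dataset with pseudo-labels for the classes it does not annotate) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `iou` and `merge_labels`, the thresholds `score_thr` and `iou_thr`, and the example classes are all hypothetical, and the sketch assumes that each dataset is fully labeled for its own classes and that detections from several models trained on other datasets are already available. The influence-function re-labeling step is not covered here.

```python
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Label = Tuple[str, Box]                   # (class name, box)
Detection = Tuple[str, Box, float]        # (class name, box, confidence)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_labels(
    ground_truth: List[Label],
    annotated_classes: Set[str],
    detections: List[Detection],
    score_thr: float = 0.5,   # hypothetical confidence threshold
    iou_thr: float = 0.5,     # hypothetical duplicate threshold
) -> List[Label]:
    """Extend one image's labels to a unified label space.

    Human labels are kept as-is. Detections are added as pseudo-labels only
    for classes this dataset does not annotate, only above the confidence
    threshold, and only if no same-class box already kept overlaps them.
    """
    merged = list(ground_truth)
    # Visit the most confident detections first so they win over duplicates
    # produced by other models.
    for cls, box, score in sorted(detections, key=lambda d: d[2], reverse=True):
        if cls in annotated_classes:
            continue  # assumption: the dataset's own classes are fully labeled
        if score < score_thr:
            continue  # too uncertain to use as a pseudo-label
        if any(c == cls and iou(box, b) >= iou_thr for c, b in merged):
            continue  # near-duplicate of a label that is already kept
        merged.append((cls, box))
    return merged


# Example: a dataset that annotates only "person", completed with "helmet"
# pseudo-labels predicted by two hypothetical models.
gt = [("person", (10.0, 10.0, 50.0, 90.0))]
dets = [
    ("helmet", (15.0, 5.0, 35.0, 25.0), 0.92),        # model A
    ("helmet", (16.0, 6.0, 36.0, 26.0), 0.81),        # model B, near-duplicate
    ("helmet", (200.0, 200.0, 220.0, 220.0), 0.30),   # low confidence
    ("person", (100.0, 100.0, 140.0, 180.0), 0.95),   # class already annotated
]
merged = merge_labels(gt, {"person"}, dets)
```

In this example only the first helmet detection is kept: the second overlaps it too strongly, the third is below the confidence threshold, and the extra person box is ignored because the dataset already annotates that class. A fixed confidence threshold is the simplest filter; low-quality pseudo-labels that pass it are the case the re-labeling method in the abstract is meant to handle.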