| As substation safety production control becomes more intelligent, computer vision technology has gradually been applied to detecting the protective equipment worn by substation operators. However, image datasets of protective equipment for substation workers are small, and manual annotation is laborious and costly. This paper therefore studies an automatic annotation method for image datasets of protective equipment used in substation personnel safety operations. The main work is as follows. First, existing feature extraction methods based on convolutional networks fail to remove background information effectively when extracting features from images of substation personnel protective equipment, so the extracted features contain too much background information. We propose the DAR50 (Deformable and Attention Residual with 50 layers) feature extraction network, which replaces the conventional convolution in the residual structure of ResNet with deformable convolution so that the network adapts to the shape of the target when extracting features, and then adds the scSE (Spatial-Channel Squeeze & Excitation) attention mechanism to strengthen the network's ability to capture target feature information. Validated in the same experimental environment, DAR50 outperforms the best existing feature extraction network by 0.5%, and visualization of the labeling results shows that DAR50 effectively reduces interference from background feature information. Second, workers wearing protective gear appear at different scales in the acquired images because their distance from the camera varies. To address this, we propose the CCFPN (Criss-Cross Feature Pyramid Network) feature fusion network, which first fuses the feature maps from each stage of the feature extraction network at the pixel level in a criss-cross manner, increasing the exchange of feature information between targets, and then uses upsampling to fuse the resulting feature maps across scales, improving the network's ability to perceive targets at different scales. Validated in the same experimental environment, CCFPN improves the performance of detecting small target objects by 2.5% over the best existing feature fusion network based on the feature pyramid structure. Finally, we propose the AMS RCNN detection network model to solve the problem of automatically labeling images of protective equipment for substation operators. The model replaces the feature extraction network of the existing two-stage object detection algorithm Faster RCNN with DAR50 and adds the CCFPN feature fusion module after feature extraction, enabling automatic labeling of image training data for substation protective equipment. To address the large feature map sampling error and the excessive number of overlapping targets in the detection results of the original model, this paper improves its post-processing stage by using bilinear interpolation for sampling and a weighting function for screening detection boxes. Experiments show that, relative to existing methods, the proposed method improves overall performance by 0.3% and the performance of detecting small-scale target objects by 1%, and improves the accuracy of labeling images of substation protective equipment. |
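
The two post-processing changes described above can be illustrated with a minimal, dependency-free sketch. It rests on assumptions that the abstract does not confirm: bilinear interpolation is shown as sampling a feature map at a fractional coordinate (the idea behind RoIAlign-style pooling), and the "weighting function" is shown as Gaussian score decay in the style of Soft-NMS, which is only one possible choice. The function names, the `sigma` value, and the score threshold are hypothetical and are not taken from the paper.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at a fractional (y, x) position."""
    h, w = len(feature), len(feature[0])
    # Clamp so that the four neighbouring cells always exist.
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two rows, then along y between them.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Assumed weighting scheme: Gaussian score decay instead of hard removal.

    Overlapping boxes are down-weighted by exp(-IoU^2 / sigma) and dropped
    only once their score falls below score_threshold.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the highest-scoring box still in play.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                rescored.append((box, new_score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    fmap = [[0.0, 1.0], [2.0, 3.0]]
    print(bilinear_sample(fmap, 0.5, 0.5))
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(soft_nms(boxes, [0.9, 0.8, 0.7]))
```

Under this assumed scheme, a heavily overlapping box is kept with a reduced score rather than discarded outright, which is one way to handle closely spaced targets; whether AMS RCNN uses this exact function is not stated in the abstract.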