Object detection is one of the most important downstream tasks in computer vision and is widely used in object tracking, industrial defect detection, and other fields. With the rapid development of deep learning, a growing number of detectors achieve strong detection performance. As a result, machines are expected to use computer vision to replace traditional manual work, especially defect detection in industrial production, helping to reduce costs and increase efficiency in related industries. In industrial defect detection, most state-of-the-art detection algorithms are anchor-based object detectors. However, the detection accuracy of such detectors is very sensitive to the anchor hyperparameters: for each new detection task the anchor hyperparameters must be redesigned, so generalization is poor. In addition, most current deep learning-based object detectors include a multi-scale feature pyramid, which extracts features at different scales and fuses them to help the subsequent network make better predictions. However, most current multi-scale feature pyramids have two potential problems: the loss of feature information during channel unification and the aliasing effect in cross-scale fusion. These problems greatly limit the effectiveness of multi-scale feature fusion in the feature pyramid and thus degrade the final detection performance. In industrial defect detection their impact is more serious and greatly reduces the detector's detection ability, especially the aliasing effect in cross-scale fusion. To address these problems, this paper proposes a series of solutions and verifies their effectiveness through experiments. The main contributions of this paper are as follows:

(1) We design and propose
an anchor-free detector based on an improved Swin Transformer structure, called Swin-Auto Assign. Compared with anchor-based object detectors, Swin-Auto Assign does not rely on prior knowledge derived from manually compiled anchor statistics and generalizes better. Compared with other anchor-free object detectors, Swin-Auto Assign has better detection performance, and it remains competitive with the most advanced existing anchor-based detectors. In addition, the automatic label assignment strategy in Swin-Auto Assign helps the detector detect tiny objects and slender objects, which often appear in industrial defect detection scenarios.

(2) We design and propose a Channel-Space Adaptive Enhancement Feature Pyramid Network called CA-FPN. CA-FPN consists of two core components: the Channel-Space Enhancement Module (CSM) and the Adaptive Cross-Scale Pixel-Wise Guidance Module (AGM). CSM transfers features from the channel dimension to the spatial dimension, thus avoiding feature loss during channel unification. AGM addresses the aliasing effect in cross-scale feature fusion by adaptively controlling how corresponding pixels of features at different scales are fused. Experiments show that CA-FPN brings a significant performance improvement to the detector, especially for small and large objects. Finally, by combining CA-FPN with Swin-Auto Assign, we further propose an efficient detector suited to industrial defect detection, CA-Auto Assign, and compare it with state-of-the-art detectors on the Alibaba Cloud Tianchi Fabric Dataset and NEU-DET. The experimental results show that CA-Auto Assign has the best detection performance, with AP50 reaching 89.1 and 82.7, respectively.

(3) We design and propose a compression framework for Swin Transformer called Mini-Swin. To reduce the memory requirements on embedded devices during model deployment, we study weight sharing for Transformers in object detection. We
use the CKA algorithm to select the modules in Swin Transformer that should share weights, thus avoiding an unstable training process. In addition, to alleviate the severe degradation of detector performance caused by sharing weights, we propose a module-based weight multiplexing method. Finally, we verify the effectiveness of Mini-Swin on the Alibaba Cloud Tianchi Fabric Dataset and NEU-DET. Compared with CA-Auto Assign before model compression, CA-Auto Assign with Mini-Swin reduces the parameter count and weight file size by 20.78%, while the detection accuracy decreases by only 2.24% and 2.06%, respectively.
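
To make the CKA-based module selection in (3) more concrete, the following is a minimal sketch of linear Centered Kernel Alignment between the activations of two modules. It rests on assumptions: the abstract does not say which CKA variant is used or how activations are collected, so the linear variant, the function names (`center_columns`, `cross_gram_sq_norm`, `linear_cka`), and the idea that highly similar modules are the candidates for weight sharing are illustrative only, not the authors' implementation.

```python
import math


def center_columns(features):
    """Subtract each column's mean so every feature has zero mean over the samples."""
    n = len(features)
    dims = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in features]


def cross_gram_sq_norm(a, b):
    """Squared Frobenius norm of a^T b, for row-major a (n x p) and b (n x q)."""
    n = len(a)
    total = 0.0
    for i in range(len(a[0])):
        for j in range(len(b[0])):
            dot = sum(a[k][i] * b[k][j] for k in range(n))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two activation matrices with the same number of rows.

    Each row is one sample; the column counts of x and y may differ.
    Returns a similarity in [0, 1]; 0.0 is returned if either input is constant.
    """
    x_c = center_columns(x)
    y_c = center_columns(y)
    numerator = cross_gram_sq_norm(x_c, y_c)
    denominator = math.sqrt(
        cross_gram_sq_norm(x_c, x_c) * cross_gram_sq_norm(y_c, y_c)
    )
    if denominator == 0.0:
        return 0.0
    return numerator / denominator


# Hypothetical activations of two modules on the same four inputs.
block_a = [[1.0, 2.0], [3.0, 5.0], [4.0, 1.0], [0.0, 2.0]]
block_b = [[2.0 * v for v in row] for row in block_a]  # a rescaled copy of block_a
similarity = linear_cka(block_a, block_b)
```

Under this assumption, pairs of modules whose activations score close to 1 would be the natural candidates for sharing one set of weights, while dissimilar modules would keep their own parameters.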