| With the rapid development of manufacturing capabilities and the general growth in demand for quality, quality inspection is playing an increasingly important role in industrial production. Manual inspection is being phased out because it is inefficient, influenced by subjective factors and costly in labour. Convolutional neural network (CNN) based inspection algorithms achieve better detection results than traditional machine vision algorithms thanks to their powerful representational capabilities. Therefore, this paper investigates surface defect detection algorithms based on CNNs. General object detectors such as the YOLO series have been more widely used in industry than other detection methods due to their simple and effective design. Unlike natural objects, complex defects depend heavily on detailed information, which largely determines the final classification and localization results. However, general object detectors mainly model structured natural objects, which leads to poor performance on defect detection tasks. Some typical defect detection tasks, such as fabric defect detection, have inherent difficulties, including complex backgrounds and extreme scale ratios. More importantly, the trade-off between accuracy and speed must be considered in practical applications. In view of these considerations, this paper investigates two representative members of the YOLO family, Scaled-YOLOv4 and YOLOv5, and improves them to suit defect detection in industrial production scenarios. The main research work in this paper is summarized as follows. (1) The fundamental theory and network structure of YOLOv4, Scaled-YOLOv4 and YOLOv5 are investigated in depth, and some of the structures are modified for adaptation. A pre-analysis based on the Tianchi fabric dataset summarises some typical problems in the detection of surface defects in industrial products. Next, the architectural design of general object detectors is related to the characteristics of industrial scenarios in order to further explore the
shortcomings of object detectors in defect detection tasks. Finally, an extensive survey of existing machine learning and deep learning algorithms is conducted to identify essential factors that need to be considered when designing detector architectures for industrial scenarios, such as dynamic adjustment of the receptive field, multi-scale feature alignment and cross-scale feature interaction. (2) To address complex background interference and extreme defect scales in fabric images, this paper proposes an Efficient Scale-aware Network (ES-Net) based on Scaled-YOLOv4. Firstly, a strong baseline is constructed based on Scaled-YOLOv4 to fully exploit the performance of the detector. Then, to counter complex background interference, an Aggregated Feature Guided Module (AFGM) is proposed to aggregate global multi-scale features and dynamically guide updates at each scale. Noting the mismatch between the receptive field and target scales in the detector, a Dynamic Scale-aware Head (DSH) is proposed to further enhance the detection capability of the detector at different scales. Finally, to offset the efficiency loss caused by the aforementioned modules, the original PANet is redesigned into a more Efficient Stair Pyramid (ESP) with fewer fusion nodes for a higher rate of utilization. Experimental results on several publicly available defect datasets show that ES-Net has significant advantages over other algorithms in all metrics. Compared with Scaled-YOLOv4, the detection accuracy on the Tianchi fabric dataset is improved by 8.2%, while the parameter size is reduced by 52.4 MB and the speed is improved by 2 FPS. (3) Building on the ES-Net work, in-depth analysis and improvements based on the YOLOv5 baseline are carried out with the aim of improving accuracy at a smaller cost in efficiency. The analysis shows that the inability to extract sufficient detail information prevents a general object detector from achieving the desired detection results in defect detection
tasks. Therefore, the Aligned Dense Feature Pyramid Network (AD-FPN) is designed to minimize the loss of detail that occurs in feature fusion. In addition, an Adaptive Feature Purification Module (AFPM) with explicit supervision is designed to guide the network to locate defective regions and filter background information in a more intuitive way. Finally, based on the aforementioned AFGM, a Phase-wise Feature Redistribution Module (PFRM) is designed to adaptively assign integrated features according to the semantic level of the different layers. Combining the three components with YOLOv5, the resulting CS-YOLO achieves state-of-the-art results on various datasets. |
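
The abstract describes aggregating global multi-scale features and feeding the result back to each scale. The NumPy sketch below illustrates only that general aggregate-then-redistribute pattern; it is not the AFGM or PFRM implementation, which the abstract does not specify. The function names, the fixed averaging (in place of any learned, dynamic weighting), the choice of the middle pyramid level as the fusion resolution, and the toy feature shapes are all assumptions made for illustration.

```python
import numpy as np

def avg_pool(x, factor):
    """Downsample a (C, H, W) map by averaging non-overlapping factor x factor blocks."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def resize_to(x, size):
    """Resize a square (C, H, H) map to (C, size, size); sizes must divide evenly."""
    h = x.shape[1]
    if h == size:
        return x
    if h > size:
        return avg_pool(x, h // size)
    return upsample(x, size // h)

def aggregate_and_redistribute(features):
    """Fuse all levels at the middle resolution, then add the fused map back to each level.

    Hypothetical helper: a plain average stands in for a learned fusion.
    """
    mid = features[len(features) // 2].shape[1]
    fused = np.mean([resize_to(f, mid) for f in features], axis=0)
    return [f + resize_to(fused, f.shape[1]) for f in features]

# Toy three-level pyramid (assumed shapes: 4 channels, resolutions 8, 4 and 2).
pyramid = [np.ones((4, 8, 8)), 2 * np.ones((4, 4, 4)), 3 * np.ones((4, 2, 2))]
outputs = aggregate_and_redistribute(pyramid)
print([o.shape for o in outputs])
```

Each output keeps the resolution of its input level, so a step like this could sit between a backbone and the detection heads without changing their expected shapes. A learned module would typically replace the plain average with convolutions or attention weights.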