A steel surface defect detection method based on comprehensive attention was proposed to address the low contrast between defects and background and the large intra-class variation in defect scale, both of which degrade detection performance on steel surfaces. 1) Features were extracted with hybrid convolution and self-attention modules to obtain feature maps that carry both local detail information and long-distance pixel dependencies, which strengthens the handling of intra-class variation in shape and size and improves detection robustness against complex backgrounds. 2) A comprehensive attention structure was proposed, comprising a spatial attention module, a channel attention module and a self-attention module. The attention mechanisms were used to extract features from the current feature maps and to highlight defect objects in steel surface images containing background noise. Experimental results showed that the proposed method improved performance on the NEU-DET and GC10-DET datasets, which verified its effectiveness and generalization ability.
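The abstract names the three components of the comprehensive attention structure but does not specify their internals or how they are combined. The following is a minimal NumPy sketch of one plausible arrangement, not the authors' implementation. It assumes a squeeze-and-excitation style channel gate, a spatial mask built from channel-wise mean and max maps, single-head scaled dot-product self-attention over flattened pixels, sequential composition, and a residual connection. All function names, parameter names and shapes (`init_params`, `w_avg`, `reduction`, and so on) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat, w1, w2):
    """Reweight channels (assumed squeeze-and-excitation style). feat: (C, H, W)."""
    squeezed = feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU -> (C // r,)
    weights = sigmoid(w2 @ hidden)           # per-channel gate in (0, 1) -> (C,)
    return feat * weights[:, None, None]


def spatial_attention(feat, w_avg, w_max):
    """Reweight pixel positions from channel-wise mean and max maps (assumed form)."""
    avg_map = feat.mean(axis=0)              # (H, W)
    max_map = feat.max(axis=0)               # (H, W)
    mask = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * mask[None, :, :]


def self_attention(feat, wq, wk, wv):
    """Single-head scaled dot-product attention over all H*W pixel positions."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T        # one token per pixel -> (N, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, N)
    out = attn @ v                           # (N, C)
    return out.T.reshape(c, h, w)


def comprehensive_attention(feat, params):
    """Apply the three modules in sequence, then add a residual (assumed ordering)."""
    x = channel_attention(feat, params["w1"], params["w2"])
    x = spatial_attention(x, params["w_avg"], params["w_max"])
    x = self_attention(x, params["wq"], params["wk"], params["wv"])
    return feat + x


def init_params(channels, reduction=4, d=8, seed=0):
    """Random weights for illustration only; a real model would learn these."""
    rng = np.random.default_rng(seed)
    hidden = channels // reduction
    return {
        "w1": 0.1 * rng.normal(size=(hidden, channels)),
        "w2": 0.1 * rng.normal(size=(channels, hidden)),
        "w_avg": 1.0,
        "w_max": 1.0,
        "wq": 0.1 * rng.normal(size=(channels, d)),
        "wk": 0.1 * rng.normal(size=(channels, d)),
        "wv": 0.1 * rng.normal(size=(channels, channels)),
    }
```

Calling `comprehensive_attention(feat, init_params(16))` on a `(16, 8, 8)` feature map returns an array of the same shape. The self-attention step builds an N×N matrix over all pixels, so in this sketch it is only practical on low-resolution feature maps.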