Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (02): 93-100. doi: 10.12052/gdutxb.230027

• Computer Science and Technology •

SR-Det: Towards Robust Detection of Slender and Rotated Objects in Industrial Scene

He Sen-bai, Cheng Liang-lun, Huang Guo-heng, Wu Zhi-chao, Ye Song-hang

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2023-02-21 Published: 2024-04-23
  • Corresponding author: Huang Guo-heng (b. 1985), male, associate professor, Ph.D.; main research interests: artificial intelligence and pattern recognition. E-mail: kevinwong@gdut.edu.cn
  • About the author: He Sen-bai (b. 1996), male, master's student; main research interest: rotated object detection. E-mail: 3595711069@qq.com
  • Funding:
    Foshan Key Field Science and Technology Research Project (2020001006832)

Abstract: Object detection is widely used in industrial settings, for example in defect detection, yet targets with arbitrary rotation and large aspect ratios, such as slender crack defects, remain difficult to detect. Three problems stand out. First, methods based on horizontal anchors struggle to localize such objects precisely. Second, convolutional neural networks (CNNs) extract features from rotated objects poorly. Third, common loss functions are insensitive to slender objects. To address these problems, this paper proposes the Slender and Rotated Detector (SR-Det), which consists of three parts. The Rotated Region Calibration (RRC) module takes horizontal proposals of different scales and aspect ratios as input and outputs the corresponding rotated proposals. The Rotated Angle Proposal Align (RAP-Align) module then safeguards the quality of the extracted feature information. Finally, the Rotated Intersection over Union (R-IoU) function, based on the Intersection over Union (IoU) strategy, guides the model to maximize the overlap area between the predicted box and the ground-truth (GT) box. Experiments show that the proposed method achieves the best results on the metal can and curtain wall datasets, demonstrating its effectiveness.

Key words: object detection, loss function, rotation invariance
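
The abstract does not give the exact form of the R-IoU function, so the following is only a minimal sketch of the quantity it builds on: the IoU between two rotated rectangles. It assumes each box is written as (cx, cy, w, h, angle in degrees), which is a common convention but not confirmed by this page, and it computes the overlap polygon with Sutherland-Hodgman clipping. The function names are hypothetical and are not taken from the paper.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated rectangle in counter-clockwise order."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * cos_a - y * sin_a, cy + x * sin_a + y * cos_a) for x, y in local]


def polygon_area(pts):
    """Shoelace formula; the result is positive for counter-clockwise vertices."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_polygon(subject, clipper):
    """Sutherland-Hodgman: clip `subject` against a convex counter-clockwise `clipper`."""

    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0.0

    def intersection(p, q, a, b):
        # Point where segment p-q crosses the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        if not current:
            break  # nothing left to clip: the polygons do not overlap
        for j in range(len(current)):
            cur, prev = current[j], current[j - 1]
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersection(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersection(prev, cur, a, b))
    return output


def rotated_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, w, h, angle_deg)."""
    overlap = clip_polygon(rotated_box_corners(*box_a), rotated_box_corners(*box_b))
    inter = abs(polygon_area(overlap)) if len(overlap) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# A slender 10 x 1 box against the same box turned by 90 degrees:
# the overlap is a 1 x 1 square, so the IoU is 1 / 19.
print(rotated_iou((0, 0, 10, 1, 0), (0, 0, 10, 1, 90)))
```

The last call illustrates why slender objects are sensitive to angle: the two boxes share a center and a size, yet a 90 degree angle error leaves an IoU of only 1/19. A loss of the form 1 - IoU is one common way to turn this overlap into a training signal; whether R-IoU takes that form is not stated in the abstract.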

CLC Number:

  • TP391