SR-Det: Towards Robust Detection of Slender and Rotated Objects in Industrial Scene

doi:10.12052/gdutxb.230027

Abstract

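Summary: Object detection is widely used in industry, for example in defect detection. However, objects with arbitrary rotation and large aspect ratios remain difficult to detect, for three reasons. First, horizontal anchor-box methods struggle to localize such objects accurately. Second, convolutional neural networks (CNNs) perform poorly when extracting features from them. Third, ordinary loss functions are insensitive to slender objects. To address these problems, this paper studies the SR-Det (Slender and Rotated Detector) model, which consists of three parts. The first is the Rotated Region Calibration (RRC) module, which takes horizontal proposals of different sizes and aspect ratios as input and outputs the corresponding rotated proposals. The second is the Rotated Angle Proposal Align (RAP-Align) module, which safeguards the quality of the extracted feature information. The third is the R-IoU (Rotated Intersection over Union) function, built on the Intersection over Union (IoU) strategy, which guides the model to maximize the overlap area between the predicted box and the ground-truth (GT) box. Experiments show that the proposed method achieves the best results on the metal can dataset and the curtain wall dataset, which demonstrates its effectiveness.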

Keywords: object detection, loss function, rotation invariance

Abstract: Although object detection is widely used in industrial scenes, it still struggles with crack defects that are slender and rotated. First, traditional horizontal anchor methods find it hard to locate such objects precisely. Second, CNNs (Convolutional Neural Networks) perform poorly when extracting features from rotated objects. Third, ordinary loss functions are insensitive to slender objects. To address these problems, this paper proposes a Slender and Rotated Detector (SR-Det) for robust detection of slender and rotated objects. Specifically, the Rotated Region Calibration (RRC) module is designed to take horizontal proposals with different scales and aspect ratios as input and to output the corresponding rotated proposals. Then, the Rotated Angle Proposal Align (RAP-Align) module is presented to guarantee the quality of the extracted feature information. Finally, the Rotated Intersection over Union (R-IoU) function, based on the Intersection over Union (IoU) strategy, is proposed to guide the model to maximize the overlap area between the predicted box and the ground-truth box. Experiments on the metal can and curtain wall datasets show that the proposed method achieves state-of-the-art performance, demonstrating its effectiveness.

Key words: object detection, loss function, rotation invariance
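
The abstract does not give the R-IoU formulation, so the following is only a generic sketch of the quantity it builds on: the IoU of two rotated rectangles, computed by clipping one convex polygon against the other (Sutherland-Hodgman) and applying the shoelace area formula. The box parameterization `(cx, cy, w, h, angle)` with the angle in radians, and all function names, are assumptions made for illustration and are not taken from the paper.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle):
    """Corners of a rotated rectangle, counter-clockwise (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in local]


def polygon_area(points):
    """Shoelace formula; returns 0.0 for degenerate polygons."""
    if len(points) < 3:
        return 0.0
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex CCW `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break
        ax, ay = clipper[i]
        bx, by = clipper[(i + 1) % len(clipper)]

        def side(p):
            # > 0 when p lies to the left of edge a->b, i.e. inside the clipper
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        current, output = output, []
        for j, p in enumerate(current):
            q = current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 > sq) or (sp < 0 < sq):
                # Edge p->q crosses the clip line: add the crossing point
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def rotated_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, w, h, angle)."""
    poly_a = rotated_box_corners(*box_a)
    poly_b = rotated_box_corners(*box_b)
    inter = polygon_area(clip_convex(poly_a, poly_b))
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    tilt = math.radians(5)
    # The same 5-degree angular error applied to a slender and a square box
    print(rotated_iou((0, 0, 100, 4, 0), (0, 0, 100, 4, tilt)))
    print(rotated_iou((0, 0, 20, 20, 0), (0, 0, 20, 20, tilt)))
```

Running the two calls at the bottom shows why slender targets are demanding: the same small angular error costs a long, thin box far more overlap than a square one, so a loss driven by rotated overlap reacts strongly to orientation errors on slender objects.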

CLC number: TP391

He Sen-bai, Cheng Liang-lun, Huang Guo-heng, Wu Zhi-chao, Ye Song-hang. SR-Det: Towards Robust Detection of Slender and Rotated Objects in Industrial Scene[J]. Journal of Guangdong University of Technology, 2024, 41(02): 93-100.

