Journal of Guangdong University of Technology, 2024, Vol. 41, Issue (02): 93-100. doi: 10.12052/gdutxb.230027
• Computer Science and Technology •
何森柏, 程良伦, 黄国恒, 伍志超, 叶颂航
He Sen-bai, Cheng Liang-lun, Huang Guo-heng, Wu Zhi-chao, Ye Song-hang
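Abstract: Object detection is widely used in industry, for example in defect detection. However, arbitrarily rotated objects and objects with large aspect ratios still cause problems during detection. First, horizontal anchor boxes struggle to localize such objects accurately. Second, convolutional neural networks (CNNs) perform poorly when extracting their features. Third, ordinary loss functions are insensitive to slender objects. To address these problems, this paper studies the SR-Det (Slender and Rotated Detector) model, which consists of three parts. The first is the Rotated Region Calibration (RRC) module, which takes horizontal proposals of different sizes and aspect ratios as input and outputs the corresponding rotated proposals. The second is the Rotated Angle Proposal Align (RAP-Align) module, which ensures the quality of the feature information. The third is the R-IoU (Rotated Intersection over Union) function, based on the Intersection over Union (IoU) strategy, which guides the model to maximize the overlap area between the predicted box and the ground-truth (GT) box. Experiments show that the proposed method achieves the best results on the metal can dataset and the curtain wall dataset, demonstrating its effectiveness.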
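The overlap that an R-IoU-style objective tries to maximize can be illustrated with a small sketch. The abstract does not give the exact R-IoU formula, so what follows is only the standard geometric IoU of two rotated rectangles, computed by convex polygon clipping (Sutherland-Hodgman). It is not the authors' implementation. The box format `(cx, cy, w, h, angle_deg)` and the function names `rect_corners`, `clip`, and `rotated_iou` are assumptions made for this illustration.

```python
import math

def rect_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated rectangle, in counter-clockwise order."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * cos_a - y * sin_a, cy + x * sin_a + y * cos_a)
            for x, y in half]

def polygon_area(poly):
    """Shoelace formula; positive for counter-clockwise vertex order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0

def clip(subject, clipper):
    """Sutherland-Hodgman: clip `subject` by a convex CCW polygon `clipper`."""
    output = subject
    for i in range(len(clipper)):
        if not output:
            break  # nothing left to clip: the polygons do not overlap
        ax, ay = clipper[i]
        bx, by = clipper[(i + 1) % len(clipper)]

        def side(p):
            # > 0 when p lies to the left of edge a->b (inside for CCW order)
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # where segment p->q crosses the edge line
                output.append((p[0] + t * (q[0] - p[0]),
                               p[1] + t * (q[1] - p[1])))
    return output

def rotated_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, w, h, angle_deg)."""
    inter_poly = clip(rect_corners(*box_a), rect_corners(*box_b))
    inter = abs(polygon_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union

# A slender 4x2 box against the same box rotated by 90 degrees:
# the overlap is a 2x2 square, so IoU = 4 / (8 + 8 - 4) = 1/3.
print(rotated_iou((0, 0, 4, 2, 0), (0, 0, 4, 2, 90)))
```

Under the same assumptions, a loss such as `1 - rotated_iou(pred, gt)` decreases as the overlap grows. Whether SR-Det uses this form or a differentiable variant is not stated in the abstract.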