Pose-based Oriented Object Detection Network for Aerial Images

Journal of Guangdong University of Technology, 2021, Vol. 38, Issue (05): 40-47. doi: 10.12052/gdutxb.200175

Zhang Guo-sheng, Feng Guang, Li Dong

School of Automation, Guangdong University of Technology, Guangzhou 510006, China

Received: 2020-12-18  Published: 2021-09-10  Released online: 2021-07-13
Corresponding author: Li Dong (1983–), male, associate professor, Ph.D.; research interests: pattern recognition, machine learning, face recognition, and machine vision. E-mail: dong.li@gdut.edu.cn
About the author: Zhang Guo-sheng (1995–), male, master's student; research interests: machine learning, deep learning, and image processing
Funding:
National Natural Science Foundation of China (61503084)

Abstract

Abstract: Because aerial images are captured from complex and variable viewpoints, objects appear crowded, clustered, and arbitrarily rotated, so the horizontal bounding boxes used in traditional object detection fit their geometric outlines and positions poorly. This paper proposes a one-stage, pose-based oriented object detection network. The network represents objects at different rotation angles as different poses, and detects an oriented object by locating its center and regressing the offsets from the center to its four vertices. An adaptive feature pyramid network with learnable weights is also used to automatically select the more discriminative features from multi-scale features. To handle the high resolution of aerial images, a selective sampling strategy is proposed that improves training efficiency and alleviates the imbalance between positive and negative samples. On the oriented object detection task of the DOTA remote sensing dataset, the method achieves a mean Average Precision (mAP) of 74.85%, outperforming existing one-stage methods and even some two-stage methods. Qualitative and quantitative comparative experiments show that the proposed network is simple in design and delivers competitive detection performance.

Key words: aerial image, object detection, pose, rotation
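
The pose representation described in the abstract (an object center plus four vertex offsets) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the vertex ordering, and the choice of the vertex mean as the center are assumptions made for illustration only. The sketch shows that a rotated box can be encoded as a center and four offsets and then recovered exactly.

```python
import math


def rotated_box_vertices(cx, cy, w, h, angle_deg):
    """Return the four vertices of a w-by-h box centered at (cx, cy),
    rotated by angle_deg about its center (hypothetical helper)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Corners of the axis-aligned box, relative to its center.
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * cos_a - y * sin_a, cy + x * sin_a + y * cos_a)
            for x, y in corners]


def encode_pose(vertices):
    """Encode a quadrilateral as (center, four offsets from the center).
    Assumption: the center is taken as the mean of the four vertices."""
    cx = sum(x for x, _ in vertices) / 4.0
    cy = sum(y for _, y in vertices) / 4.0
    offsets = [(x - cx, y - cy) for x, y in vertices]
    return (cx, cy), offsets


def decode_pose(center, offsets):
    """Recover the four vertices from a center and four offsets."""
    cx, cy = center
    return [(cx + dx, cy + dy) for dx, dy in offsets]


# Round trip: a 40x20 box centered at (100, 50), rotated by 30 degrees.
box = rotated_box_vertices(100.0, 50.0, 40.0, 20.0, 30.0)
center, offsets = encode_pose(box)
restored = decode_pose(center, offsets)
```

In a detector of this kind, the center would typically come from a predicted heatmap peak and the eight offset values from a regression head. Here both are computed from a known box only to show the encoding.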

CLC number: TP391.4

Zhang Guo-sheng, Feng Guang, Li Dong. Pose-based Oriented Object Detection Network for Aerial Images[J]. Journal of Guangdong University of Technology, 2021, 38(05): 40-47.
