Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (03): 71-80. DOI: 10.12052/gdutxb.230044

• Computer Science and Technology •

Small Target Detection Algorithm for Lightweight UAV Aerial Photography Based on YOLOv5

Li Xue-sen1, Tan Bei-hai2, Yu Rong1, Xue Xian-bin1

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2023-03-04 Online: 2024-05-25 Published: 2024-06-14
  • Corresponding author: Tan Bei-hai (b. 1980), male, associate professor, Ph.D., master's supervisor; his research interests include AI algorithms and chip design, artificial intelligence, and deep learning. E-mail: bhtan@gdut.edu.cn
  • About the author: Li Xue-sen (b. 1997), male, master's student; his main research interest is deep learning. E-mail: 18325945913@163.com
  • Funding:
    National Natural Science Foundation of China (61971148); National Natural Science Foundation of China (U22A2054); Key Project of the Joint Fund of the Guangdong Basic and Applied Basic Research Foundation (2019B1515120036); Key Project of the Guangxi Natural Science Foundation (2018GXNSFDA281013)

Abstract: To address the small feature size of targets, complex backgrounds, and dense object distributions in images captured from the aerial perspective of unmanned aerial vehicles (UAVs), a lightweight small target detection algorithm based on YOLOv5, named GA-YOLO, is proposed. The algorithm improves the Mosaic data augmentation method and the overall network structure, adds a tiny-object detection head, and designs both a lightweight global attention module and a parallel spatial-channel attention module, strengthening the network's global feature extraction and the competition and cooperation among convolutional channels during training. With YOLOv5s version 4.0 as the baseline, experiments on the public UAV aerial photography dataset VisDrone2019-DET show that, compared with the original model, the improved model reduces the number of parameters by 48% and the computational cost by 26%, while improving mAP@0.5 by 4.9 percentage points and mAP@0.5:0.95 by 3.3 percentage points, effectively enhancing the detection of dense small targets from the UAV aerial perspective.
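
The abstract names a parallel spatial-channel attention module but does not give its formulation. As a minimal sketch only, the following PyTorch snippet shows one plausible reading: CBAM-style channel and spatial attention branches (Woo et al., ECCV 2018) applied in parallel over the same feature map rather than in sequence. The class name, the additive fusion, and all hyperparameters here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the paper's exact module is not specified in the
# abstract. Channel and spatial attention branches run in parallel and both
# reweight the same input feature map.
import torch
import torch.nn as nn

class ParallelSpatialChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        # Channel branch: squeeze spatial dims, then a bottleneck MLP produces
        # one weight per channel.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial branch: pool over channels, then a large-kernel conv produces
        # one weight per pixel.
        self.spatial_conv = nn.Conv2d(
            2, 1, spatial_kernel, padding=spatial_kernel // 2, bias=False
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention weights: (B, C, 1, 1)
        ca = self.sigmoid(self.channel_mlp(x))
        # Spatial attention input: channel-wise mean and max -> (B, 2, H, W)
        sa_in = torch.cat(
            [x.mean(dim=1, keepdim=True), x.max(dim=1, keepdim=True).values], dim=1
        )
        sa = self.sigmoid(self.spatial_conv(sa_in))  # (B, 1, H, W)
        # Parallel fusion: both branches modulate the original features.
        return x * ca + x * sa

# Usage example on a YOLOv5s-scale feature map:
# attn = ParallelSpatialChannelAttention(256)
# y = attn(torch.randn(1, 256, 40, 40))  # same shape out: (1, 256, 40, 40)
```

Running the two branches in parallel, rather than chaining them as CBAM does, lets the channel and spatial weights compete and cooperate on the unmodified input, which is consistent with the abstract's stated goal of balancing competition and cooperation among convolutional channels.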

Key words: UAV aerial photography, YOLOv5s, small target detection, data augmentation, attention mechanism

CLC number: TP391.41