Journal of Guangdong University of Technology ›› 2023, Vol. 40 ›› Issue (04): 67-76. doi: 10.12052/gdutxb.220139

• Computer Science and Technology •

Helmet Wearing Detection Algorithm Integrating Transfer Learning and YOLOv5

Cao Zhi-xiong1, Wu Xiao-ling1, Luo Xiao-wei2, Ling Jie1

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
    2. Department of Architecture and Civil Engineering, City University of Hong Kong, Hong Kong 999077, China
  • Received: 2022-09-13 Online: 2023-07-25 Published: 2023-08-02
  • Corresponding author: Wu Xiao-ling (1979-), female, associate professor, Ph.D.; research interests: Internet of Things and network security. E-mail: xl.wu@gdut.edu.cn
  • About the author: Cao Zhi-xiong (1997-), male, master's degree candidate; research interests: smart cities, image processing and pattern recognition
  • Funding: Open Project of the Key Laboratory of the Ministry of Education (2021-1EQBD-02); Guangdong International Science and Technology Cooperation Program (2019A050513010)


Abstract: To address the missed detections and low accuracy of existing helmet wearing detection algorithms on small and crowded targets, this paper proposes a helmet wearing detection method based on an improved YOLOv5 and transfer learning. First, because the default anchor boxes are poorly suited to the task, the K-means algorithm is used to cluster anchor box sizes that better fit the detection task. Then, a spatial-channel mixed attention module is introduced into the later stages of the feature extraction network, so that the model strengthens the weights learned for targets and suppresses those of irrelevant background. Further, the judgment metric of the non-maximum suppression (NMS) algorithm in the YOLOv5 post-processing stage is improved, reducing the erroneous deletion and omission of prediction boxes. The network is then trained with a transfer learning strategy, which compensates for the scarcity of existing data sets and improves the generalization ability of the model. Finally, a cascaded helmet wearing judgment framework for visual sensor networks is proposed. Experimental results show that the improved model achieves an average precision (IoU = 0.5) of 93.6% on the helmet wearing data set, 5% higher than the original model, and outperforms other algorithms of its kind, improving the accuracy of helmet wearing detection in construction scenarios.
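To make the anchor-clustering step concrete, the sketch below shows YOLO-style K-means over ground-truth box sizes, using 1 − IoU as the distance measure, as has been conventional since YOLOv2. The choice of k = 9 (three anchors per detection scale, as in stock YOLOv5) and the input format are illustrative assumptions, not settings reported by the paper.

```python
# Illustrative sketch of YOLO-style anchor clustering with K-means.
# k = 9 and the (width, height) input format are assumptions for
# illustration, not values taken from the paper.
import numpy as np

def wh_iou(boxes, anchors):
    # IoU between (w, h) pairs, treating all boxes as corner-aligned
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=300, seed=0):
    # boxes: (N, 2) array of ground-truth (width, height) values,
    # already scaled to the training input resolution
    rng = np.random.default_rng(seed)
    boxes = boxes.astype(float)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    last = None
    for _ in range(iters):
        # assign each box to the anchor with the highest IoU
        # (equivalently, the smallest distance 1 - IoU)
        assign = wh_iou(boxes, anchors).argmax(axis=1)
        if last is not None and np.array_equal(assign, last):
            break  # assignments stable: converged
        for j in range(k):
            members = boxes[assign == j]
            if len(members):
                anchors[j] = members.mean(axis=0)  # move centroid
        last = assign
    return anchors[np.argsort(anchors.prod(axis=1))]  # sort small to large
```

The nine resulting (width, height) pairs would then replace the default anchors in the model configuration before training.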

Key words: helmet wearing detection, YOLOv5, transfer learning, attention module, visual sensor network
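The spatial-channel mixed attention module described in the abstract matches the general shape of CBAM (Woo et al., ECCV 2018), which the paper cites. Below is a minimal CBAM-style block as a plausible sketch; the reduction ratio r = 16 and the 7×7 spatial kernel follow the CBAM paper, and the authors' exact variant may differ.

```python
# A minimal CBAM-style spatial-channel attention block, sketched after
# Woo et al. (ECCV 2018); r = 16 and the 7x7 kernel are CBAM defaults,
# not settings confirmed by the paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, r=16):
        super().__init__()
        # shared MLP applied to global average- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // r, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // r, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=2, keepdim=True).amax(dim=3, keepdim=True))
        return torch.sigmoid(avg + mx)  # (B, C, 1, 1) channel weights

class SpatialAttention(nn.Module):
    def __init__(self, kernel=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel, padding=kernel // 2, bias=False)

    def forward(self, x):
        # stack channel-wise mean and max maps, convolve to one mask
        desc = torch.cat([x.mean(dim=1, keepdim=True),
                          x.amax(dim=1, keepdim=True)], dim=1)
        return torch.sigmoid(self.conv(desc))  # (B, 1, H, W) spatial weights

class CBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)      # reweight channels first,
        return x * self.sa(x)   # then reweight spatial positions
```

Applying channel attention before spatial attention follows the ordering the CBAM paper found best in its ablations.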

CLC number: TP391.41
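The NMS change is described only as an improvement to the algorithm's judgment metric. Since the authors cite Zheng et al.'s DIoU work, one reasonable reading is DIoU-NMS, in which overlap is penalized by normalized center-point distance; the sketch below implements that variant under this assumption, with a hypothetical suppression threshold of 0.5.

```python
# A DIoU-NMS sketch, assuming the paper's modified metric resembles
# Zheng et al.'s DIoU; the 0.5 threshold is hypothetical.
import numpy as np

def diou(box, boxes):
    """DIoU between one box and an array of boxes, format (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = (box[2] - box[0]) * (box[3] - box[1]) + \
            (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) - inter
    iou = inter / union
    # squared center distance, normalized by the enclosing box diagonal
    cdist = ((box[0] + box[2]) - (boxes[:, 0] + boxes[:, 2])) ** 2 / 4 + \
            ((box[1] + box[3]) - (boxes[:, 1] + boxes[:, 3])) ** 2 / 4
    ex1 = np.minimum(box[0], boxes[:, 0]); ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2]); ey2 = np.maximum(box[3], boxes[:, 3])
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - cdist / diag

def diou_nms(boxes, scores, thresh=0.5):
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # suppress only boxes whose DIoU with the kept box is high
        order = rest[~(diou(boxes[i], boxes[rest]) > thresh)]
    return keep
```

Because DIoU subtracts the center-distance term, two overlapping boxes whose centers lie far apart score lower than their plain IoU, so adjacent workers' heads in crowded scenes are less likely to be merged into a single detection.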
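Finally, the transfer learning strategy is described only at the level of "pre-train, then fine-tune." A generic sketch of that pattern is shown below: weights pretrained on a large source data set are loaded, the feature-extraction backbone is frozen, and only the remaining layers are updated on the helmet data set. The "backbone." parameter-name prefix is a placeholder; actual YOLOv5 parameter names differ by version.

```python
# Generic transfer-learning sketch: load pretrained weights, freeze the
# backbone, fine-tune the rest. "backbone." is a placeholder prefix, not
# a real YOLOv5 parameter name.
import torch

def prepare_for_finetune(model, ckpt_path, freeze_prefix="backbone."):
    state = torch.load(ckpt_path, map_location="cpu")
    # strict=False tolerates head layers whose shapes changed
    # (e.g. a different number of classes than the source task)
    model.load_state_dict(state, strict=False)
    for name, p in model.named_parameters():
        if name.startswith(freeze_prefix):
            p.requires_grad = False  # keep pretrained features fixed
    # the optimizer only sees the trainable (unfrozen) parameters;
    # momentum 0.937 is YOLOv5's default training hyperparameter
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=0.01, momentum=0.937)
```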