Journal of Guangdong University of Technology, 2021, Vol. 38, Issue (04): 52-58. doi: 10.12052/gdutxb.200147

A Research on Deep Learning Object Detection Algorithm Based on Feature Fusion

Huang Jian-hang, Wang Zhen-you

  1. School of Applied Mathematics, Guangdong University of Technology, Guangzhou 510520, China
  • Received: 2020-11-03  Online: 2021-07-10  Published: 2021-05-25
  • Corresponding author: Wang Zhen-you (1979-), male, professor; his main research interests include optimization computation and medical statistical analysis. E-mail: zywang@gdut.edu.cn
  • About the author: Huang Jian-hang (1996-), male, master's student; his main research interest is object detection.
  • Supported by: Guangdong Basic and Applied Basic Research Foundation (2020B1515310001)

Abstract: A study of the feature hierarchy in convolutional neural networks shows that high-level feature maps have low resolution but strong semantic information, while low-level feature maps have high resolution but relatively weak semantic information. To address this, an object detection algorithm based on secondary feature fusion is proposed. Building on the Feature Pyramid Network (FPN), the algorithm reuses the transitional features and performs a second round of feature fusion, so that rich low-level information is supplemented to the high-level features. On the COCO2014 dataset the algorithm reaches an average precision (AP) of 35.3%, with AP50 of 57.5% and AP75 of 36.6%; compared with the method without feature fusion and the method with conventional feature fusion, these scores are higher by 2.4%, 3.7% and 2.4% respectively. The algorithm helps reduce missed detections and benefits the detection of small objects.

Key words: feature fusion, object detection, convolutional neural network, feature reuse


CLC number: TP242.6+2
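
To make the fusion scheme described in the abstract concrete, the sketch below illustrates one plausible reading of it: a standard FPN top-down pass builds the pyramid, and a second pass then reuses the transitional (lateral) features so that low-level detail also reaches the high-level maps. The sketch is written in PyTorch purely for illustration; the framework, the 256-channel width, the stride-2 bottom-up convolutions and the element-wise additions in the second pass are assumptions of this sketch, not details taken from the paper (the abstract does not specify how the secondary fusion is implemented).

import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoStageFeatureFusion(nn.Module):
    """FPN-style top-down fusion followed by a second pass that reuses the lateral features."""

    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 lateral convolutions; their outputs are the "transitional" features that get reused.
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        # 3x3 smoothing convolutions applied after the top-down fusion, as in FPN.
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, padding=1) for _ in in_channels)
        # Stride-2 convolutions used to carry low-level information upward in the second pass.
        self.down = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1) for _ in in_channels[:-1])

    def forward(self, feats):  # feats = [C2, C3, C4, C5], ordered from low level to high level
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]

        # First fusion (standard FPN top-down path): upsample the coarser map and add it in.
        tops = [laterals[-1]]
        for lat in reversed(laterals[:-1]):
            tops.insert(0, lat + F.interpolate(tops[0], size=lat.shape[-2:], mode="nearest"))
        pyramid = [conv(t) for conv, t in zip(self.smooth, tops)]

        # Second fusion: reuse the transitional features and push low-level detail bottom-up,
        # so that the high-level maps are supplemented with low-level information.
        outs = [pyramid[0]]
        for i in range(1, len(pyramid)):
            outs.append(pyramid[i] + laterals[i] + self.down[i - 1](outs[-1]))
        return outs  # [P2, P3, P4, P5]


if __name__ == "__main__":
    # Dummy ResNet-like feature maps for a 256x256 input (strides 4, 8, 16 and 32).
    feats = [torch.randn(1, c, s, s) for c, s in zip((256, 512, 1024, 2048), (64, 32, 16, 8))]
    for p in TwoStageFeatureFusion()(feats):
        print(p.shape)  # four 256-channel maps, one per pyramid level

Running the example prints four 256-channel feature maps, one per pyramid level. The element-wise addition used in the second pass is only one plausible way to combine the reused features; concatenation followed by a 1x1 convolution would be an equally reasonable reading of the abstract.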