Journal of Guangdong University of Technology ›› 2022, Vol. 39 ›› Issue (01): 56-62. DOI: 10.12052/gdutxb.200176

• Comprehensive Research •

A Residual Neural Network with Voting for 3D Object Detection in Point Clouds

Yang Ji-sheng, Zhang Yun, Li Dong

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2020-12-30; Published: 2022-01-20
  • Corresponding author: Li Dong (1983-), male, associate professor, Ph.D., master's supervisor; his research interests include pattern recognition, machine learning, face recognition, and machine vision. E-mail: dong.li@gdut.edu.cn
  • About the author: Yang Ji-sheng (1995-), male, master's student; his research interests include point cloud processing and pattern recognition
  • Funding:
    National Natural Science Foundation of China (61503084); Natural Science Foundation of Guangdong Province (2021A1515011867)

Abstract: High-precision 3D object detection is a key technology for object perception and is of great significance to the deployment of applications such as autonomous driving and robot control. To improve the accuracy of 3D object detection, the VoteNet algorithm is improved and extended into ResVoteNet, an end-to-end, high-precision 3D point cloud object detection network built on residual networks. Specifically, a residual backbone suited to point cloud data is designed: a residual feature extraction module and a residual up-sampling module are proposed and integrated into the VoteNet framework. The residual structure strengthens the network's ability to extract and learn features from point cloud data and improves the robustness of the model. On the publicly available large-scale point cloud datasets ScanNet and SUN RGB-D, the proposed method achieves a mean average precision (mAP) of 61.1% and 59.9%, respectively, surpassing other state-of-the-art algorithms.

Key words: 3D point cloud, object detection, residual network
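
The abstract describes adding residual connections to the point-feature layers of a VoteNet-style backbone. As a rough illustration only, and not the authors' implementation, the PyTorch sketch below shows one way such a residual per-point feature block could be written; the class name ResidualPointBlock, the channel sizes, and the use of 1x1 convolutions as a shared MLP are assumptions for the example.

import torch
import torch.nn as nn


class ResidualPointBlock(nn.Module):
    """Shared MLP over per-point features with an identity/projection shortcut.

    Tensors follow the common PointNet++/VoteNet layout: (batch, channels, num_points).
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # Two 1x1 convolutions act as a shared MLP applied independently to every point.
        self.mlp = nn.Sequential(
            nn.Conv1d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm1d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(out_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm1d(out_channels),
        )
        # Project the shortcut with a 1x1 convolution only when the channel counts differ.
        self.shortcut = (
            nn.Identity()
            if in_channels == out_channels
            else nn.Conv1d(in_channels, out_channels, kernel_size=1, bias=False)
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: transformed features plus the (projected) input.
        return self.relu(self.mlp(x) + self.shortcut(x))


if __name__ == "__main__":
    feats = torch.randn(2, 128, 1024)   # 2 scenes, 128-dim features, 1024 seed points
    block = ResidualPointBlock(128, 256)
    print(block(feats).shape)           # torch.Size([2, 256, 1024])

Because the block keeps the (batch, channels, points) layout, it could in principle replace a plain shared-MLP stack in a voting pipeline without changing the surrounding interfaces; the actual ResVoteNet modules may differ.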

CLC number: TP391.4

References
[1] WANG X. Google discloses details of self-driving car accidents: future transportation is still a dream [J]. East China Science & Technology, 2016, 4(1): 10-15. (in Chinese)
[2] YAN Y, MAO Y, LI B. SECOND: sparsely embedded convolutional detection [J]. Sensors, 2018, 18(10): 3337.
[3] LANG A H, VORA S, CAESAR H, et al. Pointpillars: fast encoders for object detection from point clouds[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 12697-12705.
[4] YANG Z X, PENG L C, LIU D N, et al. Development of point cloud processing system based on PCL and Qt [J]. Journal of Guangdong University of Technology, 2017, 34(6): 61-67. (in Chinese)
[5] ZHOU Y, TUZEL O. Voxelnet: end-to-end learning for point cloud based 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4490-4499.
[6] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[7] QI C R, LIU W, WU C, et al. Frustum pointnets for 3D object detection from RGB-D data[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 918-927.
[8] QI C R, SU H, MO K, et al. Pointnet: deep learning on point sets for 3D classification and segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 652-660.
[9] QI C R, LITANY O, HE K, et al. Deep hough voting for 3D object detection in point clouds[C]//Proceedings of the IEEE International Conference on Computer Vision. Seoul: IEEE, 2019: 9277-9286.
[10] SHI S, GUO C, JIANG L, et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10529-10538.
[11] LIU Z, ZHAO X, HUANG T, et al. TANet: robust 3D object detection from point clouds with triple attention[C]//Proceedings of the AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 11677-11684.
[12] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[13] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3431-3440.
[14] DAI A, CHANG A X, SAVVA M, et al. Scannet: richly-annotated 3D reconstructions of indoor scenes[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5828-5839.
[15] SONG S, LICHTENBERG S P, XIAO J. Sun RGB-D: a RGB-D scene understanding benchmark suite[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 567-576.
[16] SONG S, XIAO J. Deep sliding shapes for amodal 3D object detection in RGB-D images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 808-816.
[17] HOU J, DAI A, NIESSNER M. 3D-SIS: 3D semantic instance segmentation of RGB-D scans[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4421-4430.
[18] LAHOUD J, GHANEM B. 2D-driven 3D object detection in RGB-D images[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 4622-4630.
[19] REN Z, SUDDERTH E B. Three-dimensional object detection and layout prediction using clouds of oriented gradients[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1525-1533.
[20] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2961-2969.
[21] YI L, ZHAO W, WANG H, et al. GSPN: generative shape proposal network for 3D instance segmentation in point cloud[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 3947-3956.
[22] QI C R, YI L, SU H, et al. Pointnet++: deep hierarchical feature learning on point sets in a metric space [J]. Advances in Neural Information Processing Systems, 2017, 30: 5099-5108.