Journal of Guangdong University of Technology


High-resolution Feature Network for 3D Point Cloud Segmentation and Classification

Zhu Jun-jie, Liu Dong-feng   

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-06-21 Online: 2025-01-06 Published: 2025-01-06
  • Corresponding author: Liu Dong-feng (b. 1969), male, professor, Ph.D.; his research interests include virtual reality, machine learning, and computer vision. E-mail: liudf@gdut.edu.cn
  • About the first author: Zhu Jun-jie (b. 2000), male, master's student; his research interest is point cloud processing. E-mail: 2112203082@mail2.gdut.edu.cn
  • Funding:
    Supported by the Natural Science Foundation of Guangdong Province (2024A1515012058)

High-resolution Feature Network for 3D Point Cloud Segmentation and Classification

Zhu Jun-jie, Liu Dong-feng   

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received:2024-06-21 Online:2025-01-06 Published:2025-01-06

Abstract: Multi-scale features are essential for dense prediction tasks on point clouds. Current 3D point cloud processing techniques mainly rely on encoder-decoder frameworks that extract and fuse multi-scale features through a backbone network. However, these methods usually adopt a delayed fusion strategy, which leads to insufficient feature integration. To address this problem, this paper proposes HRFN3D (High-resolution Feature Network for 3D Point Cloud), a high-resolution feature network designed for point cloud classification and segmentation tasks. Through a novel relation learning module, HRFN3D fuses features at an early stage, promoting interaction between low-resolution high-semantic points and high-resolution low-semantic points, so that the high-resolution points already carry high-level semantic information early on, which benefits subsequent feature learning. In the later stage, global feature vectors generated by different pooling strategies are combined and concatenated with the original point features, preserving detail while strengthening the representativeness of the global features. Experiments show that HRFN3D improves the class mean and instance mean Intersection over Union (IoU) on the ShapeNetPart dataset by 2.2 and 0.9 percentage points respectively, achieving the best instance mean IoU of 86.3%; on the ModelNet40 dataset it achieves the highest class mean accuracy of 91.5% with only 4.3 M parameters. These results validate the effectiveness of HRFN3D for multi-scale feature processing.

Keywords: multi-scale features, 3D point cloud processing, high resolution, feature fusion, early stage

Abstract: Multi-scale features are critical in dense prediction tasks within the point cloud domain. Existing 3D point cloud processing techniques predominantly rely on encoder-decoder frameworks, which extract and integrate multi-scale features via a backbone network. However, these methods often employ delayed fusion strategies, resulting in insufficient feature integration. To address this issue, this paper introduces a novel high-resolution feature network for 3D point clouds, named HRFN3D, designed specifically for point cloud classification and segmentation tasks. HRFN3D employs a relation learning module to perform feature fusion at an early stage, facilitating interactions between low-resolution high-semantic points and high-resolution low-semantic points. This early fusion ensures that high-resolution points retain semantic information from the outset, benefiting subsequent feature learning. In the later stage, global feature vectors are generated by combining different pooling strategies and concatenated with the original point features, preserving detail while enhancing the representativeness of the global features. Experimental results show that, on the ShapeNetPart dataset, HRFN3D improves the class mean and instance mean Intersection over Union by 2.2 and 0.9 percentage points, respectively, achieving the best instance mean IoU of 86.3%. On the ModelNet40 dataset, the proposed method achieves the highest class mean accuracy of 91.5% with 4.3 M parameters. These results validate the effectiveness of HRFN3D in multi-scale feature processing.
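The abstract describes the relation learning module only at a high level. As an illustration, cross-resolution interaction of this kind is often realized with scaled dot-product cross-attention; the sketch below is a hedged assumption, not the authors' implementation (the function name `early_relation_fusion` and the plain attention form are hypothetical):

```python
# Hypothetical sketch of early cross-resolution fusion, NOT the paper's code:
# every high-resolution (low-semantic) point attends over all low-resolution
# (high-semantic) points, so semantic information reaches the full-resolution
# set before further feature learning.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def early_relation_fusion(hi_feats, lo_feats):
    """hi_feats: (N, C) high-resolution, low-semantic point features.
    lo_feats: (M, C) low-resolution, high-semantic point features.
    Returns (N, C): high-resolution features enriched with semantics."""
    c = hi_feats.shape[1]
    relations = softmax(hi_feats @ lo_feats.T / np.sqrt(c), axis=1)  # (N, M)
    return hi_feats + relations @ lo_feats  # residual add keeps original detail

rng = np.random.default_rng(0)
hi = rng.standard_normal((2048, 32))   # assumed shapes, for illustration only
lo = rng.standard_normal((256, 32))
out = early_relation_fusion(hi, lo)
print(out.shape)  # (2048, 32)
```

Because the output keeps the full resolution N, the enriched features can feed directly into subsequent per-point layers, which is the point of performing the fusion early rather than in the decoder.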

Key words: multi-scale features, 3D point cloud processing, high-resolution, feature fusion, early stage
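The late-stage global aggregation described in the abstract (combining different pooling strategies into a global vector and concatenating it with the original point features) can be sketched as follows; the shapes and the helper name `fuse_global_features` are illustrative assumptions, not the paper's code:

```python
# Hedged sketch of late-stage global feature fusion, NOT the authors' code:
# combine max- and average-pooling over all points into one global descriptor,
# then concatenate it back onto every per-point feature vector.
import numpy as np

def fuse_global_features(point_feats):
    """point_feats: (N, C) per-point features -> (N, 3C) fused features."""
    g_max = point_feats.max(axis=0)      # (C,) max-pooled global vector
    g_avg = point_feats.mean(axis=0)     # (C,) average-pooled global vector
    g = np.concatenate([g_max, g_avg])   # (2C,) combined global descriptor
    g_rep = np.broadcast_to(g, (point_feats.shape[0], g.size))
    # keep per-point detail alongside the shared global context
    return np.concatenate([point_feats, g_rep], axis=1)

feats = np.random.rand(1024, 64).astype(np.float32)  # assumed shape
fused = fuse_global_features(feats)
print(fused.shape)  # (1024, 192)
```

Max-pooling captures the most salient activation per channel while average-pooling summarizes the overall distribution; concatenating both back onto each point preserves detail while enhancing the representativeness of the global features, as the abstract describes.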

CLC number: 

  • TP391.4