Liu Cheng-hui, Li Guang-ping
Abstract: Transformers exploit their strength in capturing long-range dependencies to model relational interactions among distant points in a point cloud, but in doing so they tend to overlook important local structural details and rely on heavy computation to reach high performance, increasing the computational burden. To alleviate this problem, this paper draws on the idea of the separable vision Transformer and proposes a separable Transformer method for point cloud classification, called Sep-point. Through depthwise-separable self-attention, Sep-point performs local-global relational interactions sequentially within and between point cloud groups. A new position token embedding and a grouped self-attention scheme compute inter-group attention relations at negligible cost and establish long-range information interaction across multiple regions, extracting local-global features while greatly reducing the computational burden. Experimental results show that the proposed Sep-point improves classification accuracy by 0.2% over the existing PCT (Point Cloud Transformer) on the ModelNet40 dataset and by 6.3% on the real-world ScanObjectNN dataset, while the parameter count and FLOPs decrease by 0.72M and 0.18G respectively, fully validating the effectiveness of the proposed method.
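The paper's layer design is not spelled out in the abstract, but the general mechanism it names (depthwise-separable self-attention: attention within each local group, then attention between group-level tokens) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not Sep-point's actual implementation: the identity Q/K/V projections, the mean-pooled group tokens, and all function names are illustrative choices of ours.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention; projections omitted for brevity.
    d = q.shape[-1]
    return softmax(q @ np.swapaxes(k, -1, -2) / np.sqrt(d)) @ v

def sep_attention(points, n_groups):
    """Depthwise-separable self-attention sketch:
    (1) depthwise: self-attention inside each local group;
    (2) one summary token per group (here: mean pooling);
    (3) pointwise: attention across the group tokens, broadcast back."""
    n, d = points.shape
    s = n // n_groups
    groups = points[: n_groups * s].reshape(n_groups, s, d)
    local = attention(groups, groups, groups)        # (G, S, D) within-group
    tokens = local.mean(axis=1)                      # (G, D) group summaries
    global_ctx = attention(tokens, tokens, tokens)   # (G, D) between-group
    # add each group's global context to every point in that group
    return (local + global_ctx[:, None, :]).reshape(-1, d)

rng = np.random.default_rng(0)
pts = rng.standard_normal((64, 8))                   # 64 points, 8-dim features
out = sep_attention(pts, n_groups=8)
print(out.shape)  # (64, 8)
```

The cost saving comes from step (3): instead of an O(N^2) attention over all N points, the between-group stage attends over only G << N tokens, which is why the abstract can claim long-range interaction "at negligible cost".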
[1] SHI S, WANG X, LI H. PointRCNN: 3D object proposal generation and detection from point cloud[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 770-779.
[2] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: ACM, 2017: 6000-6010.
[3] RAO Y, LU J, ZHOU J. Spherical fractal convolutional neural networks for point cloud recognition[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 452-460.
[4] YI L, SU H, GUO X, et al. SyncSpecCNN: synchronized spectral CNN for 3D shape segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6584-6592.
[5] ZHANG Z, LI K, YIN X, et al. Point cloud semantic scene segmentation based on coordinate convolution[J]. Computer Animation and Virtual Worlds, 2020, 31(4-5): e1948.
[6] SHI S, WANG Z, SHI J, et al. From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(8): 2647-2664.
[7] PENG L, LIU F, YU Z, et al. Lidar point cloud guided monocular 3D object detection[C]//2022 European Conference on Computer Vision. Tel Aviv: Springer, 2022: 123-139.
[8] YANG H, SHI J, CARLONE L. TEASER: fast and certifiable point cloud registration[J]. IEEE Transactions on Robotics, 2020, 37(2): 314-333.
[9] GUO M, CAI J, LIU Z, et al. PCT: point cloud transformer[J]. Computational Visual Media, 2021, 7(2): 187-199.
[10] ZHAO H, JIANG L, JIA J, et al. Point transformer[C]//2021 IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 16259-16268.
[11] LI W, WANG X, XIA X, et al. SepViT: separable vision transformer[EB/OL]. arXiv: 2203.15380 (2022-06-15) [2024-04-16]. https://doi.org/10.48550/arXiv.2203.15380.
[12] WANG Z, LU F. VoxSegNet: volumetric CNNs for semantic part segmentation of 3D shapes[J]. IEEE Transactions on Visualization and Computer Graphics, 2019, 26(9): 2919-2930.
[13] SHI S, GUO C, JIANG L, et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10526-10535.
[14] QI C, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 77-85.
[15] QI C, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: ACM, 2017: 5105-5114.
[16] LI Y, BU R, SUN M, et al. PointCNN: convolution on x-transformed points[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal: ACM, 2018: 828-838.
[17] LIU X, HAN Z, LIU Y, et al. Point2Sequence: learning the shape representation of 3D point clouds with an attention-based sequence to sequence network[C]//Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019: 8778-8785.
[18] WANG Y, SUN Y, LIU Z, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): 1-12.
[19] WU W, QI Z, LI F. PointConv: deep convolutional networks on 3D point clouds[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 9613-9622.
[20] THOMAS H, QI C, DESCHAUD J, et al. KPConv: flexible and deformable convolution for point clouds[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 6410-6419.
[21] XU M, DING R, ZHAO H, et al. PAConv: position adaptive convolution with dynamic kernel assembling on point clouds[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 3172-3181.
[22] HOWARD A, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. arXiv: 1704.04861 (2017-04-17) [2024-04-16]. https://doi.org/10.48550/arXiv.1704.04861.
[23] LANDRIEU L, BOUSSAHA M. Point cloud oversegmentation with graph-structured deep metric learning[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 7432-7441.
[24] WU Z, SONG S, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1912-1920.
[25] UY M, PHAM Q, HUA B, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 1588-1597.
[26] XU Y, FAN T, XU M, et al. SpiderCNN: deep learning on point sets with parameterized convolutional filters[C]//Computer Vision - ECCV 2018. Munich: Springer, 2018: 99-105.
[27] LIU Y, FAN B, XIANG S, et al. Relation-shape convolutional neural network for point cloud analysis[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 8887-8896.
[28] YAN X, ZHENG C, LI Z, et al. PointASNL: robust point clouds processing using nonlocal neural networks with adaptive sampling[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 5588-5597.
[29] BERG A, OSKARSSON M, CONNOR M. Points to patches: enabling the use of self-attention for 3D shape recognition[EB/OL]. arXiv: 2204.03957 (2022-04-08) [2024-04-16]. https://doi.org/10.48550/arXiv.2204.03957.
[30] WIJAYA K, PAEK D, KONG S. Advanced feature learning on point clouds using multi-resolution features and learnable pooling[EB/OL]. arXiv: 2205.09962 (2022-05-20) [2024-04-16]. https://doi.org/10.48550/arXiv.2205.09962.
[31] QIU S, ANWAR S, BARNES N. Dense-resolution network for point cloud classification and segmentation[C]//2021 IEEE Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2021: 3812-3821.
[32] QIU S, ANWAR S, BARNES N. Geometric back-projection network for point cloud classification[J]. IEEE Transactions on Multimedia, 2021, 24(3): 1943-1955.
[33] GOYAL A, LAW H, LIU B, et al. Revisiting point cloud shape classification with a simple and effective baseline[C]//Proceedings of the 38th International Conference on Machine Learning. Vienna: IMLS, 2021: 3809-3820.
[34] HAMDI A, GIANCOLA S, GHANEM B. MVTN: multi-view transformation network for 3D shape recognition[C]//2021 IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 1-11.
[35] YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud transformers with masked point modeling[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 19291-19300.
[36] CHENG S, CHEN X, HE X, et al. PRA-Net: point relation-aware network for 3D point cloud analysis[J]. IEEE Transactions on Image Processing, 2021, 30(2): 4436-4448.