Point Cloud Classification Based on Separable Transformer


      Abstract: Transformers tend to exploit their strength at capturing long-range dependencies to model relational interactions among distant points of a point cloud, while neglecting important local structural details and relying on heavy computation to achieve high performance. To alleviate this problem, we draw on the idea of the separable vision Transformer and propose a separable Transformer method for point cloud classification, named Sep-point. Sep-point performs sequential local-global relational interactions within and between groups of points through depthwise-separable self-attention. A new position token embedding and a grouped self-attention scheme are used to compute inter-group attention relationships at negligible cost and to establish long-range information interaction across multiple regions, respectively. In this way, local-global features are extracted while the computational burden is significantly reduced. Experimental results show that Sep-point improves classification accuracy over the existing PCT (Point Cloud Transformer) by 0.2% on the ModelNet40 dataset and by 6.3% on the real-world ScanObjectNN dataset, while reducing the number of network parameters by 0.72 M and FLOPs by 0.18 G. These results clearly demonstrate the effectiveness of the proposed method.
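The two-stage attention described above (local attention within each point group, followed by cheap inter-group attention over one token per group) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the group "position token" is approximated here by the group centroid, projections are identity matrices, and the function names and shapes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (n, d). Single-head scaled dot-product attention with
    # identity Q/K/V projections (a simplification for this sketch).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

def separable_attention(points, num_groups):
    """Sketch of depthwise-separable self-attention over a point cloud.

    points: (N, D) array; N must be divisible by num_groups here.
    1) intra-group attention captures local structure;
    2) one token per group (mean-pooled as a stand-in for the learned
       position token) attends across groups, exchanging long-range
       information at negligible cost (only num_groups tokens);
    3) the group-level update is broadcast back to the points.
    """
    n, d = points.shape
    groups = points.reshape(num_groups, n // num_groups, d)

    # Step 1: local attention within each group.
    local = np.stack([self_attention(g) for g in groups])

    # Step 2: inter-group attention over the group tokens.
    tokens = local.mean(axis=1)              # (num_groups, D)
    global_tokens = self_attention(tokens)   # (num_groups, D)

    # Step 3: broadcast the global update back into each group.
    out = local + global_tokens[:, None, :]
    return out.reshape(n, d)
```

Because stage 2 attends over only `num_groups` tokens rather than all `N` points, its cost is quadratic in the number of groups, not the number of points, which is the source of the computational savings claimed in the abstract.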

       
