Abstract:
Existing six-degree-of-freedom (6-DoF) grasp detection methods focus predominantly on improving detection accuracy while paying little attention to network inference efficiency, making it difficult to meet the real-time grasping demands of robots in cluttered environments. To address this issue, a 6-DoF grasp detection method based on heatmap guidance and an attention mechanism is proposed. The method uses multi-channel heatmaps to guide the network in rapidly locating high-potential grasping regions, substantially reducing redundant point-cloud processing and improving computational efficiency. In parallel, a lightweight dual attention gate module is designed to enhance the extraction of key features while suppressing background noise. Furthermore, a lightweight local feature extraction and fusion module aligns and deeply integrates 2D image features with 3D point-cloud features, strengthening the robustness of the feature representation. Finally, a grasp pose generator equipped with an anchor dynamic offset algorithm adaptively adjusts anchor distributions to better fit the non-uniform distribution of ground-truth grasp poses, thereby producing dense and accurate 6-DoF grasp poses. Experimental results on the GraspNet-1Billion dataset demonstrate that the proposed method improves grasping accuracy by 4.28 percentage points over the baseline GSNet while reducing the average inference time to 39 ms (only 20% of GSNet's), effectively enhancing inference efficiency while maintaining high accuracy. In real-world robot experiments, the grasping success rate exceeds that of GSNet by 8.57 percentage points, validating the method's effectiveness in practical scenarios.