Semantics-guided Adaptive Topology Inference Graph Convolutional Networks for Skeleton-based Action Recognition

Lin Zhehuang (林哲煌); Li Dong (李东)

doi:10.12052/gdutxb.220107

Abstract

Abstract: Graph convolutional networks (GCNs) are naturally suited to skeleton-based human action recognition and have attracted growing attention. Their effectiveness depends on two things: how rich the extracted features are and how well the graph topology is chosen. First, this paper improves the way skeleton joint features are fused with their semantics (joint type and frame index) and packages the fusion as a Semantics Coding Module (SCM) that is better suited to complex multi-layer networks. Guided by the SCM, the network obtains richer joint features. Second, a Topology Inference Network (TIN) is proposed. It draws on the efficient feature learning of convolutional neural networks (CNNs) to adaptively learn a different adjacency matrix for each action sample from that sample's context features, which helps free the network from the limitations of a fixed topology. Applying the SCM and TIN to the two-stream adaptive graph convolutional network (2s-AGCN) yields a semantics-guided, multi-stream adaptive topology inference graph convolutional network. Experiments on the large-scale skeleton-based action recognition datasets NTU RGB+D and NTU RGB+D 120 show that the proposed methods clearly improve recognition accuracy and that the model reaches state-of-the-art performance on both.
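The two ideas in the abstract, attaching semantics to joint features and inferring a per-sample adjacency matrix, can be illustrated with a small pure-Python sketch. It is not the authors' implementation. It assumes that semantics are fused by concatenating one-hot joint-type and frame-index codes, and it substitutes a simple attention-style similarity (softmax over projected joint features) for the paper's CNN-based TIN. All function names, sizes, and the random weights are hypothetical; in a real model the weights would be learned.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, so each row of the result sums to 1."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def one_hot(index, size):
    """One-hot code of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def add_semantics(joints, frame_index, num_frames):
    """Append one-hot joint-type and frame-index codes to each joint's features.

    Concatenation is an assumption; the paper's fusion scheme may differ.
    """
    num_joints = len(joints)
    return [
        feat + one_hot(j, num_joints) + one_hot(frame_index, num_frames)
        for j, feat in enumerate(joints)
    ]


def infer_adjacency(features, w_query, w_key):
    """Infer a sample-specific adjacency matrix from joint features.

    Attention-style stand-in for the paper's CNN-based inference network.
    """
    q = matmul(features, w_query)
    k = matmul(features, w_key)
    return softmax_rows(matmul(q, transpose(k)))


def graph_conv(features, adjacency, weight):
    """One graph convolution: aggregate over neighbours, then project."""
    return matmul(matmul(adjacency, features), weight)


def random_matrix(rows, cols, rng):
    """Random stand-in for a learned weight matrix or an input sample."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
NUM_JOINTS, NUM_FRAMES, IN_DIM, EMBED_DIM, OUT_DIM = 5, 4, 3, 4, 6  # toy sizes

joints = random_matrix(NUM_JOINTS, IN_DIM, rng)  # 3D coordinates for one frame
coded = add_semantics(joints, frame_index=2, num_frames=NUM_FRAMES)
coded_dim = IN_DIM + NUM_JOINTS + NUM_FRAMES

adjacency = infer_adjacency(
    coded,
    random_matrix(coded_dim, EMBED_DIM, rng),
    random_matrix(coded_dim, EMBED_DIM, rng),
)
output = graph_conv(coded, adjacency, random_matrix(coded_dim, OUT_DIM, rng))
```

Because `adjacency` is computed from `coded`, a different input sample produces a different graph, which is the contrast with a fixed skeleton topology that the abstract draws. Each row of `adjacency` sums to 1, so every joint aggregates a weighted average of all joints' features before the final projection.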
