广东工业大学学报 ›› 2023, Vol. 40 ›› Issue (04): 45-52.doi: 10.12052/gdutxb.220107
林哲煌, 李东
Lin Zhe-huang, Li Dong
摘要: 图卷积网络(Graph Convolutional Networks, GCN) 对于基于骨架关节点信息的人体动作识别任务具有天然的优势,越来越受到重视。图卷积网络的关键在于如何获取更丰富的特征信息以及采用更合理的拓扑结构。本文改进了人体骨架关节点及其语义信息(关节点类型和帧间索引)的特征融合方式,集成为一个语义信息编码模块,从而更适用于复杂的多层网络。在语义信息编码模块的语义引导下,网络可以获取更丰富的关节点特征信息。其次,本文提出了一种拓扑结构推理网络,结合卷积神经网络(Convolutional Neural Networks,CNN) 高效的特征学习能力,自适应地根据不同动作样本的上下文特征信息学习不同的邻接矩阵,有助于网络摆脱固定拓扑结构的局限性。将上述方法应用于双流自适应图卷积网络,本文提出了一种语义引导下多流自适应拓扑推理的图卷积网络。实验结果证明,本文的方法使图卷积网络识别精度有了明显的提高,在基于骨架信息的人体动作识别大型数据集NTU RGB+D、NTU RGB+D 120上均达到了目前先进水平。
中图分类号:
[1] EVANGELIDIS G, SINGH G, HORAUD R. Skeletal quads: human action recognition using joint quadruples[C]//201422nd International Conference on Pattern Recognition. Los Alamitos, CA: IEEE Computer Society, 2014: 4513-4518. [2] 周小平, 郭开仲. 基于计算机视觉的腾空飞脚错误动作识别模型[J]. 广东工业大学学报, 2012, 29(4): 14-17.ZHOU X P, GUO K Z. The model for the recognition of flying kick error action based on computer vision[J]. Journal of Guangdong University of Technology, 2012, 29(4): 14-17. [3] WANG H, SCHMID C. Action recognition with improved trajectories[C]//Proceedings of the IEEE International Conference on Computer Vision. New York: IEEE, 2013: 3551-3558. [4] CORTES C, VAPNIK V. Support-vector networks[J]. Machine Learning, 1995, 20(3): 273-297. [5] SADEGH A M, SADAT S F, SALZMANN M, et al. Encouraging lstms to anticipate actions very early[C]//Proceedings of the IEEE International Conference on Computer Vision. New York: IEEE, 2017: 280-289. [6] JAIN A, SINGH A, KOPPULA H S, et al. Recurrent neural networks for driver activity anticipation via sensory-fusion architecture[C]//2016 IEEE International Conference on Robotics and Automation (ICRA). New York: IEEE, 2016: 3118-3125. [7] WANG H, WANG L. Beyond joints: learning representations from primitive geometries for skeleton-based action recognition and detection[J]. IEEE Transactions on Image Processing, 2018, 27(9): 4382-4394. [8] YANG C, XU Y, SHI J, et al. Temporal pyramid network for action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos, CA: IEEE Computer Society, 2020: 591-600. [9] WANG L, TONG Z, JI B, et al. TDN: temporal difference networks for efficient action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos, CA: IEEE Computer Society, 2021: 1895-1904. [10] LI C, ZHONG Q, XIE D, et al. Skeleton-based action recognition with convolutional neural networks[C]//2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). New York: IEEE, 2017: 597-600. [11] WENG J, LIU M, JIANG X, et al. Deformable pose traversal convolution for 3D action and gesture recognition[C]//Proceedings of the European Conference on Computer Vision (ECCV). Cham, Switzerland: Springer, 2018: 136-152. [12] KE Q, BENNAMOUN M, AN S, et al. Learning clip representations for skeleton-based 3d action recognition[J]. IEEE Transactions on Image Processing, 2018, 27(6): 2842-2855. [13] LIU J, WANG G, DUAN L Y, et al. Skeleton-based human action recognition with global context-aware attention LSTM networks[J]. IEEE Transactions on Image Processing, 2017, 27(4): 1586-1599. [14] SI C, JING Y, WANG W, et al. Skeleton-based action recognition with spatial reasoning and temporal stack learning[C]//Proceedings of the European Conference on Computer Vision (ECCV). Cham, Switzerland: Springer, 2018: 103-118. [15] LI S, LI W, COOK C, et al. Independently recurrent neural network (indrnn): building a longer and deeper rnn[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2018: 5457-5466. [16] CHEN T, ZHOU D, WANG J, et al. Learning multi-granular spatio-temporal graph network for skeleton-based action recognition[C]//Proceedings of the 29th ACM International Conference on Multimedia. New York: Association for Computing Machinery, 2021: 4334-4342. [17] YAN S, XIONG Y, LIN D. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Menlo Park: AAAI, 2018, 32(1). [18] SHI L, ZHANG Y, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2019: 12026-12035. [19] ZHANG P, LAN C, ZENG W, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2020: 1112-1121. [20] SONG Y F, ZHANG Z, SHAN C, et al. Richly activated graph convolutional network for robust skeleton-based action recognition[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 31(5): 1915-1925. [21] ZENG A, SUN X, YANG L, et al. Learning skeletal graph neural networks for hard 3D pose estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. New York: IEEE, 2021: 11436-11445. [22] LI S, YI J, FARHA Y A, et al. Pose refinement graph convolutional network for skeleton-based action recognition[J]. IEEE Robotics and Automation Letters, 2021, 6(2): 1028-1035. [23] YANG H, GU Y, ZHU J, et al. PGCN-TCA: pseudo graph convolutional network with temporal and channel-wise attention for skeleton-based action recognition[J]. IEEE Access, 2020, 8: 10040-10047. [24] DING X, YANG K, CHEN W. A semantics-guided graph convolutional network for skeleton-based action recognition[C]//Proceedings of the 2020 the 4th International Conference on Innovation in Artificial Intelligence. New York: Association for Computing Machinery, 2020: 130-136. [25] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society, 2016: 770-778. [26] SHI L, ZHANG Y, CHENG J, et al. Skeleton-based action recognition with multi-stream adaptive graph convolutional networks[J]. IEEE Transactions on Image Processing, 2020, 29: 9532-9545. |
[1] | 魏均斌; . 几类图变换及其特征值[J]. 广东工业大学学报, 2004, 21(2): 93-96. |
|