Journal of Guangdong University of Technology, 2023, Vol. 40, Issue (04): 45-52. doi: 10.12052/gdutxb.220107

• Computer Science and Technology •

Semantics-guided Adaptive Topology Inference Graph Convolutional Networks for Skeleton-based Action Recognition

Lin Zhe-huang, Li Dong   

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-06-13 Online: 2023-07-25 Published: 2023-08-02

Abstract: Graph convolutional networks (GCNs), which are naturally suited to skeleton-based action recognition, have attracted increasing attention. Their performance hinges on obtaining richer feature information and on the design of the skeleton topology. In this work, the method for fusing joint features with semantics (joint type and frame index) is improved and integrated into a Semantics Coding Module (SCM) that is better suited to complex multi-layer networks. Guided by the SCM, the network can extract more feature information from the skeleton. In addition, a skeleton Topology Inference Network (TIN) is proposed; drawing on the efficient feature learning ability of CNNs, it adaptively learns different adjacency matrices according to the context information of different samples, freeing the network from the limitation of a fixed topology. By applying the SCM and TIN to 2s-AGCN, we propose a semantics-guided multi-stream adaptive topology inference graph convolutional network for skeleton-based action recognition. Extensive experiments on the NTU RGB+D and NTU RGB+D 120 datasets demonstrate that the proposed methods clearly improve the accuracy of the network and that the model achieves state-of-the-art performance.
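
The two ideas in the abstract, semantic encoding of joints and a sample-dependent adjacency matrix, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the SCM fuses features or how the TIN is built, so the sketch assumes additive fusion of one-hot joint-type and frame-index embeddings, and it swaps in a softmax over dot-product similarities of linearly projected features in place of the paper's CNN-based inference. All function names, weight shapes, and channel sizes are hypothetical.

```python
import numpy as np

def encode_semantics(x, w_joint, w_frame):
    """Fuse joint features with semantics (assumed: additive fusion).

    x: (T, V, C) features for T frames and V joints.
    w_joint: (V, C) embedding of one-hot joint types.
    w_frame: (T, C) embedding of one-hot frame indices.
    """
    T, V, _ = x.shape
    joint_emb = np.eye(V) @ w_joint   # (V, C), one row per joint type
    frame_emb = np.eye(T) @ w_frame   # (T, C), one row per frame index
    return x + joint_emb[None, :, :] + frame_emb[:, None, :]

def infer_topology(x, w_q, w_k):
    """Infer a sample-dependent (V, V) adjacency matrix (assumed form).

    Features are averaged over time, projected, and compared by dot
    product; a row-wise softmax normalizes each joint's edge weights.
    """
    pooled = x.mean(axis=0)                        # (V, C)
    q, k = pooled @ w_q, pooled @ w_k              # (V, D) each
    scores = q @ k.T / np.sqrt(q.shape[1])         # (V, V)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def graph_conv(x, adj, w_out):
    """Aggregate neighbor features with adj, then project channels."""
    aggregated = np.einsum("uv,tvc->tuc", adj, x)  # (T, V, C)
    return aggregated @ w_out                      # (T, V, C_out)

# Toy usage: 4 frames, 25 joints (as in NTU RGB+D skeletons), 8 channels.
rng = np.random.default_rng(0)
T, V, C, D, C_OUT = 4, 25, 8, 6, 16
w_joint, w_frame = rng.normal(size=(V, C)), rng.normal(size=(T, C))
w_q, w_k = rng.normal(size=(C, D)), rng.normal(size=(C, D))
w_out = rng.normal(size=(C, C_OUT))

sample_a = rng.normal(size=(T, V, C))
sample_b = rng.normal(size=(T, V, C))
feat_a = encode_semantics(sample_a, w_joint, w_frame)
feat_b = encode_semantics(sample_b, w_joint, w_frame)
adj_a = infer_topology(feat_a, w_q, w_k)
adj_b = infer_topology(feat_b, w_q, w_k)
out_a = graph_conv(feat_a, adj_a, w_out)
```

Because the adjacency matrix is computed from each sample's own features, `adj_a` and `adj_b` differ even though the weights are shared, which is the sense in which the topology is no longer fixed.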

Key words: action recognition, graph convolutional network, skeleton, adjacency matrix

CLC Number: 

  • TP391.4