Journal of Guangdong University of Technology ›› 2023, Vol. 40 ›› Issue (04): 24-30, 36. doi: 10.12052/gdutxb.220018

• Computer Science and Technology •

Knowledge Distillation Method Based on Incremental Class Activation Knowledge

Zhang Jia-yue, Zhang Ling   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-01-30 Online: 2023-07-25 Published: 2023-08-02

Abstract: Existing knowledge distillation methods tend to overlook the category knowledge of samples, both because the features they transfer are not class-discriminative and because limited device resources rarely allow the category structure of samples to be learned. This paper therefore proposes a knowledge distillation method based on incremental class activation knowledge (ICAKD). First, class-discriminative sample features are extracted with class activation gradient maps, and a class-activation constraint loss is proposed. Then, an incremental memory bank is built to store these class-discriminative features; it holds samples from multiple training batches and is updated iteratively. Finally, the method computes the class centroids of the samples in the memory bank to construct the category structure relationship, and performs category knowledge distillation according to both the class-activation constraint and the category structure relationship. Experimental results on the CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet datasets show that the proposed method improves accuracy by 0.4% to 1.21% over the Category Structure Knowledge Distillation (CSKD) method, demonstrating that class-discriminative features and incremental storage are effective for category knowledge distillation.
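
The memory bank and category structure steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the bank capacity, the eviction rule, the similarity measure, or the loss form, so the first-in-first-out eviction, the cosine similarity between per-class mean features, and the mean squared structure loss below are all assumptions, and the names `MemoryBank`, `structure_matrix`, and `structure_loss` are hypothetical. The class activation step is omitted; features are taken as given vectors.

```python
from collections import defaultdict, deque
import math


class MemoryBank:
    """Hypothetical FIFO store of recent feature vectors, grouped by class label."""

    def __init__(self, per_class_capacity):
        # Assumption: a bounded deque per class; the oldest feature is
        # discarded once a class holds per_class_capacity entries.
        self.slots = defaultdict(lambda: deque(maxlen=per_class_capacity))

    def update(self, features, labels):
        """Add one training batch of (feature, label) pairs."""
        for feat, label in zip(features, labels):
            self.slots[label].append(list(feat))

    def centroids(self):
        """Return {label: mean feature vector} over the stored samples."""
        result = {}
        for label, feats in self.slots.items():
            dim = len(feats[0])
            result[label] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
        return result


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def structure_matrix(centroids):
    """Pairwise centroid similarities, with rows and columns in sorted label order."""
    labels = sorted(centroids)
    return [[cosine(centroids[a], centroids[b]) for b in labels] for a in labels]


def structure_loss(teacher, student):
    """Mean squared difference between two equally sized structure matrices."""
    n = len(teacher)
    total = sum((teacher[i][j] - student[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


# Toy usage: two batches of 2-D teacher and student features for classes 0 and 1.
teacher_bank = MemoryBank(per_class_capacity=2)
student_bank = MemoryBank(per_class_capacity=2)
for bank, batches in (
    (teacher_bank, [[[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]]),
    (student_bank, [[[1.0, 0.5], [0.5, 1.0]], [[0.8, 0.4], [0.4, 0.8]]]),
):
    for batch in batches:
        bank.update(batch, [0, 1])

loss = structure_loss(
    structure_matrix(teacher_bank.centroids()),
    structure_matrix(student_bank.centroids()),
)
print(loss)
```

Because the centroids are computed over features accumulated across batches rather than over a single batch, the structure matrix in this sketch depends less on which classes happen to appear in any one batch, which is the motivation the abstract gives for making the bank incremental.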

Key words: knowledge distillation, category activation knowledge, incremental memory bank, category structure

CLC Number: TP391