Journal of Guangdong University of Technology


Research on Mixed-grained Pruning Method for Memristive Neural Network Accelerator

Zhou Bo, Chen Hui

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-01-21 Online: 2025-01-14 Published: 2025-01-14
  • Corresponding author: Chen Hui (1974–), male, associate researcher and master's supervisor; main research interest: control engineering. E-mail: chenhui02@gdut.edu.cn
  • About the author: Zhou Bo (1998–), male, master's student; main research interest: memristive neural network accelerators. E-mail: bobasyu@163.com
  • Funding:
    National Natural Science Foundation of China, General Program (62072118)

Abstract: Eliminating redundancy to reduce ineffective computation is a common way to accelerate neural networks and improve computational efficiency. Weight pruning is a widely used model compression technique that lowers computational cost by removing redundant weights. However, most existing unstructured pruning methods do not take into account the Memristive Crossbar Array (MCA) structure of Resistive Random Access Memory (RRAM). Structured pruning methods, by contrast, fit the MCA structure well, but their overly coarse pruning granularity tends to reduce network accuracy. In this paper, we propose a mixed-granularity pruning method that effectively reduces the hardware overhead of RRAM-based memristive neural network accelerators. The method classifies the columns of the weight sub-matrices according to their degree of redundancy and applies a different pruning strategy to each class, making full use of the redundancy of Convolutional Neural Networks (CNNs). Compared with existing methods, the proposed method improves compression ratio and energy efficiency by approximately 2.0× and 1.6×, respectively, with lower accuracy loss.

Key words: neural network, memristor, resistive random access memory, model pruning
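
The abstract describes classifying weight sub-matrix columns by redundancy and pruning each class differently, but it does not give the redundancy measure, the thresholds, or the exact strategies. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm: redundancy is assumed to be the fraction of near-zero weights in a column, and the names `column_redundancy`, `prune_block`, `eps`, `high`, and `low` are hypothetical. Highly redundant columns are zeroed entirely (coarse-grained, so the matching crossbar column could in principle be dropped), moderately redundant columns lose only their near-zero weights (fine-grained), and the remaining columns are kept.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def column_redundancy(block: Matrix, col: int, eps: float) -> float:
    """Assumed redundancy measure: share of weights in a column with |w| < eps."""
    rows = len(block)
    small = sum(1 for r in range(rows) if abs(block[r][col]) < eps)
    return small / rows


def prune_block(block: Matrix, eps: float = 0.05,
                high: float = 0.75, low: float = 0.25) -> Tuple[Matrix, List[str]]:
    """Prune one crossbar-sized weight sub-matrix column by column.

    The thresholds `high` and `low` are illustrative values, not from the paper.
    Returns a pruned copy of the block and one class label per column.
    """
    rows, cols = len(block), len(block[0])
    pruned = [row[:] for row in block]  # leave the input untouched
    labels: List[str] = []
    for c in range(cols):
        redundancy = column_redundancy(block, c, eps)
        if redundancy >= high:
            # Coarse-grained: remove the whole column.
            labels.append("coarse")
            for r in range(rows):
                pruned[r][c] = 0.0
        elif redundancy >= low:
            # Fine-grained: remove only the near-zero weights.
            labels.append("fine")
            for r in range(rows):
                if abs(pruned[r][c]) < eps:
                    pruned[r][c] = 0.0
        else:
            # Low redundancy: keep the column as it is.
            labels.append("keep")
    return pruned, labels


# A 4x3 toy sub-matrix; each column falls into a different class.
block = [
    [0.01, 0.50, 0.4],
    [-0.02, 0.01, -0.6],
    [0.03, -0.70, 0.8],
    [0.90, 0.02, 0.3],
]
pruned, labels = prune_block(block)
print(labels)
```

In this toy example the first column is zeroed as a whole, the second keeps only its two large weights, and the third is left unchanged. A real implementation would also need retraining to recover accuracy and a mapping step that actually releases the pruned crossbar columns, neither of which is covered here.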

CLC number:

  • TP389.1