Journal of Guangdong University of Technology

   

Research on Mixed-grained Pruning Method for Memristive Neural Network Accelerator

Zhou Bo, Chen Hui   

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2024-01-21  Online: 2025-01-14  Published: 2025-01-14

Abstract: Reducing redundant computation is a common way to accelerate neural networks and improve computational efficiency. Weight pruning is an effective model compression technique that removes redundant weights. However, most existing unstructured pruning methods do not take the Resistive Random Access Memory (RRAM) crossbar structure of memristor-based accelerators into account, while structured pruning methods fit the Memristive Crossbar Array (MCA) structure well but may degrade network accuracy because of their coarser pruning granularity. This paper proposes a mixed-granularity pruning method that effectively reduces the hardware overhead of RRAM-based accelerators. The method classifies the columns of each weight sub-matrix by their level of redundancy and applies a different pruning strategy to each class of columns, thereby fully exploiting the redundancy of Convolutional Neural Networks (CNNs). Compared with existing methods, the proposed method improves compression ratio and energy efficiency by approximately 2.0× and 1.6×, respectively, with less accuracy loss.
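The abstract describes the column-wise mixed-granularity idea only at a high level. The sketch below (Python/NumPy) illustrates one plausible reading of it: within each crossbar-sized sub-matrix, highly redundant columns are removed whole (coarse-grained, crossbar-friendly), while the remaining columns are pruned element by element (fine-grained). The tile size, the thresholds, and the redundancy metric (fraction of near-zero weights per column) are illustrative assumptions, not the paper's actual criteria.

import numpy as np

def mixed_granularity_prune(weight, tile_rows=128, tile_cols=128,
                            col_redundancy_thresh=0.9, elem_thresh=1e-2):
    # Minimal sketch, not the authors' exact algorithm: per crossbar-sized
    # tile, drop whole columns that are mostly near zero (coarse-grained),
    # and zero only the small weights in the other columns (fine-grained).
    # All thresholds and tile sizes here are assumptions for illustration.
    pruned = weight.copy()
    rows, cols = weight.shape
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            tile = pruned[r:r + tile_rows, c:c + tile_cols]  # view into pruned
            # Redundancy of each column = fraction of near-zero weights.
            redundancy = (np.abs(tile) < elem_thresh).mean(axis=0)
            # Coarse-grained: remove entire highly redundant columns,
            # freeing whole crossbar columns (bit-lines).
            coarse = redundancy >= col_redundancy_thresh
            tile[:, coarse] = 0.0
            # Fine-grained: zero only the small weights in remaining columns.
            fine = ~coarse
            sub = tile[:, fine]
            sub[np.abs(sub) < elem_thresh] = 0.0
            tile[:, fine] = sub
    return pruned

# Example: prune a random 256x256 weight matrix mapped onto 128x128 crossbars.
w = np.random.randn(256, 256) * 0.05
w_pruned = mixed_granularity_prune(w)
print("sparsity:", (w_pruned == 0).mean())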

Key words: neural network, memristor, resistive random access memory, pruning

CLC Number: TP389.1