Abstract:
To train a sparse neural network, we establish a model with the Log-sum function as the regularization term and the cross-entropy function as the loss function, and solve it by a proximal gradient algorithm whose learning rate is produced by Meta-LR-Schedule-Net. Numerical experiments on four data sets (MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100) show that the model with the Log-sum regularizer trains sparser networks than models with other sparsity-inducing functions, such as the 1-norm, the transformed 1-norm, and the 1/2-quasi-norm.