Abstract:
To train a sparse neural network, we establish a model with the Log-sum function as the regularization term and the cross-entropy function as the loss function, and solve it by a proximal gradient algorithm whose learning rate is produced by Meta-LR-Schedule-Net. Numerical experiments on four data sets (MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100) show that the model with the Log-sum regularizer trains sparser networks than models with other sparsity-inducing functions, such as the 1-norm, the transformed 1-norm, and the 1/2-quasi-norm.