Journal of Guangdong University of Technology


  • Corresponding author: ZHANG Jun (b. 1979), male, professor, Ph.D.; research interests include artificial intelligence and compressed sensing theory and its applications. E-mail: jzhang@gdut.edu.cn
  • First author: XIE Wei-li (b. 1999), male, master's student; research interests include convolutional sparse coding and deep learning. E-mail: tlich@sina.com
  • Funding: National Natural Science Foundation of China (61973088)

A Multi-layer Convolutional Sparse Coding Network Based on Multi-Scale

Xie Wei-li, Zhang Jun   

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received:2023-12-15 Online:2024-09-27 Published:2024-09-27


Abstract: In recent years, the multi-layer convolutional sparse coding (ML-CSC) model has been regarded as a theoretical explanation of convolutional neural networks (CNN). While the ML-CSC model performs well on datasets with high feature contrast, its performance on datasets with low feature contrast is unsatisfactory. To address this issue, this paper introduces a multi-scale technique and designs a multi-scale multi-layer convolutional sparse coding network (MSMCSCNet), which not only achieves better image classification results under weak feature contrast, but also gives the model a solid theoretical foundation and high interpretability. Experimental results show that, without increasing the parameter count, MSMCSCNet improves accuracy over existing ML-CSC models by 5.75, 9.75, and 9.8 percentage points on the Cifar10 and Cifar100 datasets and an Imagenet32 subset, respectively. In addition, ablation experiments validate the effectiveness of the model's multi-scale design and feature selection mechanism.

Key words: multi-layer convolutional sparse coding, convolutional neural network, image classification, multi-scale
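The layered view of ML-CSC summarized in the abstract — each layer's sparse code serving as the signal decomposed by the next dictionary — can be sketched as a layered soft-thresholding pursuit, whose per-layer computation has the same algebraic form as a CNN layer (linear map plus nonlinearity). The sketch below is illustrative only and is not the paper's MSMCSCNet: it uses small dense random dictionaries in place of convolutional ones, and all names, dimensions, and threshold values are our own assumptions.

```python
import numpy as np

def soft_threshold(v, lam):
    # Proximal operator of the l1 norm: shrink each entry toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def layered_pursuit(x, dicts, lams):
    """Layered soft-thresholding pursuit for an ML-CSC-style model.

    Each layer computes gamma_i = S_{lam_i}(D_i^T gamma_{i-1}); with
    convolutional dictionaries D_i this matches a CNN forward pass
    (convolution + bias + ReLU-like nonlinearity).
    """
    gamma = x
    for D, lam in zip(dicts, lams):
        gamma = soft_threshold(D.T @ gamma, lam)
    return gamma

rng = np.random.default_rng(0)

# Two dense dictionaries with unit-norm columns, standing in for the
# convolutional dictionaries of the full model.
D1 = rng.standard_normal((100, 200))
D1 /= np.linalg.norm(D1, axis=0)
D2 = rng.standard_normal((200, 300))
D2 /= np.linalg.norm(D2, axis=0)

x = rng.standard_normal(100)                     # input signal
gamma = layered_pursuit(x, [D1, D2], [1.0, 0.3])  # deepest-layer code

print(gamma.shape)            # code lives in the deepest layer's space
print(np.mean(gamma == 0))    # fraction of zeroed coefficients (sparsity)
```

Note that larger thresholds yield sparser codes at the cost of representation accuracy; the trade-off is what pursuit algorithms such as ISTA/FISTA tune iteratively rather than in a single feed-forward pass.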

CLC number: TP391
[1] LECUN Y, BENGIO Y, HINTON G. Deep learning [J]. Nature, 2015, 521(7553): 436-444.
[2] NERCESSIAN S C, PANETTA K A, AGAIAN S S. Non-linear direct multi-scale image enhancement based on the luminance and contrast masking characteristics of the human visual system [J]. IEEE Transactions on Image Processing, 2013, 22(9): 3549-3561.
[3] PAPYAN V, ROMANO Y, ELAD M. Convolutional neural networks analyzed via convolutional sparse coding [J]. The Journal of Machine Learning Research, 2017, 18(1): 2887-2938.
[4] SULAM J, ABERDAM A, BECK A, et al. On multi-layer basis pursuit, efficient algorithms and convolutional neural networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 42(8): 1968-1980.
[5] PAPYAN V, SULAM J, ELAD M. Working locally thinking globally: theoretical guarantees for convolutional sparse coding [J]. IEEE Transactions on Signal Processing, 2017, 65(21): 5687-5701.
[6] CHEN S S, DONOHO D L, SAUNDERS M A. Atomic decomposition by basis pursuit [J]. SIAM Review, 2001, 43(1): 129-159.
[7] TROPP J A, GILBERT A C. Signal recovery from random measurements via orthogonal matching pursuit [J]. IEEE Transactions on Information Theory, 2007, 53(12): 4655-4666.
[8] DAUBECHIES I, DEFRISE M, DE MOL C. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint [J]. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 2004, 57(11): 1413-1457.
[9] BECK A, TEBOULLE M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems [J]. SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202.
[10] BOYD S, PARIKH N, CHU E, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers [J]. Foundations and Trends® in Machine Learning, 2011, 3(1): 1-122.
[11] SIMON D, ELAD M. Rethinking the CSC model for natural images[J]. Advances in Neural Information Processing Systems, 2019(204): 2274-2284.
[12] GUO P, ZENG D, TIAN Y, et al. Multi-scale enhancement fusion for underwater sea cucumber images based on human visual system modelling [J]. Computers and Electronics in Agriculture, 2020, 175: 105608.
[13] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. Imagenet classification with deep convolutional neural networks [J]. Communications of the ACM, 2017, 60(6): 84-90.
[14] OLIMOV B, SUBRAMANIAN B, UGLI R A A, et al. Consecutive multiscale feature learning-based image classification model [J]. Scientific Reports, 2023, 13(1): 3595.
[15] NATARAJAN B K. Sparse approximate solutions to linear systems [J]. SIAM Journal on Computing, 1995, 24(2): 227-234.
[16] CANDÈS E J, ROMBERG J, TAO T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information [J]. IEEE Transactions on Information Theory, 2006, 52(2): 489-509.
[17] TIBSHIRANI R. Regression shrinkage and selection via the lasso [J]. Journal of the Royal Statistical Society Series B: Statistical Methodology, 1996, 58(1): 267-288.
[18] DONOHO D L, ELAD M. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization [J]. Proceedings of the National Academy of Sciences, 2003, 100(5): 2197-2202.
[19] RUBINSTEIN R, ZIBULEVSKY M, ELAD M. Double sparsity: learning sparse dictionaries for sparse signal approximation [J]. IEEE Transactions on Signal Processing, 2009, 58(3): 1553-1564.
[20] TROPP J A. Greed is good: algorithmic results for sparse approximation [J]. IEEE Transactions on Information Theory, 2004, 50(10): 2231-2242.
[21] GROHS P. Mathematical aspects of deep learning[M]. Cambridge England: Cambridge University Press, 2022: 1-111.
[22] LI M, ZHAI P, TONG S, et al. Revisiting sparse convolutional model for visual recognition [J]. Advances in Neural Information Processing Systems, 2022, 35: 10492-10504.
[23] ZHANG Z, ZHANG S. Towards understanding residual and dilated dense neural networks via convolutional sparse coding [J]. National Science Review, 2021, 8(3): nwaa159.
[24] HUANG G. Multi-scale dense networks for resource efficient image classification [EB/OL]. arXiv: 1703.09844 (2017-03-29) [2023-12-15]. https://doi.org/10.48550/arXiv.1703.09844.
[25] KRIZHEVSKY A. Learning multiple layers of features from tiny images [EB/OL]. (2009-04-08) [2023-12-18]. https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
[26] CHRABASZCZ P. A downsampled variant of ImageNet as an alternative to the CIFAR datasets [EB/OL]. arXiv: 1707.08819 (2017-08-23) [2023-12-15]. https://ar5iv.org/abs/1707.08819.
[27] NETZER Y. Reading digits in natural images with unsupervised feature learning [EB/OL]. (2011-12-08) [2023-12-18]. https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/37648.pdf.