A Multi-layer Convolutional Sparse Coding Network Based on Multi-Scale

doi:10.12052/gdutxb.230205

Abstract: In recent years, the multi-layer convolutional sparse coding (ML-CSC) model has been regarded as a theoretical explanation for convolutional neural networks (CNNs). The ML-CSC model performs well on datasets with high feature contrast, but its performance on datasets with low feature contrast is unsatisfactory. To address this issue, this paper introduces a multi-scale technique to design a multi-scale multi-layer convolutional sparse coding network (MSMCSCNet), which achieves better image classification results when feature contrast is weak while also giving the model a solid theoretical foundation and higher interpretability. Experimental results show that, without increasing the parameter count, MSMCSCNet improves accuracy over existing ML-CSC models by 5.75, 9.75, and 9.8 percentage points on CIFAR-10, CIFAR-100, and the ImageNet32 subset, respectively. Ablation experiments further validate the effectiveness of the model's multi-scale design and feature selection mechanism.

Key words: multi-layer convolutional sparse coding, convolutional neural network, image classification, multi-scale

CLC Number: TP391

Xie Wei-li, Zhang Jun. A Multi-layer Convolutional Sparse Coding Network Based on Multi-Scale [J]. Journal of Guangdong University of Technology, 2024, 41(06): 125-132.


URL: https://xbzrb.gdut.edu.cn/EN/10.12052/gdutxb.230205

https://xbzrb.gdut.edu.cn/EN/Y2024/V41/I06/125


