Journal of Guangdong University of Technology ›› 2022, Vol. 39 ›› Issue (05): 1-8. doi: 10.12052/gdutxb.220092


A Review and Thinking of Deep Learning with a Restricted Number of Samples

Zhang Yun, Wang Xiao-dong

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-05-13  Online: 2022-09-10  Published: 2022-07-18
  • Corresponding author: Wang Xiao-dong (b. 1992), male, Ph.D. candidate; his main research interests are image processing and sparse network optimization. E-mail: wdecen@foxmail.com
  • About the author: Zhang Yun (b. 1963), male, professor, Ph.D., doctoral supervisor; his main research interests include complex system modeling and control, image processing, and pattern recognition
  • Supported by: National Natural Science Foundation of China (U1501251, 61802070, 62103115)


Abstract: Deep learning has made great progress by relying on big data and powerful computing, but its performance with a restricted number of samples remains unsatisfactory. The main difficulties lie in constructing the function space (family) and in designing algorithms under dataset constraints. Accordingly, this paper presents a categorized review of deep learning with restricted samples. In addition, current research on the brain indicates that human cognition is divided among different brain regions, that each region undertakes a different function, and that the process of learning each region's function should therefore also differ. On this basis, a "functional evolution" style of deep learning is proposed: the idea is to construct a network structure composed of multiple functional modules, partitioned by region and layer, and to study progressive, stage-by-stage methods for training these functional modules, in the hope of exploring a new path toward "humanoid learning".

Key words: deep learning method, convolutional neural network, restricted sample learning, functional evolution

CLC number: TP183
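
As a rough illustration of the "functional evolution" idea described in the abstract (a network assembled from distinct functional modules, trained stage by stage rather than end to end), the following minimal PyTorch sketch trains one module at a time while freezing the others. The module names, layer sizes, staging order, and toy data are illustrative assumptions, not the authors' actual method.

# Minimal sketch of staged, module-by-module training. Everything here
# (module names, sizes, the freeze-then-train schedule, toy data) is an
# illustrative assumption, not the method proposed in the paper.
import torch
import torch.nn as nn

class ModularNet(nn.Module):
    """A network split into distinct functional modules."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Low-level "perception" module (edge/texture-like features).
        self.perception = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        # Mid-level "abstraction" module (pooled global features).
        self.abstraction = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        # Task-specific "decision" module (classifier head).
        self.decision = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.abstraction(self.perception(x))
        return self.decision(h.flatten(1))

def train_stage(model, trainable_modules, loader, epochs=1):
    """Optimize only the listed modules; all other parameters stay frozen."""
    for p in model.parameters():
        p.requires_grad = False
    params = [p for m in trainable_modules for p in m.parameters()]
    for p in params:
        p.requires_grad = True
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Progressive ("evolution"-style) schedule: each stage adds one functional
# module on top of what earlier stages have already learned.
model = ModularNet()
loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))]  # toy batch
train_stage(model, [model.perception, model.decision], loader)   # stage 1
train_stage(model, [model.abstraction, model.decision], loader)  # stage 2
train_stage(model, [model.decision], loader)                     # stage 3

Each train_stage call builds on the modules fixed in earlier stages; a real implementation would replace the toy loader with actual data and would derive the module partition and training schedule from the region-by-region analogy outlined in the paper.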