Journal of Guangdong University of Technology ›› 2022, Vol. 39 ›› Issue (05): 1-8. doi: 10.12052/gdutxb.220092

A Review and Thinking of Deep Learning with a Restricted Number of Samples

Zhang Yun, Wang Xiao-dong   

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-05-13  Online: 2022-09-10  Published: 2022-07-18

Abstract: Deep learning has achieved great success with big data and powerful computing, but it performs poorly when samples are restricted, mainly because both the construction of the function space (function clusters) and the design of training algorithms depend on large datasets. Accordingly, a categorical review of deep learning with restricted samples is presented. In addition, current brain research shows that human cognition is carried out by different regions of the brain, and the cognitive function of each region differs; the training of each functional region should therefore differ as well. On this basis, a deep learning method based on functional evolution is proposed, which attempts to construct a network composed of multiple functional modules, and the training procedure of these functional modules is studied, aiming to explore the new area of "humanoid learning".
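To make the idea of functionally modular training concrete, the following is a minimal, hypothetical sketch (in Python with PyTorch, not the authors' implementation): a network is assembled from separate "functional modules", each first trained with its own local objective (a reconstruction loss stands in here for a region-specific function), after which the whole network is fine-tuned on the restricted labeled set. All names, dimensions, and training rules are illustrative assumptions.

import torch
import torch.nn as nn

class FunctionalModule(nn.Module):
    """One region-like module with its own small encoder."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

    def forward(self, x):
        return self.encoder(x)

class ModularNet(nn.Module):
    """A chain of functional modules followed by a task head."""
    def __init__(self, dims, n_classes):
        super().__init__()
        self.blocks = nn.ModuleList(
            FunctionalModule(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )
        self.head = nn.Linear(dims[-1], n_classes)

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return self.head(x)

def train_modules_then_finetune(net, x, y, epochs=10):
    # Stage 1 (hypothetical): train each module locally with its own objective;
    # a reconstruction loss is used here as a stand-in for a region-specific rule.
    h = x.detach()
    for block in net.blocks:
        decoder = nn.Linear(block.encoder[0].out_features, h.shape[1])
        opt = torch.optim.Adam(
            list(block.parameters()) + list(decoder.parameters()), lr=1e-3
        )
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(decoder(block(h)), h)
            loss.backward()
            opt.step()
        h = block(h).detach()  # pass features on to the next module
    # Stage 2: fine-tune the whole network end-to-end; only this stage
    # consumes the scarce labels.
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(net(x), y)
        loss.backward()
        opt.step()

net = ModularNet(dims=[32, 64, 64], n_classes=5)
x, y = torch.randn(20, 32), torch.randint(0, 5, (20,))  # a restricted sample set
train_modules_then_finetune(net, x, y)

The module-wise stage mirrors the paper's premise that each "region" should be trained by its own rule; how the local objectives are chosen and evolved is exactly the open question the review raises.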

Key words: deep learning method, convolutional neural network, restricted sample learning, functional evolution

CLC Number: TP183