广东工业大学学报 (Journal of Guangdong University of Technology), 2022, Vol. 39, Issue (05): 1-8. doi: 10.12052/gdutxb.220092
Zhang Yun, Wang Xiao-dong
Abstract: Deep learning has made considerable progress by relying on big data and strong computing power, but its performance under limited-sample conditions remains unsatisfactory. The main difficulties lie in constructing the function space (family) and designing algorithms when the dataset is limited. Accordingly, this paper presents a categorized survey of deep learning with limited samples. In addition, current brain research indicates that human cognition is regionalized: each brain region performs a different function, and the learning process for each region's function should likewise differ. We therefore propose a "function-progressive" approach to deep learning, which attempts to build a network structure composed of multiple functional modules organized into regions and layers, and to study "progressive" training methods for these functional modules, with the aim of exploring a new path toward "human-like learning".
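The abstract describes the "function-progressive" idea only at a conceptual level. To make the proposed structure more concrete, the following is a minimal PyTorch sketch of one plausible reading: a network partitioned into region-like functional modules that are trained one stage at a time, with earlier modules frozen. The module roles, dimensions, staging scheme, and objectives here are all hypothetical assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FunctionalModule(nn.Module):
    """One region-like functional block (hypothetical structure)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

    def forward(self, x):
        return self.net(x)

# Three partitioned modules, e.g. low-level perception -> intermediate
# representation -> task-specific decision (these roles are assumptions).
modules = nn.ModuleList([
    FunctionalModule(32, 64),
    FunctionalModule(64, 64),
    FunctionalModule(64, 10),
])
# A lightweight per-stage head so each stage can be trained on its own
# objective; the final module already outputs class scores.
heads = nn.ModuleList([nn.Linear(64, 10), nn.Linear(64, 10), nn.Identity()])

def train_stage(stage, x, y, epochs=5):
    # Freeze every module except the one being "advanced" at this stage.
    for i, m in enumerate(modules):
        for p in m.parameters():
            p.requires_grad = (i == stage)
    params = list(modules[stage].parameters()) + list(heads[stage].parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        h = x
        for i in range(stage + 1):  # forward through modules up to this stage
            h = modules[i](h)
        loss = loss_fn(heads[stage](h), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Progressive training: each functional module is trained in turn,
# building on the (frozen) modules learned before it.
x = torch.randn(128, 32)
y = torch.randint(0, 10, (128,))
for stage in range(len(modules)):
    train_stage(stage, x, y)
```

Training one module per stage while freezing the rest mirrors the "progressive" aspect of the proposal; a real instantiation would presumably give each region its own functional objective rather than the shared classification loss used here for brevity.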