Journal of Guangdong University of Technology ›› 2018, Vol. 35 ›› Issue (03): 47-53. doi: 10.12052/gdutxb.180036

• Comprehensive Research •

An Algorithm Based on Multi-task Multi-instance Anti-noise Learning

Li Qi-xiang1, Xiao Yan-shan1, Hao Zhi-feng2, Ruan Yi-bang1

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Mathematics and Big Data, Foshan University, Foshan 528000, China
  • Received: 2018-03-05 Online: 2018-05-09 Published: 2018-04-26
  • Corresponding author: Xiao Yan-shan (b. 1981), female, professor, Ph.D.; main research interests: machine learning, image processing, and pattern recognition. E-mail: xiaoyanshan@189.cn
  • About the author: Li Qi-xiang (b. 1992), male, master's student; research interests: machine learning, data mining, and image processing.
  • Funding:
    National Natural Science Foundation of China (61472090, 61672169, 61472089)

Abstract: In multi-instance learning, classification performance degrades when the training samples are scarce or contain noise. To address this problem, a noise-resistant multi-task multi-instance learning algorithm is proposed. On the one hand, to handle noise that may be present in the training samples, the algorithm assigns different weights to the instances in each bag and updates these weights iteratively, which reduces the influence of noisy data on the predictions. On the other hand, to handle the shortage of training samples, the algorithm adopts a multi-task learning strategy: it trains multiple learning tasks simultaneously and exploits the correlation among tasks to improve the predictive performance of each classification task. Experimental results show that, under the same experimental conditions, the proposed method outperforms existing classification algorithms.

Key words: multi-instance learning, anti-noise, multi-task learning, correlation, classification

CLC Number:

  • TP301.6
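
The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it describes: per-instance weights that are updated iteratively, and several tasks trained together. It is not the authors' algorithm. It swaps in a weighted logistic loss minimized by plain gradient descent, a shared-plus-task-specific weight vector (`w_t = w0 + v_t`) as the coupling between tasks, and a simple reweighting rule for positive bags. The function names (`fit`, `reweight`, `train`, `predict_bag`), the parameters `lam`, `mu`, `lr`, `epochs`, `rounds`, and the toy data are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(tasks, weights, dim, lam=0.1, mu=1.0, lr=0.02, epochs=500):
    """Gradient descent on a weighted logistic loss with w_t = w0 + v_t (assumed model)."""
    w0 = [0.0] * dim
    v = [[0.0] * dim for _ in tasks]
    for _ in range(epochs):
        g0 = [lam * a for a in w0]               # penalty on the shared part
        gv = [[mu * a for a in vt] for vt in v]  # penalty on each task's deviation
        for t, bags in enumerate(tasks):
            wt = [a + b for a, b in zip(w0, v[t])]
            for k, (instances, y) in enumerate(bags):
                for i, x in enumerate(instances):
                    # Gradient of c * log(1 + exp(-y * w.x)); each instance
                    # inherits its bag label y and carries its own weight c.
                    coef = -weights[t][k][i] * y * sigmoid(-y * dot(wt, x))
                    for d in range(dim):
                        g0[d] += coef * x[d]
                        gv[t][d] += coef * x[d]
        w0 = [a - lr * g for a, g in zip(w0, g0)]
        v = [[a - lr * g for a, g in zip(vt, gt)] for vt, gt in zip(v, gv)]
    return [[a + b for a, b in zip(w0, vt)] for vt in v]

def reweight(tasks, models):
    """Assumed rule: in a positive bag, scale each instance by its predicted
    positive probability relative to the bag's most positive instance."""
    new_weights = []
    for bags, w in zip(tasks, models):
        task_weights = []
        for instances, y in bags:
            if y < 0:
                task_weights.append([1.0] * len(instances))  # negatives are trusted
            else:
                p = [sigmoid(dot(w, x)) for x in instances]
                top = max(p)
                task_weights.append([pi / top for pi in p])
        new_weights.append(task_weights)
    return new_weights

def train(tasks, dim, rounds=5):
    """Alternate between fitting the classifiers and updating instance weights."""
    weights = [[[1.0] * len(inst) for inst, _ in bags] for bags in tasks]
    for _ in range(rounds):
        models = fit(tasks, weights, dim)
        weights = reweight(tasks, models)
    return models, weights

def predict_bag(w, instances):
    """A bag is predicted positive if any of its instances scores above zero."""
    return 1 if max(dot(w, x) for x in instances) > 0 else -1

# Toy data invented for illustration: x = (feature, bias term). In each positive
# bag the second instance is "noise" that resembles the negative instances.
TASK_A = [
    ([(2.0, 1.0), (-1.9, 1.0)], +1),
    ([(1.8, 1.0), (-2.1, 1.0)], +1),
    ([(-2.0, 1.0), (-1.8, 1.0)], -1),
    ([(-2.2, 1.0), (-1.9, 1.0)], -1),
]
TASK_B = [
    ([(2.2, 1.0), (-2.0, 1.0)], +1),
    ([(-1.9, 1.0), (-2.1, 1.0)], -1),
]

models, weights = train([TASK_A, TASK_B], dim=2)
```

In this sketch, `weights[t][k][i]` is the final weight of instance `i` in bag `k` of task `t`. The noise-like instance in each positive bag ends up with a smaller weight than the genuinely positive one, so it contributes less to the next fit. Task B has only two bags, and the shared part `w0` is what lets it borrow information from task A.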