Journal of Guangdong University of Technology ›› 2021, Vol. 38 ›› Issue (03): 22-28,47. doi: 10.12052/gdutxb.200120


A Semi-supervised Two-view Multiple-Instance Clustering Model

Cai Hao, Liu Bo

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2020-09-17 Online: 2021-05-10 Published: 2021-03-12
  • Corresponding author: Liu Bo (1978-), male, professor, Ph.D.; main research interests: machine learning and data mining. E-mail: csbliu@gmail.com
  • About the author: Cai Hao (1992-), male, Ph.D. candidate; main research interest: data mining. E-mail: caiyh9658@163.com
  • Funding:
    National Natural Science Foundation of China (61876044)

Abstract: A novel semi-supervised two-view multi-instance clustering model is proposed. It integrates a text view and an image view to solve the multi-instance image clustering problem when only a small number of labels are available. First, the model embeds concept factorization and a multi-instance kernel into a joint framework that learns an association matrix for each view and a cluster indicator matrix shared by both views. Then, the ${l_{2, 1}}$-norm is applied to learn the optimal association matrices and cluster indicator matrix. Furthermore, to enhance the discriminability between bags, the model forces the similarity between the cluster indicator vectors of bags with the same label toward 1 and the similarity between those of bags with different labels toward 0. Finally, an iterative updating algorithm is derived to optimize the model. Experimental results show that the proposed model outperforms existing multi-instance clustering models.

Key words: multi-instance learning, multi-view learning, concept factorization, multi-instance kernel function
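
Two of the building blocks named in the abstract can be illustrated with a minimal sketch. The abstract does not say which multi-instance kernel is used, so the code below assumes a normalized set kernel with an RBF instance kernel; the function names (`l21_norm`, `rbf`, `set_kernel`, `bag_kernel_matrix`) and the `gamma` parameter are hypothetical and chosen only for illustration. It is not the authors' implementation.

```python
import numpy as np


def l21_norm(M):
    """l_{2,1} norm: the sum of the Euclidean norms of the rows of M."""
    return float(np.sum(np.sqrt(np.sum(M * M, axis=1))))


def rbf(X, Y, gamma=1.0):
    """Pairwise RBF kernel between the rows of X and the rows of Y."""
    sq = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative distances caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def set_kernel(bag_a, bag_b, gamma=1.0):
    """Assumed bag-level kernel: sum of all instance-level kernel values."""
    return float(np.sum(rbf(bag_a, bag_b, gamma)))


def bag_kernel_matrix(bags, gamma=1.0):
    """Bag-by-bag kernel matrix, normalized so every diagonal entry is 1."""
    n = len(bags)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = set_kernel(bags[i], bags[j], gamma)
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)


# Each bag is an array of instances (rows); bags may differ in size.
bags = [
    np.array([[0.0, 0.0], [1.0, 0.0]]),
    np.array([[0.0, 1.0]]),
    np.array([[2.0, 2.0], [2.0, 3.0], [3.0, 3.0]]),
]
K = bag_kernel_matrix(bags, gamma=0.5)
```

A bag-level kernel matrix such as `K` is the kind of input a kernelized concept factorization works on, and the row-wise structure of `l21_norm` is why that norm tolerates outlying rows better than the squared Frobenius norm.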

CLC Number: 

  • TP391.4