Journal of Guangdong University of Technology ›› 2021, Vol. 38 ›› Issue (05): 16-23. doi: 10.12052/gdutxb.210053

Joint Graph Embedding and Feature Weighting for Unsupervised Feature Selection

Zhang Wei, Zhang Zhen-bin

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2021-03-26  Online: 2021-09-10  Published: 2021-07-13
  • About the first author: Zhang Wei (b. 1964), female, associate professor; her research interests include network security, data mining, collaborative computing, and pattern recognition
  • Supported by: the Key-Area Research and Development Program of Guangdong Province (2020B010166006); the National Natural Science Foundation of China (61972102); the Department of Education of Guangdong Province (粤教高函〔2018〕179号, 粤教高函〔2018〕1号); and the Science and Technology Program of Guangzhou (201903010107, 201802030011, 201802010026, 201802010042, 201604046017)


Abstract: In the area of feature selection, most existing methods cannot simultaneously capture the differing importance of individual features and enforce an orthogonality constraint on the projected subspace to improve the discriminative power of the selected features. To address this issue, a new unsupervised feature selection method called joint graph embedding and feature weighting (JGEFW) is proposed. Specifically, graph-embedding local structure learning is leveraged to obtain the similarity matrix and the cluster indicator matrix. The weight matrix, which characterizes the importance of each feature, is then learned through orthogonal regression, so that discriminative and non-redundant features can be selected. An alternating iterative optimization algorithm is further developed to solve the JGEFW model. Finally, experimental results on four publicly available datasets demonstrate that JGEFW outperforms the compared algorithms on most clustering evaluation metrics.

Key words: feature selection, feature weight, unsupervised learning, graph embedding
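The pipeline described in the abstract — graph-embedding structure learning to obtain a similarity matrix and a relaxed cluster indicator, followed by orthogonality-constrained regression whose weight matrix ranks the features — can be sketched as below. This is an illustrative approximation, not the paper's actual optimization: the fixed kNN heat-kernel graph, the spectral relaxation of the cluster indicator, and the closed-form orthogonal alignment step (which solves the linearized trace-maximization rather than the full orthogonal regression the paper solves by alternating iteration) all stand in for JGEFW's adaptively learned counterparts, and the function name `jgefw_sketch` is invented here.

```python
import numpy as np

def jgefw_sketch(X, n_clusters=3, k=5, n_selected=10):
    """Illustrative sketch: graph embedding + orthogonal regression
    feature ranking (a simplification of the JGEFW idea)."""
    n, d = X.shape
    # 1. Similarity matrix: heat-kernel weights restricted to k nearest
    #    neighbors (stands in for the adaptively learned similarity).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sigma = dist.mean() + 1e-12
    S = np.exp(-(dist ** 2) / (2 * sigma ** 2))
    far = np.argsort(dist, axis=1)[:, k + 1:]   # all but self + k nearest
    for i in range(n):
        S[i, far[i]] = 0.0
    S = (S + S.T) / 2
    # 2. Cluster indicator (relaxed): bottom eigenvectors of the
    #    normalized graph Laplacian.
    deg = S.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg + 1e-12))
    Ln = D_inv_sqrt @ (np.diag(deg) - S) @ D_inv_sqrt
    _, vecs = np.linalg.eigh(Ln)                # eigenvalues ascending
    F = vecs[:, :n_clusters]
    # 3. Orthogonality-constrained alignment: maximize tr(W^T X^T F)
    #    s.t. W^T W = I, closed form via SVD of X^T F.
    U, _, Vt = np.linalg.svd(X.T @ F, full_matrices=False)
    W = U @ Vt                                  # d x n_clusters, W^T W = I
    # 4. Rank features by the row norms of the weight matrix.
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:n_selected]
```

A call such as `jgefw_sketch(X, n_clusters=3, k=5, n_selected=4)` returns the indices of the four highest-weighted features; the orthonormal columns of W are what discourage redundant features from all receiving large weights.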

CLC number: TP391

References
[1] HU J, LI Y, GAO W, et al. Robust multi-label feature selection with dual-graph regularization [J]. Knowledge-Based Systems, 2020, 203: 106126.
[2] FEI L K, QIN J Y, TENG S H, et al. Hashing for approximate nearest neighbor search on big data: a survey [J]. Journal of Guangdong University of Technology, 2020, 37(3): 23-35.
[3] HOU C, NIE F, YI D, et al. Feature selection via joint embedding learning and sparse regression[C]//IJCAI International Joint Conference on Artificial Intelligence. Spain: Morgan Kaufmann, 2011: 1324-1329.
[4] WANG S, WANG H. Unsupervised feature selection via low-rank approximation and structure learning [J]. Knowledge-Based Systems, 2017, 124: 70-79.
[5] LIU Y F, LI W B, GAO Y. Adaptive neighborhood embedding based unsupervised feature selection [J]. Journal of Computer Research and Development, 2020, 57(8): 1639-1649.
[6] TENG S H, FENG Z Y, TENG L Y, et al. Joint low-rank representation and graph embedding for unsupervised feature selection [J]. Journal of Guangdong University of Technology, 2019, 36(5): 7-13.
[7] ZHENG W, YAN H, YANG J. Robust unsupervised feature selection by nonnegative sparse subspace learning [J]. Neurocomputing, 2019, 334: 156-171.
[8] DING D, YANG X, XIA F, et al. Unsupervised feature selection via adaptive hypergraph regularized latent representation learning [J]. Neurocomputing, 2020, 378: 79-97.
[9] TENG L, FENG Z, FANG X, et al. Unsupervised feature selection with adaptive residual preserving [J]. Neurocomputing, 2019, 367: 259-272.
[10] LIU X, WANG L, ZHANG J, et al. Global and local structure preservation for feature selection [J]. IEEE Transactions on Neural Networks and Learning Systems, 2014, 25(6): 1083-1095.
[11] DU L, SHEN Y D. Unsupervised feature selection with adaptive structure learning[C]//Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Sydney: ACM, 2015: 209-218.
[12] CAI D, ZHANG C, HE X. Unsupervised feature selection for multi-cluster data[C]//Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2010: 333-342.
[13] QIAN M, ZHAI C. Robust unsupervised feature selection[C]//IJCAI International Joint Conference on Artificial Intelligence. Beijing: Morgan Kaufmann, 2013: 1621-1627.
[14] LI Z, YANG Y, LIU J, et al. Unsupervised feature selection using nonnegative spectral analysis[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Toronto: AAAI, 2012: 1026-1032.
[15] NIE F, WANG X, HUANG H. Clustering and projected clustering with adaptive neighbors[C]//Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 977-986.
[16] XU X, WU X, WEI F, et al. A general framework for feature selection under orthogonal regression with global redundancy minimization [J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 99: 1-1.
[17] ZHAO H, WANG Z, NIE F. Orthogonal least squares regression for feature extraction [J]. Neurocomputing, 2016, 216: 200-207.
[18] WU X, XU X, LIU J, et al. Supervised feature selection with orthogonal regression and feature weighting [J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 32(5): 1831-1838.
[19] NIE F, ZHANG R, LI X. A generalized power iteration method for solving quadratic problem on the Stiefel manifold [J]. Science China Information Sciences, 2017, 60(11): 5-11.
[20] HUANG J, NIE F, HUANG H. A new simplex sparse learning model to measure data similarity for clustering[C]//IJCAI International Joint Conference on Artificial Intelligence. Buenos Aires: Morgan Kaufmann, 2015: 3569-3575.
[21] LIU Y, YE D, LI W, et al. Robust neighborhood embedding for unsupervised feature selection [J]. Knowledge-Based Systems, 2020, 193: 105462.
[22] LI X, ZHANG H, ZHANG R, et al. Generalized uncorrelated regression with adaptive graph for unsupervised feature selection [J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(5): 1587-1595.
[23] THARWAT A, GABER T, IBRAHIM A, et al. Linear discriminant analysis: a detailed tutorial [J]. AI Communications, 2017, 30(2): 169-190.
[24] SHANG R, WANG W, STOLKIN R, et al. Non-negative spectral learning and sparse regression-based dual-graph regularized feature selection [J]. IEEE Transactions on Cybernetics, 2018, 48(2): 793-806.