Journal of Guangdong University of Technology ›› 2018, Vol. 35 ›› Issue (05): 20-25, 50. doi: 10.12052/gdutxb.180023

Multi-label Feature Selection Algorithm Based on ReliefF and Mutual Information

Chen Ping-hua¹, Huang Hui¹, Mai Miao², Zhou Hong-hong³

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China;
  2. Guangdong Nanfang Media Group, Guangzhou 510601, China;
  3. Guangdong Science and Technology Innovation Monitoring and Research Center, Guangzhou 510033, China
  • Received: 2018-01-29 Online: 2018-07-10 Published: 2018-07-10

Abstract: Traditional feature selection algorithms cannot be applied directly to multi-label learning. To address this problem, the MML-RF algorithm is presented. Building on ReliefF, MML-RF changes how nearest neighbors are defined and searched, and introduces a new parameter that accounts for the contribution of each label. The improved weighting formula makes MML-RF applicable to multi-label datasets. MML-RF also uses mutual information as the measure of feature redundancy and proposes a redundancy-removal scheme that yields a smaller feature subset. Experiments show that the feature subset selected by MML-RF is smaller and achieves good classification performance on multi-label datasets, which can further improve the efficiency of subsequent multi-label learning and data mining.

Key words: feature selection, multi-label learning, ReliefF, mutual information, feature redundancy

CLC Number: TP181
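
The abstract says that mutual information serves as the measure of feature redundancy, but it does not give the redundancy-removal rule itself. The following is only a minimal sketch of one common way such a filter can work, not the MML-RF procedure: features are visited from highest to lowest relevance weight, and a feature is kept only if its mutual information with every feature already kept stays below a cut-off. The function names, the `threshold` parameter, the toy data, and the weights are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum p(a, b) * log2(p(a, b) / (p(a) * p(b))) over observed value pairs.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def drop_redundant(columns, weights, threshold):
    """Greedy redundancy filter (hypothetical rule, not taken from the paper).

    Visits features from highest to lowest weight and keeps a feature only if
    its mutual information with every already-kept feature is below threshold.
    Returns the indices of the kept features in the order they were selected.
    """
    order = sorted(range(len(columns)), key=lambda j: weights[j], reverse=True)
    kept = []
    for j in order:
        if all(mutual_information(columns[j], columns[k]) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: feature 1 duplicates feature 0; feature 2 is independent of both.
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
# Made-up relevance scores standing in for ReliefF-style feature weights.
weights = [0.9, 0.8, 0.5]

selected = drop_redundant(columns, weights, threshold=0.5)
print(selected)  # [0, 2]
```

In this toy run the duplicated feature is discarded because it shares a full bit of information with the higher-weighted feature 0, while the independent feature is retained. Continuous features would need to be discretized before an estimator like this one could be applied.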