Journal of Guangdong University of Technology ›› 2019, Vol. 36 ›› Issue (05): 14-19. DOI: 10.12052/gdutxb.180146

• Comprehensive Studies •

Learning Causal Skeleton by Using Lower Order Conditional Independence Tests

Hong Ying-han1,2, Hao Zhi-feng1,3, Mai Gui-zhen1, Chen Ping-hua1

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China;
  2. School of Physics & Electronic Engineering, Hanshan Normal University, Chaozhou 521041, China;
  3. School of Mathematics and Big Data, Foshan University, Foshan 528000, China
  • Received: 2018-11-05  Online: 2019-08-21  Published: 2019-08-06
  • Corresponding author: Mai Gui-zhen (b. 1985), female, Ph.D.; her main research interests include causality and machine learning. E-mail: mgz0323@126.com
  • About the first author: Hong Ying-han (b. 1984), male, associate professor and Ph.D. candidate; his main research interests include causality, machine learning, cloud computing, and data mining and its applications.
  • Funding: National Natural Science Foundation of China (61472089, 61572144); Science and Technology Planning Project of Guangdong Province (2015A030101101, 2015B090922014, 2017A040405063, 2017B030307002)

Abstract: Constraint-based methods can recover the causal relationships between variables in a dataset and construct the corresponding causal network. On high-dimensional data, however, their accuracy is low and their running time is high, which severely limits their practical application. To address these problems, a causal network structure learning method based on low-order conditional independence (CI) tests is proposed. CI tests with low-order conditioning sets are first used to quickly construct a rough causal skeleton; a split-merge strategy then divides this large rough skeleton into a set of smaller subnetworks, and the structure of each lower-dimensional subnetwork is learned independently, which improves accuracy. Finally, the subnetworks are integrated into a complete causal network graph. Experimental results demonstrate the feasibility of the method.
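
The abstract describes a two-stage procedure: prune edges cheaply with low-order conditional independence (CI) tests to obtain a rough causal skeleton, then split that skeleton into smaller subnetworks, learn each one separately, and merge the results. The sketch below illustrates only the first stage, under assumptions the paper does not fix: Gaussian data, a Fisher-z test on (partial) correlations as the CI test, and conditioning sets of size at most one. All function and variable names (fisher_z_ci_test, rough_skeleton, alpha, and so on) are illustrative, not the authors' implementation.

import numpy as np
from itertools import combinations
from scipy import stats


def fisher_z_ci_test(data, i, j, cond, alpha=0.05):
    """Return True if X_i is judged independent of X_j given X_cond.

    A Fisher-z test on the (partial) correlation; only conditioning sets of
    size 0 or 1 are used here, i.e. the tests stay low-order.
    """
    n = data.shape[0]
    corr = np.corrcoef(data[:, [i, j] + list(cond)], rowvar=False)
    if len(cond) == 0:
        r = corr[0, 1]
    else:  # first-order partial correlation via the inverse correlation matrix
        prec = np.linalg.inv(corr)
        r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))
    return p_value > alpha


def rough_skeleton(data, alpha=0.05):
    """Build a rough causal skeleton with order-0 and order-1 CI tests only.

    An edge i-j is removed as soon as some empty or single-variable
    conditioning set makes X_i and X_j independent; higher-order tests are
    deferred to the per-subnetwork refinement stage described in the abstract.
    """
    d = data.shape[1]
    adj = {v: set(range(d)) - {v} for v in range(d)}
    for i, j in combinations(range(d), 2):
        if fisher_z_ci_test(data, i, j, [], alpha):
            adj[i].discard(j)
            adj[j].discard(i)
            continue
        for k in (adj[i] | adj[j]) - {i, j}:
            if fisher_z_ci_test(data, i, j, [k], alpha):
                adj[i].discard(j)
                adj[j].discard(i)
                break
    return adj

The split-merge stage (partitioning this rough skeleton into smaller subnetworks, learning each subnetwork separately, and integrating the results into the final causal graph) would operate on the adjacency sets returned by rough_skeleton and is omitted from this sketch.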

Key words: causal structure learning, high-dimensional data, low order, conditional independence test

CLC number: TP301.6
Related articles in this journal:
[1] Liu Dong-ning, Wang Zi-qi, Zeng Yan-jiao, Wen Fu-yan, Wang Yang. A Gene Methylation Site Prediction Method Based on LSTM with Composite Coding Features [J]. Journal of Guangdong University of Technology, 2023, 40(01): 1-9.
[2] Hao Zhi-feng, Li Yi-ting, Cai Rui-chu, Zeng Yan, Qiao Jie. Research on the Shopping Behavior of Social Network Users Based on Causal Models [J]. Journal of Guangdong University of Technology, 2020, 37(03): 1-8.
[3] Zhou Yi-lu, Wang Zhen-you, Li Ye-zi, Li Feng. Secondary Generalization of the MOEA/D Aggregation Function and an Analysis of Its Optimization Performance [J]. Journal of Guangdong University of Technology, 2018, 35(04): 37-44.
[4] Li Qi-xiang, Xiao Yan-shan, Hao Zhi-feng, Ruan Yi-bang. Research on a Noise-Tolerant Multi-Task Multi-Instance Learning Algorithm [J]. Journal of Guangdong University of Technology, 2018, 35(03): 47-53.
[5] Xu Huan-fen, Liu Wei, Xie Yue-shan. A Dual-Population Fireworks Algorithm [J]. Journal of Guangdong University of Technology, 2017, 34(05): 65-72.