Robust Causal Discovery Algorithm Based on Enhanced Conditional Independence Tests

CLC number: TP18

Funding: New Generation Artificial Intelligence National Science and Technology Major Project (2021ZD0111501); National Science Fund for Excellent Young Scholars (62122022)

    Abstract:

    Causal discovery aims to uncover causal relationships among variables from observational data and is an important tool for understanding phenomena and changes in natural, social, and technological systems. One mainstream family of causal discovery methods is constraint-based: these algorithms determine the causal structure by testing conditional independence relations among variables. In practice, however, data collection is often limited by resources or technology, so the data may contain few samples and nodes with large variance. Under these conditions the accuracy of conditional independence tests drops sharply, and some causal edges are wrongly deleted from the learned causal graph, which degrades the accuracy of the algorithm's output. To address this, the paper proposes an enhanced conditional independence test whose core idea is to reduce, as far as possible, the interference of irrelevant external noise on the variables under test, thereby improving the accuracy of the test results. Building on this enhanced test, the paper further proposes a structure learning algorithm based on heuristic search: starting from an initial structure graph, it iteratively searches for mistakenly deleted causal edges and reconstructs the causal structure by combining the enhanced conditional independence test with score optimization. Experiments on simulated data, Bayesian network data, and real data show that, compared with existing methods, the proposed algorithm significantly improves both the F1 score and the structural Hamming distance (SHD), demonstrating that it recovers the underlying causal structure more accurately when samples are limited and the causal structure contains high-variance nodes.
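
The failure mode described in the abstract can be illustrated with a small simulation. The sketch below is not the enhanced test proposed in the paper, and the abstract does not say which baseline test the authors use; it assumes a standard Fisher-z partial-correlation test on linear-Gaussian data, as commonly used in constraint-based methods. All function names, the edge weight 0.3, the noise levels, and the sample sizes are illustrative assumptions. It shows (a) a conditional independence test on a chain X -> Z -> Y and (b) how often a true edge X -> Y is detected when Y has low versus high noise variance at a small sample size.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def fisher_z_pvalue(x, y, z=None):
    """Two-sided p-value for H0: X independent of Y (given Z, if supplied).

    Assumes linear-Gaussian data and at most one conditioning variable.
    """
    n, k = len(x), 0
    r = pearson(x, y)
    if z is not None:
        k = 1
        rxz, ryz = pearson(x, z), pearson(y, z)
        # First-order partial correlation of X and Y given Z.
        r = (r - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    stat = math.sqrt(n - k - 3) * abs(math.atanh(r))
    return math.erfc(stat / math.sqrt(2))  # equals 2 * (1 - Phi(stat))


def rejection_rate(noise_sd, n, trials=200, alpha=0.05, seed=0):
    """Fraction of trials in which the true edge X -> Y is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # Hypothetical edge weight 0.3; noise_sd controls the variance of Y.
        y = [0.3 * xi + rng.gauss(0, noise_sd) for xi in x]
        if fisher_z_pvalue(x, y) < alpha:
            hits += 1
    return hits / trials


def chain_pvalues(n=2000, seed=1):
    """Simulate X -> Z -> Y; return (marginal p, conditional p) for X vs Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return fisher_z_pvalue(x, y), fisher_z_pvalue(x, y, z)


if __name__ == "__main__":
    p_marg, p_cond = chain_pvalues()
    print(f"chain X->Z->Y: p(X,Y)={p_marg:.3g}, p(X,Y|Z)={p_cond:.3g}")
    print("detection rate, low-variance Y :", rejection_rate(0.5, n=50))
    print("detection rate, high-variance Y:", rejection_rate(5.0, n=50))
```

With only 50 samples, the high-variance setting should detect the true edge far less often than the low-variance one. In a constraint-based algorithm each missed detection means the edge is removed from the graph, which is the kind of erroneous deletion the abstract says the proposed search is meant to recover.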

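Cite this article

Hao Zhifeng, Wang Feixia, Chen Zhengming, Qiao Jie, Cai Ruichu. Robust Causal Discovery Algorithm Based on Enhanced Conditional Independence Tests. Journal of Software (软件学报), ,():1-19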

History
  • Received: 2024-04-08
  • Last revised: 2024-06-06
  • Published online: 2024-12-25