A causal discovery algorithm based on the prior selection of leaf nodes.
Neural Networks (IF 7.8), Pub Date: 2020-01-07, DOI: 10.1016/j.neunet.2019.12.020
Yan Zeng 1, Zhifeng Hao 2, Ruichu Cai 1, Feng Xie 1, Liang Ou 3, Ruihui Huang 4

In recent years, the Linear Non-Gaussian Acyclic Model (LiNGAM) has been widely used for discovering causal networks. However, LiNGAM-based solutions usually suffer from high computational complexity and unsatisfactory accuracy when the data are high-dimensional or the sample size is too small. These complexity and accuracy problems often originate from giving priority to root nodes when estimating a causal ordering. This paper therefore proposes, under a mild assumption, a causal discovery algorithm termed the GPL algorithm (the LiNGAM algorithm of Giving Priority to Leaf-nodes). It assigns priority to leaf nodes rather than root nodes. Since leaf nodes do not affect other nodes in a structure, a causal ordering can be estimated directly in a bottom-up way without additional operations such as a data-updating process. Proofs of both feasibility and superiority are given based on the properties of leaf nodes. Beyond the theoretical analysis, experiments on both synthetic and real-world data confirm that the GPL algorithm outperforms two other state-of-the-art algorithms in computational complexity and accuracy, especially on high-dimensional data (up to 200 dimensions) or small sample sizes (down to 100 samples for a dimension of 70).
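
The leaf-first idea in the abstract can be illustrated with a minimal Python sketch. This is not the GPL algorithm itself: the abstract does not state how leaves are identified, so the scoring rule below is an assumption. It relies on a standard LiNGAM property: when a leaf is regressed on all other remaining variables, the residual is its own noise term and is independent of those variables, whereas a non-leaf residual stays dependent on at least one of them. The names `leaf_score` and `bottom_up_order` are hypothetical, and the dependence score (correlations of squared terms) is a crude proxy that only works for skewed noise.

```python
import numpy as np


def _abs_corr(a, b):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(np.corrcoef(a, b)[0, 1])


def leaf_score(X, j, others):
    """Hypothetical helper: dependence between the residual of x_j
    (regressed on the other remaining variables) and those variables.

    Assumption: a crude third-order proxy for independence, informative
    only when the noise is skewed. It is not the paper's measure.
    """
    xj = X[:, j]
    Z = X[:, others]
    coef, *_ = np.linalg.lstsq(Z, xj, rcond=None)
    r = xj - Z @ coef
    return sum(
        _abs_corr(r ** 2, X[:, k]) + _abs_corr(r, X[:, k] ** 2)
        for k in others
    )


def bottom_up_order(X):
    """Estimate a causal ordering (root first) by repeatedly removing
    the variable that looks most like a leaf."""
    X = X - X.mean(axis=0)
    remaining = list(range(X.shape[1]))
    leaves_first = []
    while len(remaining) > 1:
        scores = {
            j: leaf_score(X, j, [k for k in remaining if k != j])
            for j in remaining
        }
        leaf = min(scores, key=scores.get)
        leaves_first.append(leaf)
        # A leaf affects no other variable, so it is simply dropped;
        # the remaining columns are left unchanged (no data update).
        remaining.remove(leaf)
    leaves_first.append(remaining[0])
    return leaves_first[::-1]


# Synthetic chain x0 -> x1 -> x2 with skewed (centered exponential) noise.
rng = np.random.default_rng(0)
n = 5000
e = rng.exponential(1.0, size=(n, 3)) - 1.0
x0 = e[:, 0]
x1 = x0 + e[:, 1]
x2 = x1 + e[:, 2]
X = np.column_stack([x0, x1, x2])

order = bottom_up_order(X)
print(order)
```

On this toy chain the sketch should place `x2` last and `x0` first. The point of the design is the removal step: after a leaf is identified its column is dropped and the rest of the data is reused as is, in contrast to a root-first search, which has to regress the chosen root out of every remaining variable before continuing.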

Last updated: 2020-01-07