High-dimensional causal discovery under non-Gaussianity
Biometrika (IF 2.4), Pub Date: 2019-10-25, DOI: 10.1093/biomet/asz055
Y. Samuel Wang, Mathias Drton

We consider graphical models based on a recursive system of linear structural equations. This implies that there is an ordering, $\sigma$, of the variables such that each observed variable $Y_v$ is a linear function of a variable-specific error term and the other observed variables $Y_u$ with $\sigma(u) < \sigma(v)$. The causal relationships, i.e., which other variables the linear functions depend on, can be described using a directed graph. It has been previously shown that when the variable-specific error terms are non-Gaussian, the exact causal graph, as opposed to a Markov equivalence class, can be consistently estimated from observational data. We propose an algorithm that yields consistent estimates of the graph also in high-dimensional settings in which the number of variables may grow at a faster rate than the number of observations, but in which the underlying causal structure features suitable sparsity; specifically, the maximum in-degree of the graph is controlled. Our theoretical analysis is couched in the setting of log-concave error distributions.
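
To make the model class concrete, the following is a minimal sketch of drawing data from a recursive linear structural equation system, $Y_v = \sum_{u:\,\sigma(u) < \sigma(v)} \beta_{vu} Y_u + \varepsilon_v$, with non-Gaussian errors. It only illustrates the data-generating setting described in the abstract and is not the estimation algorithm proposed in the paper. The function names `simulate_recursive_sem` and `max_in_degree`, the coefficient notation $\beta_{vu}$, the input layout, and the choice of Uniform(-1, 1) errors (one example of a non-Gaussian, log-concave distribution) are all assumptions made for illustration.

```python
import random


def simulate_recursive_sem(order, weights, n, seed=0):
    """Draw n samples from a recursive linear SEM with uniform errors.

    order   : variable names listed from first to last in the ordering sigma
              (hypothetical input layout)
    weights : dict mapping (parent, child) -> edge coefficient; each parent
              must come before its child in `order`
    """
    position = {v: i for i, v in enumerate(order)}
    for (u, v) in weights:
        if position[u] >= position[v]:
            raise ValueError(f"edge {u}->{v} violates the ordering")

    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        y = {}
        for v in order:
            # Uniform(-1, 1) errors: non-Gaussian and log-concave (an assumed choice).
            eps = rng.uniform(-1.0, 1.0)
            # Parents were generated earlier in this pass because they precede v.
            y[v] = eps + sum(
                b * y[u] for (u, child), b in weights.items() if child == v
            )
        samples.append(y)
    return samples


def max_in_degree(order, weights):
    """Largest number of parents of any variable in the directed graph."""
    counts = {v: 0 for v in order}
    for (_, child) in weights:
        counts[child] += 1
    return max(counts.values())
```

For example, `simulate_recursive_sem(["a", "b", "c"], {("a", "b"): 0.5, ("a", "c"): 2.0, ("b", "c"): -1.0}, n=100)` generates 100 observations from a three-variable graph whose maximum in-degree, the sparsity quantity mentioned in the abstract, is 2.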

Updated: 2019-10-25