Learning genetic and environmental graphical models from family data.
Statistics in Medicine (IF 2), Pub Date: 2020-04-28, DOI: 10.1002/sim.8545
Adèle H Ribeiro, Júlia Maria Pavan Soler

Many challenging problems in biomedical research rely on understanding how variables are associated with each other and influenced by genetic and environmental factors. Probabilistic graphical models (PGMs) are widely acknowledged as a natural and formal language for describing relationships among variables and have been used extensively to study complex diseases and traits. In this work, we propose methods that leverage observational Gaussian family data to learn a decomposition of undirected and directed acyclic PGMs according to the influence of genetic and environmental factors. Many structure learning algorithms rely heavily on conditional independence tests. For independent measurements of normally distributed variables, conditional independence can be tested through standard tests for zero partial correlation. In family data, the assumption of independent measurements does not hold, since related individuals are correlated, mainly because of genetic factors. Based on univariate polygenic linear mixed models, we propose tests that account for the familial dependence structure and allow us to assess separately the significance of the partial correlation due to genetic (between-family) factors and due to other factors, denoted here as environmental (within-family) factors. We then extend standard structure learning algorithms, including the IC/PC and the really fast causal inference (RFCI) algorithms, to Gaussian family data. The algorithms learn the most likely PGM and its decomposition into two components, one explained by genetic factors and the other by environmental factors. The proposed methods are evaluated by simulation studies and applied to the Genetic Analysis Workshop 13 simulated dataset, which captures significant features of the Framingham Heart Study.
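
For orientation, the standard zero-partial-correlation test that the abstract mentions for independent Gaussian measurements can be sketched as below. This is only the textbook Fisher z-test, not the family-data tests proposed by the authors, which are built on polygenic linear mixed models and are not reproduced here. The function name `partial_corr_test` and the simulated chain X <- Z -> Y are assumptions made for illustration.

```python
import math
import numpy as np

def partial_corr_test(data, i, j, cond=()):
    """Fisher z-test of H0: partial correlation of columns i and j given cond is 0.

    Assumes the rows of `data` are independent draws from a multivariate
    normal distribution (this does NOT hold for family data).
    Returns (sample partial correlation, two-sided p-value).
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    idx = [i, j, *cond]
    # Invert the covariance of the selected variables; the scaled
    # off-diagonal entry of this precision matrix is the partial correlation.
    prec = np.linalg.inv(np.cov(data[:, idx], rowvar=False))
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 1 - 1e-12), -1 + 1e-12)  # keep atanh finite
    # Fisher z-transform; approximately N(0, 1) under H0.
    z = math.atanh(r) * math.sqrt(n - len(cond) - 3)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return r, p

# Hypothetical illustration: X and Y depend on each other only through Z.
rng = np.random.default_rng(0)
zvar = rng.normal(size=2000)
x = zvar + rng.normal(size=2000)
y = zvar + rng.normal(size=2000)
sample = np.column_stack([x, y, zvar])

r_marg, p_marg = partial_corr_test(sample, 0, 1)        # marginal association
r_cond, p_cond = partial_corr_test(sample, 0, 1, (2,))  # conditioning on Z
```

In this simulated chain the marginal correlation between X and Y is clearly nonzero, while the partial correlation given Z is close to zero. Constraint-based algorithms such as PC use this kind of test to decide whether to remove an edge between two variables.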

Updated: 2020-07-02