Bayesian reconstruction of transmission trees from genetic sequences and uncertain infection times
Statistical Applications in Genetics and Molecular Biology (IF 0.8), Pub Date: 2020-12-01, DOI: 10.1515/sagmb-2019-0026
Hesam Montazeri 1 , Susan Little 2 , Mozhgan Mozaffarilegha 1 , Niko Beerenwinkel 3, 4 , Victor DeGruttola 5

Genetic sequence data of pathogens are increasingly used to investigate transmission dynamics in both endemic diseases and disease outbreaks. Such research can aid in the development of appropriate interventions and in the design of studies to evaluate them. Several computational methods have been proposed to infer transmission chains from sequence data; however, existing methods generally do not reconstruct transmission trees reliably, because genetic sequence data, or phylogenetic trees inferred from such data, contain insufficient information for accurate estimation of transmission chains. Here, we show by simulation studies that incorporating infection times, even when they are uncertain, can greatly improve the accuracy of transmission tree reconstruction. To achieve this improvement, we propose a Bayesian inference method using Markov chain Monte Carlo that draws samples directly from the space of transmission trees under the assumption of complete sampling of the outbreak. The likelihood of each transmission tree is computed with a phylogenetic model that treats the tree's internal nodes as transmission events. In a simulation study, we demonstrate that the accuracy of the reconstructed transmission trees depends mainly on the amount of information available on times of infection, and we show that the proposed method outperforms two alternative approaches when infection times are known up to specified degrees of certainty. In addition, we illustrate the use of a multiple imputation framework to study features of epidemic dynamics, such as the relationship between characteristics of nodes and the average number of outbound or inbound edges, which signify possible transmission events from and to nodes. We apply the proposed method to a transmission cluster in San Diego and to a dataset from the 2014 Sierra Leone Ebola virus outbreak, and investigate the impact of biological, behavioral, and demographic factors.
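
The sampling idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. A per-edge Poisson mutation count stands in for the phylogenetic likelihood, the index case is assumed to be known, the prior is assumed flat over valid combinations of tree and infection times, and every name (`sample_trees`, `windows`, `mu`) and the toy data are hypothetical. Uncertain infection times are represented as intervals, and a Metropolis-Hastings chain alternates between re-drawing an infector and re-drawing an infection time.

```python
import math
import random
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def is_valid(parent, times):
    """Every infector must be infected strictly before its infectee."""
    return all(times[p] < times[c] for c, p in parent.items())


def log_likelihood(parent, times, seqs, mu):
    """Toy stand-in (assumption): Poisson mutation counts along each edge."""
    total = 0.0
    for c, p in parent.items():
        lam = mu * (times[c] - times[p])  # expected substitutions on the edge
        d = hamming(seqs[c], seqs[p])
        total += d * math.log(lam) - lam - math.lgamma(d + 1)
    return total


def sample_trees(seqs, windows, root, mu=2.0, n_iter=5000, burn_in=1000, seed=0):
    """Metropolis-Hastings over (infector assignment, infection times)."""
    rng = random.Random(seed)
    n = len(seqs)
    times = [(lo + hi) / 2 for lo, hi in windows]
    parent = {i: root for i in range(n) if i != root}  # start from a star tree
    if not is_valid(parent, times):
        raise ValueError("root must precede all other cases at window midpoints")
    cur = log_likelihood(parent, times, seqs, mu)
    samples = []
    for it in range(n_iter):
        new_parent, new_times = dict(parent), list(times)
        if rng.random() < 0.5:
            # Move 1: re-draw the infector of one case among earlier-infected cases.
            c = rng.choice(sorted(parent))
            new_parent[c] = rng.choice([j for j in range(n) if times[j] < times[c]])
        else:
            # Move 2: re-draw one infection time uniformly within its window.
            i = rng.randrange(n)
            new_times[i] = rng.uniform(*windows[i])
        # Both proposals are symmetric, so the acceptance ratio is the
        # likelihood ratio; proposals that break the time ordering are rejected.
        if is_valid(new_parent, new_times):
            new = log_likelihood(new_parent, new_times, seqs, mu)
            if rng.random() < math.exp(min(0.0, new - cur)):
                parent, times, cur = new_parent, new_times, new
        if it >= burn_in:
            samples.append((dict(parent), list(times)))
    return samples


def edge_frequencies(samples):
    """Posterior frequency of each (infector, infectee) edge."""
    counts = Counter((p, c) for parent, _ in samples for c, p in parent.items())
    return {edge: k / len(samples) for edge, k in counts.items()}


def mean_out_degree(samples, n):
    """Average number of outbound edges per case across sampled trees."""
    totals = [0] * n
    for parent, _ in samples:
        for p in parent.values():
            totals[p] += 1
    return [t / len(samples) for t in totals]


# Hypothetical toy outbreak: four cases, case 0 is the assumed index case.
SEQS = ["AAAAAAAAAAAA", "CCCCAAAAAAAA", "CCCCGGAAAAAA", "TTTAAAAAAAAA"]
WINDOWS = [(0.0, 0.5), (1.5, 2.5), (3.0, 4.0), (1.0, 3.0)]

if __name__ == "__main__":
    draws = sample_trees(SEQS, WINDOWS, root=0)
    print(edge_frequencies(draws))
    print(mean_out_degree(draws, len(SEQS)))
```

Narrowing the intervals in `WINDOWS` shrinks the set of admissible infectors for each case, which is the mechanism by which timing information constrains the tree in this toy setting. The per-case mean out-degree computed from the sampled trees is the kind of summary that could then be related to case characteristics.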

Updated: 2020-12-01