Population Genetics of SARS-CoV-2: Disentangling Effects of Sampling Bias and Infection Clusters
Genomics, Proteomics & Bioinformatics (IF 11.5), Pub Date: 2020-07-12, DOI: 10.1016/j.gpb.2020.06.001
Qi Liu 1 , Shilei Zhao 1 , Cheng-Min Shi 2 , Shuhui Song 1 , Sihui Zhu 1 , Yankai Su 1 , Wenming Zhao 1 , Mingkun Li 3 , Yiming Bao 1 , Yongbiao Xue 1 , Hua Chen 3

A novel RNA virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is responsible for the ongoing outbreak of coronavirus disease 2019 (COVID-19). Population genetic analysis could be useful for investigating the origin and evolutionary dynamics of COVID-19. However, because of extensive sampling bias and the existence of infection clusters during the epidemic spread, direct application of existing approaches can lead to biased parameter estimates and data misinterpretation. In this study, we first present a robust estimator for the time to the most recent common ancestor (TMRCA) and the mutation rate, and then apply the approach to analyze 12,909 genomic sequences of SARS-CoV-2. The mutation rate is inferred to be 8.69 × 10⁻⁴ per site per year with a 95% confidence interval (CI) of [8.61 × 10⁻⁴, 8.77 × 10⁻⁴], and the TMRCA of the samples is inferred to be Nov 28, 2019 with a 95% CI of [Oct 20, 2019, Dec 9, 2019]. The results indicate that COVID-19 might have originated earlier than, and outside of, the Wuhan Seafood Market. We further demonstrate that genetic polymorphism patterns generated from infection clusters, including the enrichment of specific haplotypes and the temporal allele frequency trajectories, are similar to those caused by evolutionary forces such as natural selection. Our results show that population genetic methods need to be developed to efficiently disentangle the effects of sampling bias and infection clusters in order to gain insights into the evolutionary mechanism of SARS-CoV-2. Software implementing VirusMuT can be downloaded at https://bigd.big.ac.cn/biocode/tools/BT007081.
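To give a sense of scale for the reported mutation rate, the per-site figure can be converted into an expected number of substitutions per genome. The sketch below is illustrative arithmetic only and is not the VirusMuT estimator. The genome length (29,903 nt, the Wuhan-Hu-1 reference NC_045512.2), the function names, and the example sampling date are assumptions that do not appear in the abstract.

```python
from datetime import date

# Values reported in the abstract (substitutions per site per year).
RATE = 8.69e-4
RATE_CI = (8.61e-4, 8.77e-4)
TMRCA = date(2019, 11, 28)  # point estimate reported in the abstract

# Assumption (not in the abstract): length of the Wuhan-Hu-1
# reference genome, NC_045512.2.
GENOME_LENGTH = 29_903


def per_genome_rate(rate_per_site: float, genome_length: int = GENOME_LENGTH) -> float:
    """Expected substitutions per genome per year."""
    return rate_per_site * genome_length


def expected_substitutions(sample_date: date,
                           tmrca: date = TMRCA,
                           rate_per_site: float = RATE) -> float:
    """Expected substitutions accumulated along one lineage from the
    TMRCA to the sampling date, assuming a constant rate."""
    years = (sample_date - tmrca).days / 365.25
    return per_genome_rate(rate_per_site) * years


if __name__ == "__main__":
    point = per_genome_rate(RATE)
    low, high = (per_genome_rate(r) for r in RATE_CI)
    print(f"{point:.1f} substitutions per genome per year "
          f"(95% CI {low:.1f} to {high:.1f})")

    # Hypothetical sampling date, chosen only for illustration.
    print(f"{expected_substitutions(date(2020, 3, 1)):.1f} expected substitutions")
```

Under these assumptions the rate corresponds to roughly 26 substitutions per genome per year, so a genome sampled about three months after the inferred TMRCA would be expected to differ from the common ancestor at only a handful of sites.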




Updated: 2020-07-12