Characterizing geographical and temporal dynamics of novel coronavirus SARS-CoV-2 using informative subtype markers
bioRxiv - Genomics. Pub Date: 2020-07-10, DOI: 10.1101/2020.04.07.030759
Zhengqiao Zhao, Bahrad A. Sokhansanj, Charvi Malhotra, Kitty Zheng, Gail L. Rosen

We propose an efficient framework for genetic subtyping of SARS-CoV-2, the novel coronavirus that causes the COVID-19 pandemic. Efficient viral subtyping enables visualization and modeling of the geographic distribution and temporal dynamics of disease spread. Subtyping thereby advances the development of effective containment strategies and, potentially, therapeutic and vaccine strategies. However, identifying viral subtypes in real time is challenging: SARS-CoV-2 is a novel virus, and the pandemic is rapidly expanding. Viral subtypes may be difficult to detect for three reasons: the virus evolves rapidly, founder effects are more significant than selection pressure, and the clustering threshold for subtyping is not standardized. We propose to identify mutational signatures of available SARS-CoV-2 sequences using a population-based approach: an entropy measure followed by frequency analysis. These signatures, Informative Subtype Markers (ISMs), define a compact set of nucleotide sites that characterize the most variable (and thus most informative) positions in the viral genomes sequenced from different individuals. Through ISM compression, we find that certain distant nucleotide variants covary, including non-coding and ORF1ab sites that covary with the D614G spike protein mutation, which has become increasingly prevalent as the pandemic has spread. ISMs are also useful for downstream analyses, such as spatiotemporal visualization of viral dynamics. By analyzing sequence data available in the GISAID database, we validate the utility of ISM-based subtyping by comparing spatiotemporal analyses using ISMs to epidemiological studies of viral transmission in Asia, Europe, and the United States. In addition, we show the relationship of ISMs to phylogenetic reconstructions of SARS-CoV-2 evolution; ISMs can therefore play an important complementary role to phylogenetic tree-based analysis, such as that done in the Nextstrain project.
The developed pipeline dynamically generates ISMs for newly added SARS-CoV-2 sequences and updates the visualization of pandemic spatiotemporal dynamics. It is available on GitHub at https://github.com/EESI/ISM and via an interactive website at https://covid19-ism.coe.drexel.edu/.
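
The "entropy measure followed by frequency analysis" described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (see the GitHub repository for that). It assumes the sequences are already aligned to equal length, uses plain Shannon entropy per site, and ignores gaps and ambiguous bases. The function names, the toy sequences, and the threshold value are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of the characters observed at one aligned site."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def pick_informative_sites(aligned, threshold):
    """Return 0-based positions whose entropy exceeds `threshold` (an assumed cutoff)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i for i in range(length)
        if site_entropy([seq[i] for seq in aligned]) > threshold
    ]


def ism_frequencies(aligned, sites):
    """Concatenate the bases at the chosen sites into a marker string and count each one."""
    return Counter("".join(seq[i] for i in sites) for seq in aligned)


# Toy alignment (invented data, not real SARS-CoV-2 sequences).
toy = ["ACGTAC", "ACGTAC", "ATGTAC", "ATGTGC", "ACGTGC", "ATGTGC"]
sites = pick_informative_sites(toy, threshold=0.5)
isms = ism_frequencies(toy, sites)
print(sites)
print(isms.most_common())
```

In this toy alignment only two positions vary, so each sequence is reduced to a two-character marker, and counting those markers gives the subtype frequencies that a spatiotemporal analysis could then group by location and date.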

Updated: 2020-07-10