Genome-wide association analysis of COVID-19 mortality risk in SARS-CoV-2 genomes identifies mutation in the SARS-CoV-2 spike protein that colocalizes with P.1 of the Brazilian strain
Genetic Epidemiology ( IF 2.1 ) Pub Date : 2021-06-22 , DOI: 10.1002/gepi.22421
Georg Hahn 1 , Chloe M Wu 2 , Sanghun Lee 1, 3 , Sharon M Lutz 1, 4 , Surender Khurana 5 , Lindsey R Baden 6 , Sebastien Haneuse 1 , Dandi Qiao 7, 8 , Julian Hecker 4, 7 , Dawn L DeMeo 7, 8 , Rudolph E Tanzi 9 , Manish C Choudhary 7 , Behzad Etemad 7 , Abbas Mohammadi 7 , Elmira Esmaeilzadeh 7 , Michael H Cho 7, 8 , Jonathan Z Li 7 , Adrienne G Randolph 7, 10 , Nan M Laird 1 , Scott T Weiss 7, 8 , Edwin K Silverman 7, 8 , Katharina Ribbeck 2 , Christoph Lange 1, 7, 8

SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we examined the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method for the early identification of highly pathogenic strains to target for containment. Although we updated our analysis continuously, in December 2020 we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and tested variants for association with mortality using logistic regression. Of the 29,891 sequenced loci of the viral genome evaluated for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e−09 and 4.41e−23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were driven exclusively by the samples submitted from Brazil (p value of 4.90e−13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, which renders a GWAS follow-up analysis of the GISAID samples submitted after December 2020 invalid. The locus at 25,088 bp is located in the P.1 strain and later (April 2021) became one of the distinguishing loci (precisely, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells. 
Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate mutation frequencies that change rapidly over time in the presence of simultaneously changing case/control ratios. Improvements in the quality and availability of the associated metadata/patient information will also be important to fully utilize the potential of GWAS methodology in this field.
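
The per-locus association test described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract states only that variants were tested against mortality with logistic regression, so the single-predictor model without covariates, the Wald test, the Bonferroni threshold (0.05 divided by the number of loci), and all function names below (`fit_logistic`, `wald_p_value`, `scan_loci`) are assumptions made for illustration. The input is assumed to be a mapping from locus label to a 0/1 mutation indicator per sample, plus a 0/1 mortality indicator per sample.

```python
import math


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b1, se_b1), or (nan, nan) when the information matrix is
    singular (e.g. a monomorphic locus) or the fit does not converge.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to keep exp() safe
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            return math.nan, math.nan
        # Newton step: beta += H^{-1} g, with the 2x2 inverse written out
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            return b1, math.sqrt(h00 / det)  # Var(b1) = (H^{-1})[1][1]
    return math.nan, math.nan


def wald_p_value(beta, se):
    """Two-sided p value of the Wald statistic beta/se under a normal null."""
    if math.isnan(beta) or math.isnan(se) or se <= 0.0:
        return math.nan
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def scan_loci(genotypes, died, alpha=0.05):
    """Test each locus against mortality.

    Returns a list of (locus, beta, p, significant) tuples, where
    `significant` uses an assumed Bonferroni threshold alpha / n_loci.
    """
    threshold = alpha / len(genotypes)
    results = []
    for locus, alleles in genotypes.items():
        beta, se = fit_logistic(alleles, died)
        p = wald_p_value(beta, se)
        results.append((locus, beta, p, p < threshold))
    return results


if __name__ == "__main__":
    # Toy data (not from GISAID): 200 samples, two hypothetical loci.
    died = [1] * 60 + [0] * 40 + [1] * 20 + [0] * 80
    toy = {
        "locus_A": [1] * 100 + [0] * 100,  # mutation enriched among deaths
        "locus_B": [0] * 200,              # monomorphic, so untestable
    }
    for row in scan_loci(toy, died):
        print(row)
```

A scan of this kind compares samples as if they were drawn from one stable population. As the abstract notes, it does not handle mutation frequencies and case/control ratios that shift over time, so the sketch inherits that limitation.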

Updated: 2021-06-22