Epidemiological data analysis of viral quasispecies in the next-generation sequencing era.
Briefings in Bioinformatics ( IF 6.8 ) Pub Date : 2020-06-22 , DOI: 10.1093/bib/bbaa101
Sergey Knyazev 1 , Lauren Hughes 2 , Pavel Skums 3 , Alexander Zelikovsky 3

Abstract
The coverage offered by next-generation sequencing (NGS) technology has made it possible to assess the complexity of intra-host RNA viral populations at an unprecedented level of detail. Consequently, analysis of NGS datasets can be used to extract and infer crucial epidemiological and biomedical information at the level of both infected individuals and susceptible populations, enabling the development of more effective prevention strategies and antiviral therapeutics. Such information includes drug resistance, infection stage, transmission clusters and the structure of transmission networks. However, NGS data require sophisticated analysis that can deal with millions of error-prone short reads per patient. Prior to the NGS era, epidemiological and phylogenetic analyses were geared toward Sanger sequencing technology; now, they must be redesigned to handle large-scale NGS datasets and to properly model the evolution of heterogeneous, rapidly mutating viral populations. Additionally, dedicated epidemiological surveillance systems require big data analytics to handle millions of reads obtained from thousands of patients for rapid outbreak investigation and management. We survey bioinformatics tools that analyze NGS data for (i) characterization of intra-host viral population complexity, including single nucleotide variant and haplotype calling; (ii) downstream epidemiological analysis and inference of drug-resistant mutations, age of infection and linkage between patients; and (iii) data collection and analytics in surveillance systems for fast response to and control of outbreaks.


Updated: 2020-06-22