Reconstruction of Viral Variants via Monte Carlo Clustering.
Journal of Computational Biology (IF 1.7), Pub Date: 2023-09-11, DOI: 10.1089/cmb.2023.0154
Akshay Juyal 1, Roya Hosseini 1, Daniel Novikov 1, Mark Grinshpon 2, Alex Zelikovsky 1

Identifying viral variants through clustering is essential for understanding the composition and structure of viral populations within and between hosts, which play a crucial role in disease progression and epidemic spread. This article proposes and validates novel Monte Carlo (MC) methods for clustering aligned viral sequences by minimizing either entropy or Hamming distance from consensuses. We validate these methods on four benchmarks: two SARS-CoV-2 interhost data sets and two HIV intrahost data sets. A parallelized version of our tool is scalable to very large data sets. We show that both entropy and Hamming distance-based MC clusterings discern the meaningful information from sequencing data. The proposed clustering methods consistently converge to similar clusterings across different runs. Finally, we show that MC clustering improves reconstruction of intrahost viral population from sequencing data.
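
To make the idea of clustering by minimizing entropy or Hamming distance from consensuses concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the move set, the acceptance rule, or the exact form of the entropy objective, so the single-sequence reassignment move, the "accept if the cost does not increase" rule, the size-weighted column entropy, and all function names (`consensus`, `hamming_cost`, `entropy_cost`, `mc_cluster`) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def consensus(seqs):
    """Column-wise most frequent character of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def hamming_cost(seqs):
    """Total Hamming distance from each sequence to its cluster consensus."""
    if not seqs:
        return 0.0
    cons = consensus(seqs)
    return float(sum(a != b for s in seqs for a, b in zip(s, cons)))


def entropy_cost(seqs):
    """Sum of per-column Shannon entropies, weighted by cluster size (assumed form)."""
    n = len(seqs)
    total = 0.0
    for col in zip(*seqs):
        for count in Counter(col).values():
            p = count / n
            total -= p * math.log2(p)
    return n * total


def mc_cluster(seqs, k, cost=hamming_cost, steps=2000, seed=0):
    """Greedy Monte Carlo search: propose moving one random sequence to a
    random cluster and keep the move if the total cost does not increase."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in seqs]

    def total(lab):
        # Recomputed from scratch for clarity; a real tool would update incrementally.
        return sum(cost([s for s, l in zip(seqs, lab) if l == c]) for c in range(k))

    best = total(labels)
    for _ in range(steps):
        i = rng.randrange(len(seqs))
        old = labels[i]
        labels[i] = rng.randrange(k)
        new = total(labels)
        if new <= best:
            best = new          # accept the proposal
        else:
            labels[i] = old     # reject and restore the previous label
    return labels, best
```

Calling `mc_cluster(reads, k=2)` on a list of aligned strings returns one cluster label per sequence together with the final cost; passing `cost=entropy_cost` switches the objective. Running it with several `seed` values is one simple way to check whether different runs settle on similar clusterings.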

Updated: 2023-09-11