Unsupervised explainable AI for simultaneous molecular evolutionary study of forty thousand SARS-CoV-2 genomes
bioRxiv - Evolutionary Biology. Pub Date: 2020-10-12, DOI: 10.1101/2020.10.11.335406
Toshimichi Ikemura, Kennosuke Wada, Yoshiko Wada, Yuki Iwasaki, Takashi Abe

Unsupervised AI (artificial intelligence) can extract novel knowledge from big data without relying on particular models or prior knowledge, which makes it well suited to unveiling hidden features in such data. SARS-CoV-2 poses a serious threat to public health, and one important issue in characterizing this fast-evolving virus is to elucidate the various aspects of its genome sequence changes. We previously established an unsupervised AI method, the BLSOM (batch-learning self-organizing map), which can analyze five million genomic sequences simultaneously. The present study applied the BLSOM to the oligonucleotide compositions of forty thousand SARS-CoV-2 genomes. Although only the oligonucleotide composition was given, the resulting clusters of genomes corresponded primarily to the known main clades and to internal divisions within those clades. Because the BLSOM is an explainable AI, it reveals which features of the oligonucleotide composition are responsible for the clade clustering. The BLSOM also has powerful image display capabilities and enables efficient knowledge discovery about viral evolutionary processes.
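To make the input to such an analysis concrete, the following is a minimal sketch of how an oligonucleotide composition vector could be computed for one genome. It is not the authors' implementation and it does not include the BLSOM itself. The function name `oligo_composition`, the default oligonucleotide length `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration; the abstract does not state these details.

```python
from collections import Counter
from itertools import product


def oligo_composition(seq, k=4):
    """Return relative frequencies of all 4**k oligonucleotides of length k.

    Hypothetical helper for illustration only. The sequence is read with a
    sliding window of step 1; frequencies are returned in lexicographic
    order of the k-mers (AAAA, AAAC, ...).
    """
    # Assumption: sequences use the DNA alphabet; map U to T for RNA input.
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) never match an entry in
    # `kmers`, so they are ignored when normalizing.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Toy sequence, not real genome data.
toy = "ACGTACGTNNACGTTTGA"
vector = oligo_composition(toy, k=2)
print(len(vector))  # one entry per dinucleotide
```

Each genome then becomes one fixed-length vector, which is the kind of input a self-organizing map can cluster without any clade labels.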

Updated: 2020-10-13