SAINT: automatic taxonomy embedding and categorization by Siamese triplet network
bioRxiv - Bioinformatics Pub Date : 2021-01-21 , DOI: 10.1101/2021.01.20.426920
Yang Young Lu , Yiwen Wang , Fang Zhang , Jiaxing Bai , Ying Wang

Understanding the phylogenetic relationships among organisms is key to contemporary evolutionary study, and sequence analysis is the workhorse towards this goal. Conventional approaches to sequence analysis are based on sequence alignment, which is neither scalable to large datasets, owing to its computational inefficiency, nor adaptable to next-generation sequencing (NGS) data. Alignment-free approaches are typically used as computationally efficient alternatives, yet they still suffer from high memory consumption. A desirable large-scale sequence comparison method requires succinctly organized sequence data management, as well as prompt sequence retrieval given a never-before-seen sequence as the query.

Updated: 2021-01-22