Improved Large-Scale Homology Search by Two-Step Seed Search Using Multiple Reduced Amino Acid Alphabets
Genes ( IF 3.5 ) Pub Date : 2021-09-21 , DOI: 10.3390/genes12091455
Kazuki Takabatake 1 , Kazuki Izawa 1 , Motohiro Akikawa 1 , Keisuke Yanagisawa 1 , Masahito Ohue 1 , Yutaka Akiyama 1

Metagenomic analysis, a technique used to comprehensively analyze microorganisms present in the environment, requires performing high-precision homology searches on large amounts of sequencing data, the size of which has increased dramatically with the development of next-generation sequencing. NCBI BLAST is the most widely used software for performing homology searches, but its speed is insufficient for the throughput of current DNA sequencers. In this paper, we propose a new, high-performance homology search algorithm that employs a two-step seed search strategy using multiple reduced amino acid alphabets to identify highly similar subsequences. Additionally, we evaluated the validity of the proposed method against several existing tools. Our method was faster than any other existing program for ≤120,000 queries, while DIAMOND, an existing tool, was the fastest method for >120,000 queries.
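The abstract does not specify which reduced alphabets, seed lengths, or filtering rules the method uses, so the following Python sketch is only one plausible reading of a "two-step seed search using multiple reduced amino acid alphabets". It assumes a coarse alphabet for a fast first-pass index lookup and a finer alphabet for confirming a longer seed. The groupings `COARSE_GROUPS` and `FINE_GROUPS`, the seed lengths `k_coarse` and `k_fine`, and all function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical residue groupings (assumptions, not the paper's alphabets).
# Each fine group is a subset of one coarse group.
COARSE_GROUPS = ["AGPST", "CILMV", "DENQ", "HKR", "FWY"]
FINE_GROUPS = ["A", "G", "P", "ST", "C", "ILMV", "DE", "NQ", "H", "KR", "FWY"]


def make_reducer(groups):
    """Return a function that maps each residue to the first letter of its group."""
    table = {aa: group[0] for group in groups for aa in group}
    # Residues outside the 20 standard amino acids are left unchanged.
    return lambda seq: "".join(table.get(aa, aa) for aa in seq)


reduce_coarse = make_reducer(COARSE_GROUPS)
reduce_fine = make_reducer(FINE_GROUPS)


def build_index(subject, k):
    """Index every k-mer of the coarse-reduced subject by its start position."""
    index = defaultdict(list)
    reduced = reduce_coarse(subject)
    for i in range(len(reduced) - k + 1):
        index[reduced[i:i + k]].append(i)
    return index


def two_step_seed_search(query, subject, k_coarse=4, k_fine=6):
    """Return (query_pos, subject_pos) pairs that pass both seed steps."""
    index = build_index(subject, k_coarse)
    q_coarse, q_fine = reduce_coarse(query), reduce_fine(query)
    s_fine = reduce_fine(subject)
    hits = []
    for qi in range(len(query) - k_fine + 1):
        # Step 1: cheap lookup of a short seed under the coarse alphabet.
        for si in index.get(q_coarse[qi:qi + k_coarse], ()):
            if si + k_fine > len(subject):
                continue
            # Step 2: confirm a longer seed under the finer alphabet.
            if q_fine[qi:qi + k_fine] == s_fine[si:si + k_fine]:
                hits.append((qi, si))
    return hits


hits = two_step_seed_search("MKVLAT", "MKVLGPDDMRILAS")
```

In this toy input, the coarse step finds two candidate positions in the subject (0 and 8), and the fine step keeps only position 8, where `MRILAS` matches `MKVLAT` group by group. A real tool would follow surviving seeds with alignment extension and scoring, which this sketch omits.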

Updated: 2021-09-21