SeqScreen: accurate and sensitive functional screening of pathogenic sequences via ensemble learning
Genome Biology (IF 10.1), Pub Date: 2022-06-20, DOI: 10.1186/s13059-022-02695-x
Advait Balaji 1, Bryce Kille 1, Anthony D Kappell 2, Gene D Godbold 3, Madeline Diep 4, R A Leo Elworth 1, Zhiqin Qian 1, Dreycey Albin 1, Daniel J Nasko 5, Nidhi Shah 5, Mihai Pop 5, Santiago Segarra 6, Krista L Ternus 2, Todd J Treangen 1

The COVID-19 pandemic has emphasized the importance of accurate detection of known and emerging pathogens. However, robust characterization of pathogenic sequences remains an open challenge. To address this need, we developed SeqScreen, which accurately characterizes short nucleotide sequences using taxonomic and functional labels and a customized set of curated Functions of Sequences of Concern (FunSoCs) specific to microbial pathogenesis. We show that our ensemble machine learning model can label protein-coding sequences with FunSoCs with high recall and precision. SeqScreen is a step towards a novel paradigm of functionally informed synthetic DNA screening and pathogen characterization, and is available for download at www.gitlab.com/treangenlab/seqscreen.

Updated: 2022-06-20