INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis
Algorithms for Molecular Biology (IF 1), Pub Date: 2021-08-10, DOI: 10.1186/s13015-021-00198-1
Hooman Zabeti 1, Nick Dexter 2, Amir Hosein Safari 1, Nafiseh Sedaghat 1, Maxwell Libbrecht 1, Leonid Chindelevitch 3

Predicting drug resistance and identifying its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving it requires a predictive model that is transparent, accurate, and flexible. The methods currently used for this purpose rarely satisfy all three criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have higher predictive accuracy, but often lack interpretability and may learn patterns that produce accurate predictions for the wrong reasons. Current interpretable methods may either exhibit lower accuracy or lack the flexibility needed to generalize to previously unseen data. In this paper we propose a novel technique, inspired by group testing and Boolean compressed sensing, that simultaneously yields highly accurate predictions and interpretable results and is flexible enough to be optimized for various evaluation metrics. We test the predictive accuracy of our approach on five first-line and seven second-line antibiotics used for treating tuberculosis. We find that its accuracy is higher than or comparable to that of commonly used machine learning models, and that it identifies variants in genes previously reported to be associated with drug resistance. Our method is intrinsically interpretable and can be customized for different evaluation metrics. Our implementation is available at github.com/hoomanzabeti/INGOT_DR and can be installed from the Python Package Index (PyPI) under the name ingotdr. The package is also compatible with most of the tools in the scikit-learn machine learning library.
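To make the group-testing intuition concrete, here is a minimal sketch of an interpretable "OR rule" classifier: a strain is predicted resistant if it carries at least one variant from a small selected set. This is not the INGOT-DR implementation and does not use the ingotdr package's API; the abstract does not describe the optimization, so the sketch assumes noiseless labels and substitutes a simple greedy heuristic. The function names (fit_or_rule, predict) and the toy data are hypothetical.

```python
from typing import List, Set


def fit_or_rule(X: List[List[int]], y: List[int]) -> Set[int]:
    """Greedy stand-in for a group-testing decoder (assumes noiseless labels).

    X[i][j] == 1 means sample i carries variant j; y[i] == 1 means resistant.
    Returns the indices of the variants that make up the OR rule.
    """
    n_features = len(X[0])

    # Step 1: a variant seen in any susceptible sample cannot explain
    # resistance under the noiseless assumption, so it is ruled out.
    candidates = {
        j for j in range(n_features)
        if not any(row[j] for row, label in zip(X, y) if label == 0)
    }

    # Step 2: greedily pick the candidate variant that explains the most
    # still-unexplained resistant samples, which keeps the rule short.
    uncovered = {i for i, label in enumerate(y) if label == 1}
    rule: Set[int] = set()
    while uncovered:
        best = max(
            sorted(candidates - rule),
            key=lambda j: sum(X[i][j] for i in uncovered),
            default=None,
        )
        if best is None or sum(X[i][best] for i in uncovered) == 0:
            break  # no remaining candidate explains another resistant sample
        rule.add(best)
        uncovered = {i for i in uncovered if not X[i][best]}
    return rule


def predict(X: List[List[int]], rule: Set[int]) -> List[int]:
    """Predict resistant (1) if a sample carries any variant in the rule."""
    return [int(any(row[j] for j in rule)) for row in X]


# Toy data (hypothetical): 5 strains, 4 variants.
X = [
    [1, 0, 0, 1],  # resistant
    [0, 1, 0, 0],  # resistant
    [1, 1, 0, 0],  # resistant
    [0, 0, 1, 0],  # susceptible
    [0, 0, 0, 1],  # susceptible
]
y = [1, 1, 1, 0, 0]

rule = fit_or_rule(X, y)
print("Variants in rule:", sorted(rule))
print("Predictions:", predict(X, rule))
```

The learned rule is directly readable as a list of variants, which is the sense in which such a classifier is interpretable. Real data are noisy, so a practical method must tolerate some misclassified samples and trade them off against rule size, something this greedy sketch does not attempt.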

Updated: 2021-08-10