An automatic representation of peptides for effective antimicrobial activity classification
Computational and Structural Biotechnology Journal (IF 4.4), Pub Date: 2020-02-26, DOI: 10.1016/j.csbj.2020.02.002
Jesus A. Beltran, Gabriel Del Rio, Carlos A. Brizuela

Antimicrobial peptides (AMPs) are a promising alternative to small-molecule antibiotics. These peptides are part of the innate defense system of most living organisms. To computationally identify new AMPs among the peptides these organisms produce, an automatic AMP/non-AMP classifier is required. An efficient classifier in turn requires a set of robust features that capture what differentiates an AMP from a non-AMP. However, the number of candidate descriptors is too large (on the order of thousands) to allow an exhaustive search of all possible combinations. Therefore, efficient and effective feature selection techniques are required.
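To give a sense of scale, a set of n candidate descriptors has 2^n - 1 non-empty subsets. The sketch below counts them for the 272 molecular descriptors reported in this abstract. The evaluation rate of one million subsets per second is a hypothetical figure chosen only for illustration, not a measurement from the paper.

```python
def count_nonempty_subsets(n_descriptors: int) -> int:
    """Number of non-empty feature subsets of a set with n_descriptors items."""
    return 2 ** n_descriptors - 1


n = 272                      # descriptor count taken from the abstract
subsets = count_nonempty_subsets(n)

rate = 10 ** 6               # hypothetical: subsets evaluated per second
seconds_per_year = 365 * 24 * 3600
years = subsets // (rate * seconds_per_year)

print(f"{subsets:.3e} candidate subsets")
print(f"about {years:.3e} years at {rate} evaluations per second")
```

Even at this optimistic rate, enumerating every subset is out of reach, which is why a heuristic search is needed.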

In this work, we propose an efficient wrapper technique to solve the feature selection problem for AMP identification. The method is based on a genetic algorithm that uses a variable-length chromosome to represent the selected features and an objective function that considers both the Matthews correlation coefficient (MCC) and the number of selected features. Computational experiments show that the proposed method produces competitive results in terms of sensitivity, specificity, and MCC. Furthermore, the best classification results are achieved using only 39 of 272 molecular descriptors.
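The abstract does not give the exact objective function or genetic operators, so the following is only a minimal sketch of how such a scheme might look. It assumes a fitness of the form MCC minus a penalty proportional to the fraction of descriptors used; the weight `alpha`, the `mutate` operator, and all function names are hypothetical and not taken from the paper. In a real wrapper, `y_pred` would come from a classifier trained on the descriptors listed in the chromosome.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = AMP, 0 = non-AMP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fitness(chromosome, y_true, y_pred, n_total, alpha=0.1):
    """Assumed objective: reward MCC, penalize the fraction of descriptors used."""
    return mcc(y_true, y_pred) - alpha * len(chromosome) / n_total


def mutate(chromosome, n_total, rng):
    """Hypothetical variable-length mutation: drop or add one descriptor index."""
    genes = set(chromosome)
    if len(genes) > 1 and rng.random() < 0.5:
        genes.remove(rng.choice(sorted(genes)))
    else:
        unused = [i for i in range(n_total) if i not in genes]
        if unused:
            genes.add(rng.choice(unused))
    return sorted(genes)


rng = random.Random(0)
parent = [3, 17, 42]                 # indices of selected descriptors
child = mutate(parent, n_total=272, rng=rng)

# Toy labels standing in for the output of a classifier trained on `child`.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0]
print(child, fitness(child, y_true, y_pred, n_total=272))
```

Because the chromosome is a list of descriptor indices rather than a fixed-length bit mask, its length can shrink or grow under mutation, and the penalty term steers the search toward smaller subsets when MCC is equal.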




Updated: 2020-02-26