POSEIDON: Peptidic Objects SEquence-based Interaction with cellular DOmaiNs: a new database and predictor
Journal of Cheminformatics (IF 8.6), Pub Date: 2024-02-16, DOI: 10.1186/s13321-024-00810-7
António J. Preto, Ana B. Caniceiro, Francisco Duarte, Hugo Fernandes, Lino Ferreira, Joana Mourão, Irina S. Moreira

Cell-penetrating peptides (CPPs) are short chains of amino acids that have shown remarkable potential to cross the cell membrane and deliver coupled therapeutic cargoes into cells. Designing and testing different CPPs to target specific cells or tissues is crucial to ensure high delivery efficiency and reduced toxicity. However, in vivo and in vitro testing of many CPPs is both time-consuming and costly, which has led to interest in computational methodologies, such as Machine Learning (ML) approaches, as faster and cheaper routes to CPP design and uptake prediction. Most ML models developed to date focus on classification rather than regression, because informative quantitative uptake values have been lacking. To address these challenges, we developed POSEIDON, an open-access and up-to-date curated database that provides experimental quantitative uptake values for over 2,300 entries and physicochemical properties of 1,315 peptides. Each entry also records descriptors such as cell line, cargo, and sequence, among others. By leveraging this database along with cell line genomic features, we processed a dataset of over 1,200 entries to develop an ML regression predictor of CPP uptake. Our results demonstrated that the POSEIDON predictor accurately predicted peptide uptake in cell lines, achieving a Pearson correlation of 0.87, a Spearman correlation of 0.88, and an R² score of 0.76 on an independent test set. With its comprehensive and novel dataset and its strong predictive performance, the POSEIDON database and its associated ML predictor mark a significant step forward in CPP research and development. The POSEIDON database and ML predictor are freely available through a user-friendly interface at https://moreiralab.com/resources/poseidon/, making them valuable resources for advancing research on CPP-related topics.
Scientific Contribution Statement: Our research addresses the critical need for more efficient and cost-effective methodologies in Cell-Penetrating Peptide (CPP) research. We introduced POSEIDON, a comprehensive and freely accessible database that delivers quantitative uptake values for over 2,300 entries, along with detailed physicochemical profiles for 1,315 peptides. Recognizing the limitations of current Machine Learning (ML) models for CPP design, our work leveraged the rich dataset provided by POSEIDON to develop a highly accurate ML regression model for predicting CPP uptake.
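
For readers less familiar with the three scores quoted in the abstract, the following minimal Python sketch shows how Pearson correlation, Spearman correlation, and R² are commonly computed for a regression model. It is not the authors' evaluation code: the function names and the `measured` and `predicted` values are made up for illustration and are not taken from POSEIDON.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical uptake values, for illustration only (not POSEIDON data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 5.1]

print("Pearson :", pearson(measured, predicted))
print("Spearman:", spearman(measured, predicted))
print("R2      :", r2_score(measured, predicted))
```

The two correlations measure how well predictions track the measured values linearly (Pearson) or in rank order (Spearman), while R² also penalises systematic offsets and scaling errors, which is one reason the three scores are often reported together.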

Updated: 2024-02-17