Q-raKtion: A Semiautomated KNIME Workflow for Bioactivity Data Points Curation
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2022-11-28, DOI: 10.1021/acs.jcim.2c01199
Deborah Palazzotti, Martina Fiorelli, Stefano Sabatini, Serena Massari, Maria Letizia Barreca, Andrea Astolfi

The recent growth of freely available bioactivity data, stored as activity data points in chemogenomic repositories, provides a huge amount of ready-to-use information to support the development of predictive models. However, the benefits of this wealth of accessible information are strongly counteracted by the lack of uniformity and consistency among data from multiple sources, which requires a process of integration and harmonization. While various automated pipelines for processing and assessing chemical data have emerged in recent years, the curation of bioactivity data points is a less investigated topic: useful concepts have been proposed, but no tangible tools are available. In this context, the present work represents a first step toward filling this gap by providing a tool that meets the needs of end users in building proprietary high-quality data sets for further studies. Specifically, we herein describe Q-raKtion, a systematic, semiautomated, flexible, and, above all, customizable KNIME workflow that effectively aggregates information on the biological activities of compounds retrieved from two of the most comprehensive and widely used repositories, PubChem and ChEMBL.

Updated: 2022-11-28