-
AMADAR: a python-based package for large scale prediction of Diels–Alder transition state geometries and IRC path analysis J. Cheminform. (IF 5.514) Pub Date : 2022-06-15 Isamura, Bienfait K., Lobb, Kevin A.
Predicting transition state geometries is one of the most challenging tasks in computational chemistry, which often requires expert-based knowledge and permanent human intervention. This short communication reports technical details and preliminary results of a python-based tool (AMADAR) designed to generate any Diels–Alder (DA) transition state geometry (TS) and analyze determined IRC paths in a (quasi-)automated
-
Commentary: the first twelve years of the Journal of Cheminformatics J. Cheminform. (IF 5.514) Pub Date : 2022-06-13 Willett, Peter
This commentary provides an overview of the publications in, and the citations to, the first twelve volumes of the Journal of Cheminformatics, covering the period 2009–2020. The analysis is based on the 622 articles that have appeared in the journal during that time and that have been indexed in the Clarivate Web of Science Core Collection database. It is clear that the journal has established itself
-
KNIME workflow for retrieving causal drug and protein interactions, building networks, and performing topological enrichment analysis demonstrated by a DILI case study J. Cheminform. (IF 5.514) Pub Date : 2022-06-13 Füzi, Barbara, Malik-Sheriff, Rahuman S., Manners, Emma J., Hermjakob, Henning, Ecker, Gerhard F.
As an alternative to one drug-one target approaches, systems biology methods can provide a deeper insight into the holistic effects of drugs. Network-based approaches are tools of systems biology that can represent valuable methods for visualizing and analysing drug-protein and protein–protein interactions. In this study, a KNIME workflow is presented which connects drugs to causal target proteins
-
DECIMER—hand-drawn molecule images dataset J. Cheminform. (IF 5.514) Pub Date : 2022-06-09 Brinkhaus, Henning Otto, Zielesny, Achim, Steinbeck, Christoph, Rajan, Kohulan
The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the recognition of complex hand-drawn structure depictions is still at the beginning. Currently, there is no data for
-
Pharmacological affinity fingerprints derived from bioactivity data for the identification of designer drugs J. Cheminform. (IF 5.514) Pub Date : 2022-06-07 He, Kedan
Facing the continuous emergence of new psychoactive substances (NPS) and their threat to public health, more effective methods for NPS prediction and identification are critical. In this study, the pharmacological affinity fingerprints (Ph-fp) of NPS compounds were predicted by Random Forest classification models using bioactivity data from the ChEMBL database. The binary Ph-fp is the vector consisting
-
PIKAChU: a Python-based informatics kit for analysing chemical units J. Cheminform. (IF 5.514) Pub Date : 2022-06-07 Terlouw, Barbara R., Vromans, Sophie P. J. M., Medema, Marnix H.
As efforts to computationally describe and simulate the biochemical world become more commonplace, computer programs that are capable of in silico chemistry play an increasingly important role in biochemical research. While such programs exist, they are often dependency-heavy, difficult to navigate, or not written in Python, the programming language of choice for bioinformaticians. Here, we introduce
-
Probabilistic metabolite annotation using retention time prediction and meta-learned projections J. Cheminform. (IF 5.514) Pub Date : 2022-06-07 García, Constantino A., Gil-de-la-Fuente, Alberto, Barbas, Coral, Otero, Abraham
Retention time information is used for metabolite annotation in metabolomic experiments. But its usefulness is hindered by the availability of experimental retention time data in metabolomic databases, and by the lack of reproducibility between different chromatographic methods. Accurate prediction of retention time for a given chromatographic method would be a valuable support for metabolite annotation
-
Analysis of the benefits of imputation models over traditional QSAR models for toxicity prediction J. Cheminform. (IF 5.514) Pub Date : 2022-06-07 Walter, Moritz, Allen, Luke N., de la Vega de León, Antonio, Webb, Samuel J., Gillet, Valerie J.
Recently, imputation techniques have been adapted to predict activity values among sparse bioactivity matrices, showing improvements in predictive performance over traditional QSAR models. These models are able to use experimental activity values for auxiliary assays when predicting the activity of a test compound on a specific assay. In this study, we tested three different multi-task imputation techniques
-
RanDepict: Random chemical structure depiction generator J. Cheminform. (IF 5.514) Pub Date : 2022-06-06 Brinkhaus, Henning Otto, Rajan, Kohulan, Zielesny, Achim, Steinbeck, Christoph
The development of deep learning-based optical chemical structure recognition (OCSR) systems has led to a need for datasets of chemical structure depictions. The diversity of the features in the training data is an important factor for the generation of deep learning systems that generalise well and are not overfit to a specific type of input. In the case of chemical structure depictions, these features
-
InflamNat: web-based database and predictor of anti-inflammatory natural products J. Cheminform. (IF 5.514) Pub Date : 2022-06-04 Zhang, Ruihan, Ren, Shoupeng, Dai, Qi, Shen, Tianze, Li, Xiaoli, Li, Jin, Xiao, Weilie
Natural products (NPs) are a valuable source for anti-inflammatory drug discovery. However, they are limited by the unpredictability of the structures and functions. Therefore, computational and data-driven pre-evaluation could enable more efficient NP-inspired drug development. Since NPs possess structural features that differ from synthetic compounds, models trained with synthetic compounds may not
-
Chemical reaction network knowledge graphs: the OntoRXN ontology J. Cheminform. (IF 5.514) Pub Date : 2022-05-30 Garay-Ruiz, Diego, Bo, Carles
The organization and management of large amounts of data has become a major point in almost all areas of human knowledge. In this context, semantic approaches propose a structure for the target data, defining ontologies that state the types of entities on a certain field and how these entities are interrelated. In this work, we introduce OntoRXN, a novel ontology describing the reaction networks constructed
-
canSAR chemistry registration and standardization pipeline J. Cheminform. (IF 5.514) Pub Date : 2022-05-28 Dolciami, Daniela, Villasclaras-Fernandez, Eloy, Kannas, Christos, Meniconi, Mirco, Al-Lazikani, Bissan, Antolin, Albert A.
Integration of medicinal chemistry data from numerous public resources is an increasingly important part of academic drug discovery and translational research because it can bring a wealth of important knowledge related to compounds in one place. However, different data sources can report the same or related compounds in various forms (e.g., tautomers, racemates, etc.), thus highlighting the need of
-
Off-targetP ML: an open source machine learning framework for off-target panel safety assessment of small molecules J. Cheminform. (IF 5.514) Pub Date : 2022-05-07 Naga, Doha, Muster, Wolfgang, Musvasva, Eunice, Ecker, Gerhard F.
Unpredicted drug safety issues constitute the majority of failures in the pharmaceutical industry according to several studies. Some of these preclinical safety issues could be attributed to the non-selective binding of compounds to targets other than their intended therapeutic target, causing undesired adverse events. Consequently, pharmaceutical companies routinely run in-vitro safety screens to
-
Efficient 3D conformer generation of cyclic peptides formed by a disulfide bond J. Cheminform. (IF 5.514) Pub Date : 2022-05-03 Tao, Huanyu, Wu, Qilong, Zhao, Xuejun, Lin, Peicong, Huang, Sheng-You
Cyclic peptides formed by disulfide bonds have been one large group of common drug candidates in drug development. Structural information of a peptide is essential to understand its interaction with its target. However, due to the high flexibility of peptides, it is difficult to sample the near-native conformations of a peptide. Here, we have developed an extended version of our MODPEP approach, named
-
Diversifying cheminformatics J. Cheminform. (IF 5.514) Pub Date : 2022-04-25 Zdrazil, Barbara, Guha, Rajarshi
With Dr. Barbara Zdrazil starting in her role as Co-Editor-in-Chief in January 2022, we revisit the scope of J. Cheminform, as well as the role of cheminformatics as a discipline. We present our vision for the Journal in moving the field of cheminformatics as well as Open Science forward in the coming years. By joining the fields of chemistry and information technology for solving chemical problems
-
Surge: a fast open-source chemical graph generator J. Cheminform. (IF 5.514) Pub Date : 2022-04-23 McKay, Brendan D., Yirik, Mehmet Aziz, Steinbeck, Christoph
Chemical structure generators are used in cheminformatics to produce or enumerate virtual molecules based on a set of boundary conditions. The result can then be tested for properties of interest, such as adherence to measured data or for their suitability as drugs. The starting point can be a potentially fuzzy set of fragments or a molecular formula. In the latter case, the generator produces the
-
Machine learning to predict metabolic drug interactions related to cytochrome P450 isozymes J. Cheminform. (IF 5.514) Pub Date : 2022-04-15 Wang, Ning-Ning, Wang, Xiang-Gui, Xiong, Guo-Li, Yang, Zi-Yi, Lu, Ai-Ping, Chen, Xiang, Liu, Shao, Hou, Ting-Jun, Cao, Dong-Sheng
Drug–drug interaction (DDI) often causes serious adverse reactions and thus results in inestimable economic and social loss. Currently, comprehensive DDI evaluation has become a major challenge in pharmaceutical research due to the time-consuming and costly process of the experimental assessment and it is of high necessity to develop effective in silico methods to predict and evaluate DDIs accurately
-
Galaxy workflows for fragment-based virtual screening: a case study on the SARS-CoV-2 main protease J. Cheminform. (IF 5.514) Pub Date : 2022-04-12 Bray, Simon, Dudgeon, Tim, Skyner, Rachael, Backofen, Rolf, Grüning, Björn, von Delft, Frank
We present several workflows for protein-ligand docking and free energy calculation for use in the workflow management system Galaxy. The workflows are composed of several widely used open-source tools, including rDock and GROMACS, and can be executed on public infrastructure using either Galaxy’s graphical interface or the command line. We demonstrate the utility of the workflows by running a high-throughput
-
ChemInformatics Model Explorer (CIME): exploratory analysis of chemical model explanations J. Cheminform. (IF 5.514) Pub Date : 2022-04-04 Humer, Christina, Heberle, Henry, Montanari, Floriane, Wolf, Thomas, Huber, Florian, Henderson, Ryan, Heinrich, Julian, Streit, Marc
The introduction of machine learning to small molecule research, an inherently multidisciplinary field in which chemists and data scientists combine their expertise and collaborate, has been vital to making screening processes more efficient. In recent years, numerous models that predict pharmacokinetic properties or bioactivity have been published, and these are used on a daily basis by chemists
-
Systemic evolutionary chemical space exploration for drug discovery J. Cheminform. (IF 5.514) Pub Date : 2022-04-01 Lu, Chong, Liu, Shien, Shi, Weihua, Yu, Jun, Zhou, Zhou, Zhang, Xiaoxiao, Lu, Xiaoli, Cai, Faji, Xia, Ning, Wang, Yikai
Chemical space exploration is a major task of the hit-finding process during the pursuit of novel chemical entities. Compared with other screening technologies, computational de novo design has become a popular approach to overcome the limitation of current chemical libraries. Here, we report a de novo design platform named systemic evolutionary chemical space explorer (SECSE). The platform was conceptually
-
Explaining and avoiding failure modes in goal-directed generation of small molecules J. Cheminform. (IF 5.514) Pub Date : 2022-04-01 Langevin, Maxime, Vuilleumier, Rodolphe, Bianciotto, Marc
Despite growing interest and success in automated in-silico molecular design, questions remain regarding the ability of goal-directed generation algorithms to perform unbiased exploration of novel chemical spaces. A specific phenomenon has recently been highlighted: goal-directed generation guided with machine learning models produces molecules with high scores according to the optimization model, but
-
Transformer-based molecular optimization beyond matched molecular pairs J. Cheminform. (IF 5.514) Pub Date : 2022-03-28 He, Jiazhen, Nittinger, Eva, Tyrchan, Christian, Czechtizky, Werngard, Patronov, Atanas, Bjerrum, Esben Jannik, Engkvist, Ola
Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist’s intuition in terms of matched molecular
-
Decomposing compounds enables reconstruction of interaction fingerprints for structure-based drug screening J. Cheminform. (IF 5.514) Pub Date : 2022-03-15 Adasme, Melissa F., Bolz, Sarah Naomi, Al-Fatlawi, Ali, Schroeder, Michael
Structure-based drug repositioning has emerged as a promising alternative to conventional drug development. Regardless of the many success stories reported over the past years and the novel breakthroughs on the AI-based system AlphaFold for structure prediction, the availability of structural data for protein–drug complexes remains very limited. Whereas the chemical libraries contain millions of drug
-
A multitask GNN-based interpretable model for discovery of selective JAK inhibitors J. Cheminform. (IF 5.514) Pub Date : 2022-03-15 Wang, Yimeng, Gu, Yaxin, Lou, Chaofeng, Gong, Yuning, Wu, Zengrui, Li, Weihua, Tang, Yun, Liu, Guixia
The Janus kinase (JAK) family plays a pivotal role in most cytokine-mediated inflammatory and autoimmune responses via JAK/STAT signaling, and administration of JAK inhibitors is a promising therapeutic strategy for several diseases including COVID-19. However, to screen and design selective JAK inhibitors is a daunting task due to the extremely high homology among four JAK isoforms. In this study
-
Improving the performance of models for one-step retrosynthesis through re-ranking J. Cheminform. (IF 5.514) Pub Date : 2022-03-15 Lin, Min Htoo, Tu, Zhengkai, Coley, Connor W.
Retrosynthesis is at the core of organic chemistry. Recently, the rapid growth of artificial intelligence (AI) has spurred a variety of novel machine learning approaches for data-driven synthesis planning. These methods learn complex patterns from reaction databases in order to predict, for a given product, sets of reactants that can be used to synthesise that product. However, their performance as
-
ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding J. Cheminform. (IF 5.514) Pub Date : 2022-03-15 Wang, Junjie, Wen, NaiFeng, Wang, Chunyu, Zhao, Lingling, Cheng, Liang
Drug-target binding affinity (DTA) reflects the strength of the drug-target interaction; therefore, predicting the DTA can considerably benefit drug discovery by narrowing the search space and pruning drug-target (DT) pairs with low binding affinity scores. Representation learning using deep neural networks has achieved promising performance compared with traditional machine learning methods; hence
-
Correction to: TorsiFlex: an automatic generator of torsional conformers. Application to the twenty proteinogenic amino acids J. Cheminform. (IF 5.514) Pub Date : 2022-03-14 Ferro‑Costas, David, Mosquera‑Lois, Irea, Fernandez‑Ramos, Antonio
Following publication of the original article [1], the have been notified by the authors that some corrections were missed. 1. The caption of Fig. 6 has been corrected. 2. A few other minor corrections in the body of the text. The original article has been corrected. Ferro-Costas D, Mosquera-Lois I, Fernandez-Ramos A et al (2021) TorsiFlex: an automatic generator of torsional conformers. Application
-
Deep learning-driven prediction of drug mechanism of action from large-scale chemical-genetic interaction profiles J. Cheminform. (IF 5.514) Pub Date : 2022-03-12 Liu, Chengyou, Hogan, Andrew M., Sturm, Hunter, Khan, Mohd Wasif, Islam, Md. Mohaiminul, Rahman, A. S. M. Zisanur, Davis, Rebecca, Cardona, Silvia T., Hu, Pingzhao
Chemical–genetic interaction profiling is a genetic approach that quantifies the susceptibility of a set of mutants depleted in specific gene product(s) to a set of chemical compounds. With the recent advances in artificial intelligence, chemical–genetic interaction profiles (CGIPs) can be leveraged to predict mechanism of action of compounds. This can be achieved by using machine learning, where the
-
Application of deep metric learning to molecular graph similarity J. Cheminform. (IF 5.514) Pub Date : 2022-03-12 Coupry, Damien E., Pogány, Peter
Graph based methods are increasingly important in chemistry and drug discovery, with applications ranging from QSAR to molecular generation. Combining graph neural networks and deep metric learning concepts, we expose a framework for quantifying molecular graph similarity based on distance between learned embeddings separate from any endpoint. Using a minimal definition of similarity, and data from
-
MolData, a molecular benchmark for disease and target based machine learning J. Cheminform. (IF 5.514) Pub Date : 2022-03-07 Keshavarzi Arshadi, Arash, Salem, Milad, Firouzbakht, Arash, Yuan, Jiann Shiun
Deep learning’s automatic feature extraction has been a revolutionary addition to computational drug discovery, infusing both the capabilities of learning abstract features and discovering complex molecular patterns via learning from molecular data. Since biological and chemical knowledge are necessary for overcoming the challenges of data curation, balancing, training, and evaluation, it is important
-
DeSIDE-DDI: interpretable prediction of drug-drug interactions using drug-induced gene expressions J. Cheminform. (IF 5.514) Pub Date : 2022-03-04 Kim, Eunyoung, Nam, Hojung
Adverse drug-drug interaction (DDI) is a major concern in polypharmacy due to its unexpected adverse side effects and must be identified at an early stage of drug discovery and development. Many computational methods have been proposed for this purpose, but most require specific types of information, or pay little attention to interpreting the underlying genes. We propose a deep learning-based framework
-
PSnpBind: a database of mutated binding site protein–ligand complexes constructed using a multithreaded virtual screening workflow J. Cheminform. (IF 5.514) Pub Date : 2022-02-28 Ammar, Ammar, Cavill, Rachel, Evelo, Chris, Willighagen, Egon
A key concept in drug design is how natural variants, especially the ones occurring in the binding site of drug targets, affect the inter-individual drug response and efficacy by altering binding affinity. These effects have been studied on very limited and small datasets while, ideally, a large dataset of binding affinity changes due to binding site single-nucleotide polymorphisms (SNPs) is needed
-
GloMPO (Globally Managed Parallel Optimization): a tool for expensive, black-box optimizations, application to ReaxFF reparameterizations J. Cheminform. (IF 5.514) Pub Date : 2022-02-16 Freitas Gustavo, Michael, Verstraelen, Toon
In this work we explore the properties which make many real-life global optimization problems extremely difficult to handle, and some of the common techniques used in literature to address them. We then introduce a general optimization management tool called GloMPO (Globally Managed Parallel Optimization) to help address some of the challenges faced by practitioners. GloMPO manages and shares information
-
Reproducible untargeted metabolomics workflow for exhaustive MS2 data acquisition of MS1 features J. Cheminform. (IF 5.514) Pub Date : 2022-02-16 Yu, Miao, Dolios, Georgia, Petrick, Lauren
Unknown features in untargeted metabolomics and non-targeted analysis (NTA) are identified using fragment ions from MS/MS spectra to predict the structures of the unknown compounds. Selection of the precursor ion for fragmentation is commonly performed using data dependent acquisition (DDA) strategies or following statistical analysis using targeted MS/MS approaches. However, the selected precursor ions
-
Sequence-based prediction of protein binding regions and drug–target interactions J. Cheminform. (IF 5.514) Pub Date : 2022-02-08 Lee, Ingoo, Nam, Hojung
Identifying drug–target interactions (DTIs) is important for drug discovery. However, searching all drug–target spaces poses a major bottleneck. Therefore, recently many deep learning models have been proposed to address this problem. However, the developers of these deep learning models have neglected interpretability in model construction, which is closely related to a model’s performance. We hypothesized
-
Machine learning approaches to optimize small-molecule inhibitors for RNA targeting J. Cheminform. (IF 5.514) Pub Date : 2022-02-02 Grimberg, Hadar, Tiwari, Vinay S., Tam, Benjamin, Gur-Arie, Lihi, Gingold, Daniela, Polachek, Lea, Akabayov, Barak
In the era of data science, data-driven algorithms have emerged as powerful platforms that can consolidate bioisosteric rules for preferential modifications on small molecules with a common molecular scaffold. Here we present complementary data-driven algorithms to minimize the search in chemical space for phenylthiazole-containing molecules that bind the RNA hairpin within the ribosomal peptidyl transferase
-
LEADD: Lamarckian evolutionary algorithm for de novo drug design J. Cheminform. (IF 5.514) Pub Date : 2022-01-15 Kerstjens, Alan, De Winter, Hans
Given an objective function that predicts key properties of a molecule, goal-directed de novo molecular design is a useful tool to identify molecules that maximize or minimize said objective function. Nonetheless, a common drawback of these methods is that they tend to design synthetically unfeasible molecules. In this paper we describe a Lamarckian evolutionary algorithm for de novo drug design (LEADD)
-
Uncertainty-aware prediction of chemical reaction yields with graph neural networks J. Cheminform. (IF 5.514) Pub Date : 2022-01-10 Kwon, Youngchun, Lee, Dongseon, Choi, Youn-Suk, Kang, Seokho
In this paper, we present a data-driven method for the uncertainty-aware prediction of chemical reaction yields. The reactants and products in a chemical reaction are represented as a set of molecular graphs. The predictive distribution of the yield is modeled as a graph neural network that directly processes a set of graphs with permutation invariance. Uncertainty-aware learning and inference are
-
HobPre: accurate prediction of human oral bioavailability for small molecules J. Cheminform. (IF 5.514) Pub Date : 2022-01-06 Wei, Min, Zhang, Xudong, Pan, Xiaolin, Wang, Bo, Ji, Changge, Qi, Yifei, Zhang, John Z. H.
Human oral bioavailability (HOB) is a key factor in determining the fate of new drugs in clinical trials. HOB is conventionally measured using expensive and time-consuming experimental tests. The use of computational models to evaluate HOB before the synthesis of new drugs will be beneficial to the drug development process. In this study, a total of 1588 drug molecules with HOB data were collected
-
TorsiFlex: an automatic generator of torsional conformers. Application to the twenty proteinogenic amino acids J. Cheminform. (IF 5.514) Pub Date : 2021-12-24 Ferro-Costas, David, Mosquera-Lois, Irea, Fernández-Ramos, Antonio
In this work, we introduce TorsiFlex, a user-friendly software written in Python 3 and designed to find all the torsional conformers of flexible acyclic molecules in an automatic fashion. For the mapping of the torsional potential energy surface, the algorithm implemented in TorsiFlex combines two searching strategies: preconditioned and stochastic. The former is a type of systematic search based on
-
Processing binding data using an open-source workflow J. Cheminform. (IF 5.514) Pub Date : 2021-12-11 Samuel, Errol L. G., Holmes, Secondra L., Young, Damian W.
The thermal shift assay (TSA)—also known as differential scanning fluorimetry (DSF), thermofluor, and Tm shift—is one of the most popular biophysical screening techniques used in fragment-based ligand discovery (FBLD) to detect protein–ligand interactions. By comparing the thermal stability of a target protein in the presence and absence of a ligand, potential binders can be identified. The technique
-
Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms J. Cheminform. (IF 5.514) Pub Date : 2021-12-11 Ye, Zhuyifan, Ouyang, Defang
Rapid solvent selection is of great significance in chemistry. However, solubility prediction remains a crucial challenge. This study aimed to develop machine learning models that can accurately predict compound solubility in organic solvents. A dataset containing 5081 experimental temperature and solubility data of compounds in organic solvents was extracted and standardized. Molecular fingerprints
-
ChemTables: a dataset for semantic classification on tables in chemical patents J. Cheminform. (IF 5.514) Pub Date : 2021-12-11 Zhai, Zenan, Druckenbrodt, Christian, Thorne, Camilo, Akhondi, Saber A., Nguyen, Dat Quoc, Cohn, Trevor, Verspoor, Karin
Chemical patents are a commonly used channel for disclosing novel compounds and reactions, and hence represent important resources for chemical and pharmaceutical research. Key chemical data in patents is often presented in tables. Both the number and the size of tables can be very large in patent documents. In addition, various types of information can be presented in tables in patents, including
-
Splitting chemical structure data sets for federated privacy-preserving machine learning J. Cheminform. (IF 5.514) Pub Date : 2021-12-07 Simm, Jaak, Humbeck, Lina, Zalewski, Adam, Sturm, Noe, Heyndrickx, Wouter, Moreau, Yves, Beck, Bernd, Schuffenhauer, Ansgar
With the increase in applications of machine learning methods in drug design and related fields, the challenge of designing sound test sets becomes more and more prominent. The goal of this challenge is to have a realistic split of chemical structures (compounds) between training, validation and test set such that the performance on the test set is meaningful to infer the performance in a prospective
-
Exploration and augmentation of pharmacological space via adversarial auto-encoder model for facilitating kinase-centric drug development J. Cheminform. (IF 5.514) Pub Date : 2021-12-06 Bai, Xinyu, Yin, Yuxin
Predicting compound–protein interactions (CPIs) is of great importance for drug discovery and repositioning, yet still challenging mainly due to the sparse nature of CPI matrixes, resulting in poor generalization performance. Hence, unlike typical CPI prediction models focused on representation learning or model selection, we propose a deep neural network-based strategy, PCM-AAE, that re-explores and
-
MERMAID: an open source automated hit-to-lead method based on deep reinforcement learning J. Cheminform. (IF 5.514) Pub Date : 2021-11-27 Erikawa, Daiki, Yasuo, Nobuaki, Sekijima, Masakazu
The hit-to-lead process makes the physicochemical properties of the hit molecules that show the desired type of activity obtained in the screening assay more drug-like. Deep learning-based molecular generative models are expected to contribute to the hit-to-lead process. The simplified molecular input line entry system (SMILES), which is a string of alphanumeric characters representing the chemical
-
Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network J. Cheminform. (IF 5.514) Pub Date : 2021-11-27 Chen, Jiarui, Si, Yain-Whar, Un, Chon-Wai, Siu, Shirley W. I.
As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are these experiments time consuming and costly, but experiments that involve animal testing are increasingly subject
-
The effect of noise on the predictive limit of QSAR models J. Cheminform. (IF 5.514) Pub Date : 2021-11-25 Kolmar, Scott S., Grulke, Christopher M.
A key challenge in the field of Quantitative Structure Activity Relationships (QSAR) is how to effectively treat experimental error in the training and evaluation of computational models. It is often assumed in the field of QSAR that models cannot produce predictions which are more accurate than their training data. Additionally, it is implicitly assumed, by necessity, that data points in test sets
-
Development of a chemogenomics library for phenotypic screening J. Cheminform. (IF 5.514) Pub Date : 2021-11-24 Dafniet, Bryan, Cerisier, Natacha, Boezio, Batiste, Clary, Anaelle, Ducrot, Pierre, Dorval, Thierry, Gohier, Arnaud, Brown, David, Audouze, Karine, Taboureau, Olivier
With the development of advanced technologies in cell-based phenotypic screening, phenotypic drug discovery (PDD) strategies have re-emerged as promising approaches in the identification and development of novel and safe drugs. However, phenotypic screening does not rely on knowledge of specific drug targets and needs to be combined with chemical biology approaches to identify therapeutic targets and
-
Unexpected similarity between HIV-1 reverse transcriptase and tumor necrosis factor binding sites revealed by computer vision J. Cheminform. (IF 5.514) Pub Date : 2021-11-23 Eguida, Merveille, Rognan, Didier
Rationalizing the identification of hidden similarities across the repertoire of druggable protein cavities remains a major hurdle to a true proteome-wide structure-based discovery of novel drug candidates. We recently described a new computational approach (ProCare), inspired by numerical image processing, to identify local similarities in fragment-based subpockets. During the validation of the method
-
DockStream: a docking wrapper to enhance de novo molecular design J. Cheminform. (IF 5.514) Pub Date : 2021-11-17 Guo, Jeff, Janet, Jon Paul, Bauer, Matthias R., Nittinger, Eva, Giblin, Kathryn A., Papadopoulos, Kostas, Voronov, Alexey, Patronov, Atanas, Engkvist, Ola, Margreitter, Christian
Recently, we have released the de novo design platform REINVENT in version 2.0. This improved and extended iteration supports far more features and scoring function components, which allows bespoke and tailor-made protocols to maximize impact in small molecule drug discovery projects. A major obstacle of generative models is producing active compounds, in which predictive (QSAR) models have been applied
-
Molecular generation by Fast Assembly of (Deep)SMILES fragments J. Cheminform. (IF 5.514) Pub Date : 2021-11-14 Berenger, Francois, Tsuda, Koji
In recent years, in silico molecular design is regaining interest. To generate on a computer molecules with optimized properties, scoring functions can be coupled with a molecular generator to design novel molecules with a desired property profile. In this article, a simple method is described to generate only valid molecules at high frequency (>300,000 molecules/s using a single CPU core), given
-
Deep scaffold hopping with multimodal transformer neural networks J. Cheminfom. (IF 5.514) Pub Date : 2021-11-13 Zheng, Shuangjia, Lei, Zengrong, Ai, Haitao, Chen, Hongming, Deng, Daiguo, Yang, Yuedong
Scaffold hopping is a central task of modern medicinal chemistry for rational drug design, which aims to design molecules with novel scaffolds that share similar target biological activities with known hit molecules. Traditionally, scaffold hopping depends on searching databases of available compounds, which cannot exploit the vast chemical space. In this study, we have re-formulated this task as a supervised
-
Semi-automated workflow for molecular pair analysis and QSAR-assisted transformation space expansion J. Cheminfom. (IF 5.514) Pub Date : 2021-11-13 Yang, Zi-Yi, Fu, Li, Lu, Ai-Ping, Liu, Shao, Hou, Ting-Jun, Cao, Dong-Sheng
In the process of drug discovery, the optimization of lead compounds has always been a challenge faced by pharmaceutical chemists. Matched molecular pair analysis (MMPA), a promising tool to efficiently extract and summarize the relationship between structural transformation and property change, is suitable for local structural optimization tasks. In particular, the integration of MMPA with QSAR modeling
-
DrugEx v2: de novo design of drug molecules by Pareto-based multi-objective reinforcement learning in polypharmacology J. Cheminfom. (IF 5.514) Pub Date : 2021-11-12 Liu, Xuhan, Ye, Kai, van Vlijmen, Herman W. T., Emmerich, Michael T. M., IJzerman, Adriaan P., van Westen, Gerard J. P.
In polypharmacology, drugs are required to bind to multiple specific targets, for example to enhance efficacy or to reduce resistance formation. Although deep learning has achieved a breakthrough in de novo design in drug discovery, most of its applications focus on only a single drug target to generate drug-like active molecules. However, in reality drug molecules often interact with more than one
-
MS2DeepScore: a novel deep learning similarity measure to compare tandem mass spectra J. Cheminfom. (IF 5.514) Pub Date : 2021-10-29 Huber, Florian, van der Burg, Sven, van der Hooft, Justin J. J., Ridder, Lars
Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural
-
QSPR modeling of selectivity at infinite dilution of ionic liquids J. Cheminfom. (IF 5.514) Pub Date : 2021-10-26 Klimenko, Kyrylo, Carrera, Gonçalo V. S. M.
The intelligent choice of extractants and entrainers can improve current mixture separation techniques, allowing better efficiency and sustainability of chemical processes that are used in both industry and laboratory practice. The most promising approach is a straightforward comparison of selectivity at infinite dilution between potential candidates. However, selectivity at infinite dilution values
-
Classifying natural products from plants, fungi or bacteria using the COCONUT database and machine learning J. Cheminfom. (IF 5.514) Pub Date : 2021-10-18 Capecchi, Alice, Reymond, Jean-Louis
Natural products (NPs) represent one of the most important resources for discovering new drugs. Here we asked whether NP origin can be assigned from their molecular structure in a subset of 60,171 NPs in the recently reported Collection of Open Natural Products (COCONUT) database assigned to plants, fungi, or bacteria. Visualizing this subset in an interactive tree-map (TMAP) calculated using MAP4
-
The impact of cross-docked poses on performance of machine learning classifier for protein–ligand binding pose prediction J. Cheminfom. (IF 5.514) Pub Date : 2021-10-16 Shen, Chao, Hu, Xueping, Gao, Junbo, Zhang, Xujun, Zhong, Haiyang, Wang, Zhe, Xu, Lei, Kang, Yu, Cao, Dongsheng, Hou, Tingjun
Structure-based drug design depends on detailed knowledge of the three-dimensional (3D) structures of protein–ligand binding complexes, but accurate prediction of ligand-binding poses is still a major challenge for molecular docking due to deficiencies of scoring functions (SFs) and neglect of protein flexibility upon ligand binding. In this study, based on a cross-docking dataset specifically constructed
-
Individual and collective human intelligence in drug design: evaluating the search strategy J. Cheminfom. (IF 5.514) Pub Date : 2021-10-11 Cincilla, Giovanni, Masoni, Simone, Blobel, Jascha
In recent years, individual and collective human intelligence, defined as the knowledge, skills, reasoning and intuition of individuals and groups, have been used in combination with computer algorithms to solve complex scientific problems. Such an approach has been used successfully in different research fields such as structural biology, comparative genomics, macromolecular crystallography and RNA design