Introduction

Coronaviruses (CoVs) induced three major outbreaks of respiratory distress syndrome in the last decades including severe acute respiratory syndrome (SARS) in 2003 with an epicenter in Guangdong, China [1], Middle East respiratory syndrome (MERS) in 2012 in Saudi Arabia [2], and novel coronavirus disease (COVID-19) firstly observed in Wuhan province, China, in late 2019 [3]. The COVID-19 symptoms vary from mild to severe that can eventually lead to death. The symptoms usually develop after 2–14 days following the viral exposure which includes fever, cough, shortness of breath and pneumonia. Severe cases showed respiratory, gastrointestinal, hepatic and neurological complications that can lead to death. The human-to-human transmission of COVID-19 is reported to occur by respiratory droplets or direct contact with the patients [4, 5].

World Health Organization (WHO) announced COVID-19 as a pandemic on 11th March 2020 [6]. The number of infected cases and deaths by COVID-19 is sharply rising daily. According to the WHO report, as of 18th April 2020, COVID-19 has spread to more than 200 countries and territories with over 2,160,207 confirmed cases and over 146,088 confirmed deaths worldwide. The global fatality rate is presently around 6.76% (calculated as deaths per confirmed cases). More and more number of positive cases has been identified in the USA followed by Spain, Italy, Germany, France, UK, China and Iran [7].

CoVs are a group of genotypically and phenotypically diverse, enveloped and positive-sense viruses carrying single-stranded RNA. They belong to the Coronaviridae family that induces respiratory, enteric, hepatic and neurological complications of varying severity in a wide range of animal species and humans [8]. CoVs are divided into four types: α-coronavirus (α-CoV), β-coronavirus (β-CoV), γ-coronavirus (γ-CoV) and δ-coronavirus (δ-CoV). Novel coronavirus (2019-nCoV) also known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of COVID-19 [9]. SARS-CoV-2, like SARS-CoV and MERS-CoV, belongs to the category of β-coronavirus. Although it is considered to be introduced from bats, the specific source of SARS-CoV-2, animal reservoir and enzootic patterns of transmission remain unresolved.

The lifecycle of SARS-CoV-2 in host cells is briefly discussed here (Fig. 1). The SARS-CoV-2 virus targets the host cells through the viral spike (S) protein. This S protein binds to the ACE2 receptor of the host cells. After the S protein-ACE2 binding, the virus utilizes the host cell receptors (TMPRSS2) and enters into host cell cytosol. After uncoating, the viral gRNA is released into the cytoplasm. Once inside the host cell, viral polypeptides are synthesized using the host cell protein synthesis machinery, which are further processed by viral proteases and the products transferred to RTC. The virus then synthesizes RNA using its RdRP. Viral structural proteins and assembly proteins are also synthesized leading to the completion of assembly and release of progeny viral particles by exocytosis [10, 11]. Generally, β-CoVs produce pp1a (450–500 kDa) and pp1ab (750–800 kDa) by translation of the genomic RNA [12]. These polypeptides are proteolytically processed into structural and non-structural proteins which are important for the replication and assembly of the virus. This proteolytic cleavage is mediated by the main protease (Mpro) also known as 3-chymotrypsin-like protease (3CLpro) and by papain-like protease (PLpro). Mpro is homodimeric cysteine protease which cleaves the polyprotein at 11 distinct sites generating vital proteins for the viral life cycle. This functional prominence of Mpro in the viral life cycle, along with the lack of closely associated homologues in humans, makes the Mpro a promising target for COVID-19 antiviral drug design [13,14,15,16,17].

Fig. 1
figure 1

Schematic representation of SARS-CoV-2 life cycle in host cells. gRNA genomic RNA, ACE2 angiotensin-converting enzyme 2, TMPRSS2 type 2 transmembrane serine protease, RTC replicase transcriptase complex, RdRP RNA-dependent RNA polymerase, sgRNA subgenomic RNA

The conventional drug discovery approaches offering new drugs into the market demands a long period of time, resources and a huge amount of money with a low success rate. Drug repurposing is an approach of screening of libraries of FDA-approved drugs to identify drug molecules that could be effective for a particular disease. Repurposing FDA-approved drugs hastens the drug development processes, as most of the safety concerns, preclinical testing and formulation development have already been performed for these drugs. Many research groups have carried out the repurposing process, and to date many approved drugs are being tried in more than 100 clinical trial studies, but as of now, none of them turned out to be completely effective for COVID-19 treatment. Unfortunately, no vaccine has been developed/approved yet to combat COVID-19 [18]. To meet this urgent medical need for an effective drug for COVID-19, we planned to identify potential SARS-CoV-2 Mpro inhibitors as possible anti-COVID-19 agents, by carrying out screening of a drug library containing drugs and diagnostic agents (approved by FDA or other world authorities) [19, 20] and the Asinex BioDesign library [21]. In the present report, systematic screenings of these two libraries were carried out using pharmacophore and sequential conformational precision level filters using the Schrodinger Suite. The resulted scaffolds, mainly from Asinex BioDesign library, are discussed in detail that could be utilized as potential hits for further drug development to find out an acceptable drug treatment of COVID-19.

Results and discussion

To accomplish the goal, the protein structure of cysteine protease Mpro was obtained from RCSB (PDB code: 6LU7) [14] and the covalent bond between the co-crystallized ligand N3 and the protein catalytic dyad Cys145 was cleaved. After cleavage, Cys145 and the ligand were reconstructed by making necessary changes and the resulted ligand–protein complex was prepared and refined using Protein Preparation Wizard [22] in Schrodinger. This cysteine protease Mpro having 306 amino acids chain mainly consists of three domains. Domain 1 is from residue 8–101, domain 2 is from 102 to 184, and the domain 3 is from 201 to 303 amino acid sequence connected to domain 2 by a loop having 185–200 amino acid residues. The substrate-binding site on this viral protein is present in a cleft between domains 1 and 2 with a Cys145-His41 catalytic dyad. The major active subsites in the active site of Mpro, where the substrate binds, are defined. The S1 subsite is comprised of Phe140, Leu141, Asn142, His163, Glu166 and His172 amino acids. A small part of S1 subsite is further separated by Asn142 as S1′ which is comprised of Thr25, Thr26 and Leu27. Hydrophobic S2 subsite is made up of His41, Met49, Tyr54, Met165 and Asp187. The S4 binding subsite involves Met165, Leu167, Phe185, Gln189 and Gln192 amino acids [14]. Receptor grid was generated on the active site of Mpro protein within Schrodinger Suite. Glide module of Schrodinger was used to carry out the docking study at various precision levels, which include high-throughput virtual screening (HTVS, low precision level), standard precision (SP) and extra-precision (XP). Depending on the precision level applied, Glide internally generates multiple conformations of the ligands under study using ConfGen package. To validate the generated grid, the co-crystallized ligand (N3) was first knocked out, reconstructed and re-docked into the active site of the receptor using the generated grid. Here, the N3 molecule showed a similar pattern of orientation and interactions such as hydrogen bonding with Glu166, Gln189 and Thr190 residues of the active site. The XP docking score between N3 and Mpro protein was − 7.85 kcal/mol. This cognate docking was analyzed further by identifying the all-atom RMSD value of the re-docked N3 ligand vis-a-vis the co-crystallized ligand, and it was found to be 0.452 Å, which validated the docking protocol.

To identify the possible inhibitors from both the libraries selected for this study, initially, a pharmacophore-based filter was applied. Pharmacophore is the ensemble of steric and electronic features that are necessary to ensure the best possible interactions with a biological target active site. The generation of ligand-receptor/protein complex-based pharmacophore specifies only the important steric and electrostatic features present in that particular ligand, which is co-crystallized with the protein structure. Thus the development of a ligand-receptor/protein complex-based pharmacophore model carries the highest chances of missing out other possible sites for interactions present in the protein active site. Also, to generate a ligand-based pharmacophore model, sufficient number of ligands, with some scaffold similarity, acting against the same target is required which are not available for the target under study to date. Thus, we generated a receptor active site-based pharmacophore model which considers all possible interaction sites in the binding pocket. The inbuilt fragment library was allowed to dock in the grid generated on the protein active site which checked the interactions of fragments with the residues present in all subsites of the active site, and based on extra-precision (XP) docking scoring function, seven feature-based best pharmacophore hypothesis (AADRRRR) was obtained (Fig. 2). One of the acceptor features was observed in the proximity of His41 with hydrogen on the epsilon nitrogen and Cys145. Another acceptor feature was located near Glu166. The donor feature was observed in close proximity to the carbonyl group of His164. One aromatic feature was observed in the S1 subsite region, whereas one was observed protruding to S1′ subsite, one in the S2 subsite region and one in the S4 subsite region of the protein active site explaining these features as either aromatic or hydrophobic in nature.

Fig. 2
figure 2

Seven feature pharmacophore hypothesis (AADRRRR). (A) Hydrogen bond acceptors, pink spheres with arrows; (D) hydrogen bond donor, sky blue sphere with arrows; (R) ring aromatics, orange torus rings

The resulted hypothesis was checked by aligning the N3 ligand on the pharmacophore hypothesis. Overall, the N3 ligand matched with four pharmacophore features (Fig. 3). One of the acceptor features was observed to be occupied by carbonyl oxygen of N3 which formed a hydrogen bond with NH of Glu166. One of the hydrophobic/aromatic features was occupied by the isobutyl chain of N3 ligand. The (2-oxopyrrolidin-3-yl)methyl moiety of N3 was observed to occupy another hydrophobic/aromatic feature. The but-2-enyl hydrophobic part of N3 ligand inhabited one hydrophobic/aromatic feature. This same research publication [14] has reported six other ligands acting on Mpro. So, these molecules were also assessed on the developed pharmacophore model and were found to match very promisingly with the developed model. These observations validated the developed pharmacophore model. (Data for the other six ligands are shown in the supporting information.)

Fig. 3
figure 3

(A) 2D structure of N3 ligand; (B) alignment of N3 ligand with the developed pharmacophore model

Using this hypothesis as the first screening filter, the systematic screening of libraries selected under this study was carried out and the finally obtained molecules are discussed here.

The approved drugs and diagnostic agents library was first screened using the developed pharmacophore model followed by the docking protocols using various precision levels (mentioned in experimental section). In the systematic virtual screening, three antiviral drugs were obtained from the approved drug library. All these three drugs ritonavir (1), nelfinavir (2) and saquinavir (3) are viral protease inhibitors. Out of these, ritonavir (1) is already in phase II, III and IV clinical trials in combination with some other drug candidates (Phase II: NCT04330690, NCT04276688, NCT04321993, NCT04330586, NCT04332094; Phase III: NCT04261270, NCT04321174, NCT04334928; Phase IV: NCT04291729, NCT04255017, NCT02735707, NCT04286503) [23].

The (thiazoly-5-yl)methyl carbamate oxygen of ritonavir (1) was observed engaged in forming hydrogen bonds with Gly143 and Cys145, whereas the thiazolyl ring was observed aligned with polar Thr25, Thr26 and nonpolar Leu27 amino acids of S1′ subsite. The phenyl ring on the 1st carbon of hexane was observed to be stabilized in the S1 binding pocket, and the phenyl ring present on the 6th carbon of hexane was observed in the S2 binding pocket and made ππ interactions with His41. Apart from these interactions, the ligand-receptor/protein complex was further stabilized by hydrogen bonding between the carbonyl oxygen of the butanoyl amino group with Glu166 and NH of the same butanoyl amino group with Gln189 (Fig. 4A).

Fig. 4
figure 4

Docking interactions of compounds (16: AF) in the active sites of Mpro. (A) ritonavir, (B) nelfinavir, (C) saquinavir, (D) pralmorelin, (E) iodixanol, (F) iotrolan. Ligands are shown as cyan balls and sticks. Mpro residues are shown as atom type color sticks. Hydrogen bonds formed between ligands and receptor are indicated by red lines

In the case of nelfinavir (2), protonated nitrogen of decahydroisoquinoline, NH of N-substituted carboxamide and 2-hydroxyl group stabilized the nelfinavir-protein complex by interacting with Glu166 through strong hydrogen bonds. The phenylsulfanyl group of nelfinavir (2) was stabilized in the S2 subsite with His41 and Tyr54 amino acids. Additionally, the 3-hydroxy-2-methylphenylformamido group occupied the S1 subsite of the protein active site (Fig. 4B). Similarly in saquinavir (3), NH of tert-butylcarbamoyl, one of the carbonyl oxygen and NH of butanediamide stabilized saquinavir within the active site of the protein by forming hydrogen bond with Glu166. Quinoline nitrogen of saquinavir further provided stability to the complex by forming a hydrogen bond with Gly143, whereas the phenyl group occupied hydrophobic S2 subsite (Fig. 4C).

Pralmorelin (4) also known as growth hormone-releasing peptide 2, used as a diagnostic agent for the assessment of growth hormone deficiency, exhibited promising binding affinity with Mpro protein. The ligand–protein complex was found to be stabilized by nine strong hydrogen bonds and one salt bridge between pralmorelin and Mpro active site. The hydrogen bonds are mostly with Glu166, Phe140 and His164. The salt bridge was observed between the quaternary terminal nitrogen of pralmorelin and Glu166. Additionally, the phenyl ring occupied the S2 subsite (Fig. 4D).

Apart from these identified anti-viral agents which are already in use as HIV protease inhibitors, some promising interactions of radiocontrast agent iodixanol (5) and iotrolan (6) were also observed with the target protein under study (Fig. 4E, F, respectively). Both of these are iodine-containing radiocontrast agents injected in the body for visualization in various clinical processes. The iodine-containing drugs carry major limitation for their use, mainly because of adverse effects such as hypersensitivity reactions, cardiovascular, gastrointestinal and neuronal complications (Table 1).

Table 1 Chemical structures and docking results of the hits obtained from the screening of approved drug library (approved by FDA or different world authorities)

Along with ritonavir, which is already in clinical trials, nelfinavir, saquinavir and pralmorelin can be considered as potential drug candidates for the treatment of COVID-19. Apart from these, the structural modification in iodixanol and iotrolan could lead to stable and safer antiviral agents. Three representative compounds (M1–M3) having modifications in the iodixanol structure are suggested with their respective docking scores in the supporting information. Substitution of the iodo groups in iodixanol with hydrogens could make them stable and safer derivatives, but at the same time, their affinities for the Mpro got compromised a bit (still having promising binding affinities) as shown in Table S2.

While repurposing of drugs used in other indications may help to find a faster treatment for COVID-19, it will need a concerted effort to discover a new molecule that specifically targets this virus. Asinex BioDesign library is constructed with 175,815 molecules. This library consists of molecules from natural products which are having pharmacologically important structural features in their chemical scaffolds. To screen this database, the under given protocol was followed. Asinex library of 175,815 compounds was retrieved from the Asinex database and compounds were prepared using LigPrep module for the structural optimization using the OPLS3e force field. Pharmacophore-based screening was performed using this library by Phase module of Schrodinger to obtain Mpro inhibitors with the desired chemical features. A minimum of four-feature matching criteria was set to qualify over the generated seven feature AADRRRR hypothesis. The obtained hits from this screening were ranked orderly by the Phase screen score. From the Asinex library, 26,001 molecules qualified the pharmacophore-based screening criteria. These hits were systematically screened in silico using sequential conformational precision approaches which included HTVS, SP and XP docking of Schrodinger suit on the Mpro protein, and the molecules obtained in the final step were analyzed manually for their predicted binding modes and scores. Top 20 molecules amongst them are discussed here. All these 20 molecules are divided into four classes based on their chemical structures. All the structures along with their ranking and docking scores (kcal/mol) are mentioned in following section category-wise.

3,5-Disubstituted pyrazole-based derivatives (711)

The highest-ranked compound (7) was pyrazole-based derivative with substituted phenyl rings at C3 and C5 positions. Analysis of the ligand-receptor complex showed that compound (7) fits snugly into the substrate-binding site (Fig. 5A). The ligand-receptor complex was stabilized by hydrogen bonding between hydroxyl groups of compound (7) and protein. The C3 phenyl ring occupied the S4 subsite of Mpro protein, and the hydroxyl groups at the C3 phenyl ring were found to interact with Thr190 by strong hydrogen bonds. The C5 phenyl ring was observed in alignment with S1 subsite of the protein active site and the hydroxyl group at the C5 phenyl ring imparted stability to the complex by forming a hydrogen bond with Leu141. Similar type of interaction was observed between other pyrazole derivatives (8, 9, and 10) with the active site of Mpro (Fig. 5B–D, respectively). Compound (11) showed a slightly different interaction than other pyrazole derivatives as it lacked a hydroxyl group in its structure (Fig. 5E). In compound (11), the ligand-receptor complex was stabilized by five hydrogen bond interactions between N2 nitrogen of pyrazole and Glu166, the nitrogen of acetamide and Phe140, Glu166, oxygen of acetate with Cys145, Gly143. Phenyl ring attached at N1 of pyrazole was located in S2 hydrophobic subsite and showed ππ interactions with His41 residue, whereas the phenyl ring attached to the C3 position of pyrazole was observed to be aligned with S1 subsite and imparted stability by forming ππ interaction with His163 residue (Table 2).

Fig. 5
figure 5

Docking interactions of compounds (711: AE) in the active sites of Mpro. (A) Compound (7), (B) compound (8), (C )compound (9), (D) compound (10) and (E) compound (11). Ligands are shown as cyan balls and sticks. Mpro residues are shown as atom-type color sticks. Hydrogen bonds formed between ligands and receptor are indicated by red lines

Table 2 Chemical structures and docking results of the disubstituted pyrazole-based final hits (711) obtained from the screening of Asinex BioDesign library

Cyclic amide derivatives (1220)

Some cyclic amides also offered promising interactions with the active site of Mpro. Among the cyclic amides, the most promising compound (12) fits comfortably into the substrate-binding pocket (Fig. 6A). The ligand-receptor complex was stabilized by favorable hydrogen bonding as well as ππ interactions between compound (12) and Mpro protein. Amide oxygen atoms exhibited hydrogen bonding with Gly143 and Ser46 residues, whereas two of the amide NH showed similar type of interactions with Asn142 residue. Additional stability to the ligand–protein complex was provided by ππ interaction between fluorophenyl ring and His41 residue of the active site. Here, the phenoxy phenyl moiety of compound (12) was found to occupy the S4 subsite of Mpro protein and provided additional stability to the complex. The next promising molecule from cyclic amide was compound (13); it showed hydrogen bonding between amide oxygen and Gly143 residue, amide nitrogen and His164 residue and between methoxyl group oxygen and Ser46 residue. Benzyl group present in the cyclic amide occupied the S4 binding subsite of Mpro (Fig. 6B). In a similar fashion, in compound (14) two amide oxygens showed hydrogen bonding each with Ser46 and Gly143 residues, two of the amide NH interacted in the same way with Asn142 residue (Fig. 6C). In a similar fashion to that of compound (12), the phenoxyphenyl group of 15 was observed in the S4 subsite. In compound (15), two cyclic carbonyl oxygen atoms of amide explained the stability of the complex by hydrogen bonding with Asn142 and His41 residues, whereas NH of the noncyclic amide interacted with Gln189 residue by means of a hydrogen bond. Additional stability was provided by interactions between p-hydroxyphenyl ring and His41. In this molecule, p-hydroxyphenyl moiety was observed to occupy the S4 binding subsite (Fig. 6D). Likewise, compound (16) showed hydrogen bonding between hydroxyl group oxygen and Leu141 residue, three NH of the amides and Asn142, GLN189 residues and a stable ππ interaction between benzyl ring and His41 residue of the active site. Here, the 4-fluorophenyl ring occupied the S1 subsite, benzyl ring occupied the S2 subsite, and the benzamide moiety was observed in alignment with the S4 subsite of the protein (Fig. 6E). Hydrogen bonding in compound (17) was observed between amide nitrogen and Asn142, Gln189 residues and ππ interactions between benzyl ring and His41. The subsite occupancy of 17 was similar to that of compound (16) (Fig. 6F). Compound (18) showed similar interactions as that of the compound (14) (Fig. 6G). In this, additional stability was provided by formation of a halogen bond between the chloro group attached to the phenyl ring and NH of Gln192. Compound (20) was oriented in a similar fashion to that of compound (13), but it showed slightly different interactions (Fig. 6I). Here, hydroxyl group interacted with Hip164 and phenoxy oxygen with Glu166 by hydrogen bonding additionally. The noncyclic amide oxygen also showed hydrogen bond interaction with Cys145. These cyclic amide derivatives displayed significant interactions within the active site of the Mpro enzyme, shaping them potential candidates for further investigation to find out novel anti-SARS-CoV-2 drugs (Table 3).

Fig. 6
figure 6

Docking interactions of compounds (1220; AI) in the active sites of Mpro. (A) Compound (12), (B) compound (13), (C) compound (14), (D) compound (15), (E) compound (16), (F) compound (17), (G) compound (18), (H) compound (19) and (I) compound (20). Ligands are shown as cyan balls and sticks. Mpro residues are shown as atom type color sticks. Hydrogen bonds formed between ligands and receptor are indicated by red lines

Table 3 Chemical structures and docking results of the cyclic amide based final hits (1220) obtained from the screening of Asinex BioDesign library

Pyrrolidine-based derivatives (2124)

Out of the top twenty best scoring compounds, four compounds were observed with pyrrolidine scaffold. Among them, oxygen and NH from two different amide groups of compound (21) interacted with Gly166 by hydrogen bonding (Fig. 7A). Further stability to the ligand-receptor complex was provided by hydrogen bond interaction between the nitrogen of pyrazine ring and Cys145. Here, the m-fluorobenzyl group occupied hydrophobic S2 subsite, whereas pyrrolidinyl pyrazine scaffold occupied the S1 subsite and benzyl group occupied the S4 subsite for interaction. In compound (22), hydrogen bond interaction was observed between the hydroxyl group and Gln189, amide oxygen and Glu166 and a salt bridge between the quaternary amine group and Glu166 (Fig. 7B). Here, p-fluorophenylpyrrolidinol scaffold occupied the S2 subsite and the junction of S1-S2 subsite, and p-fluorobenzyl amide was observed in alignment with S4 subsite. Likewise, compound (23) showed hydrogen bond between the hydroxyl group and Asn142, amide oxygen and His41, amide nitrogen and Gln189. It also exhibited ππ interactions between the benzamide ring and His41. Additional stability to this ligand-receptor complex was provided by hydrogen bond and salt bridge interaction between the nitrogen of the naphthyridine ring and Glu166. Here, the aminoethoxy benzamide was observed to align with the S4 subsite and naphthyridine occupied the S1 subsite for interaction (Fig. 7C). The carboxamide oxygen of compound (24) interacted with His41 by a hydrogen bond, whereas one of the phenolic rings of N-benzhydryl showed ππ interaction with the same amino acid. Additionally, the quaternary amine of aminopropoxy benzamide interacted with Glu166 through hydrogen bonding and salt bridge formation, and with Phe140 by hydrogen bonding. Here, the aminopropoxy benzamide occupied the S1 subsite and N-benzhydryl moiety occupied the S4 subsite of interaction (Fig. 7D). Commercial availability (Aurora Fine Chemicals LLC-USA) and significant interactions within the active site of the Mpro protein suggest these pyrrolidine-based derivatives as promising candidates for further investigation (Table 4).

Fig. 7
figure 7

Docking interactions of compounds (2124; AD) in the active sites of Mpro. (A) Compound (21), (B) compound (22), (C) compound (23) and (D) compound (24). Ligands are shown as cyan balls and sticks. Mpro residues are shown as atom type color sticks. Hydrogen bonds formed between ligands and the receptor are indicated by red lines

Table 4 Chemical structures and docking results of the pyrrolidine-based final hits (2124) obtained from the screening of Asinex BioDesign library

Miscellaneous derivatives (25, 26)

Compound (25) showed hydrogen bonding between the hydroxyl group and Glu166 residue, and amide oxygen and Gly143 residue of the active site. It also showed hydrogen bonding between nitrogen of the homopiperazine ring and Gln189 residue of the active site (Fig. 8A). Similarly, compound (26) showed hydrogen bonding between amide oxygen and Glu166 and His41 residues, and between the nitrogen of the pyridine ring and Gly143 residue of the active site (Fig. 8B, Table 5).

Fig. 8
figure 8

Docking interactions of compounds (25, 26; (A), (B)) in the active sites of Mpro. (A ) Compound (25) and (B) compound (26). Ligands are shown as cyan balls and sticks. Mpro residues are shown as atom type color sticks. Hydrogen bonds formed between ligands and receptor are indicated by red lines

Table 5 Chemical structures and docking results of the miscellaneous final hits (25, 26) obtained from the screening of Asinex BioDesign library

Molecular dynamics

The final hits obtained from the Asinex database were divided into four groups based on the type of scaffolds present in the structures, and their interactions have been discussed above. Further to understand the time-dependent stability of the complexes between the first molecule from each discussed group and Mpro protein structure, molecular dynamics (MD) study was carried out. MD study was performed for a period of 10 ns using Gromacs2020.1 package. Here, considering the docked pose of the ligand-receptor as the reference frame, MD study was performed and various statistical parameters such as RMSD-P, RMSF-P, RMSD-L (P = protein, L = ligand), H-bonding, Solvent Accessible Surface Area (SASA) etc. were determined.

The protein RMSD-P is studied to understand the degree of movement of protein or atoms while putting the ligand in the active site and assessing the structural stability, deviation and conformations of the protein over the simulation time period. The RMSD-P for Mpro in complexation with compound (7) was in the range of 0.08–0.2 with an average of 0.15 nm (Fig. 9A). This suggests the stability of the protein while having compound (7) in the active site over this period of time. Despite having freely rotatable bonds, the average RMSD-L was 0.35 nm with a range of 0.2 nm to 0.5 nm after 1 ns (Fig. 9B). This strongly suggests the protein–ligand stability without a major change in orientation of the ligand in the active site over the period of simulation time. RMSF explains the structural integrity and residual mobility of the structure. Here again, all the residues including terminal residues and loop regions of the protein structure, while having compound (7) in the active site, showed RMSF-P below 0.41 nm (Fig. 9C).

Fig. 9
figure 9

RMSD-P, RMSD-L, RMSF-P, SASA and H-bond plots for Mpro with compound (7)

Hydrogen bonds between protein and compound (7) over the period of analysis were determined with Gromacs g_hbond utility. Maximum seven hydrogen bonds were observed during MD simulation, whereas three to four hydrogen bonds were observed consistently throughout the simulation time (Fig. 9E). Apart from these observations SASA analysis also supported the stability of the protein–ligand complex (Fig. 9D). The short-range electrostatic (Coul-SR) and van der Waals/hydrophobic (LJ-SR) interaction energies between protein and compound (7) explained promising electrostatic as well as hydrophobic interactions. The average values of Coul-SR − 77.17 ± 6.7 kJ/mol and LJ-SR − 136.45 ± 4.6 kJ/mol were observed. This explains that the role of hydrophobic interaction was more important than the electrostatic interactions in stabilizing the complex.

Similar evaluation was done for Mpro with compound (12). The RMSD-P was observed in the range of 0.1–0.38 nm with an average of 0.24 nm (Fig. 10A). Despite having multiple rotatable bonds and the cyclic amide structure containing a large ring, the RMSD-L for the first ns was observed in the range of 0.18–0.33, and 2 ns onwards it was consistently observed in the range of 0.26–0.5 nm (Fig. 10B). The observed RMSF-P for the residues up to 300 was below 0.5 nm, whereas the protein tail above residue number 300 showed fluctuation up to 1.5 nm (Fig. 10C). Overall, this ligand-receptor complex showed a maximum of five hydrogen bonds, whereas three to four H-bonds were consistently observed over the period of time (Fig. 10E). The short-range electrostatic (Coul-SR, energy: − 87.66 ± 5.2 kJ/mol) and van der Waals/hydrophobic (LJ-SR, energy: − 179.46 ± 6.9 kJ/mol) interaction energies suggested promising interactions between the ligand and the protein. The contribution of hydrophobic interactions was found to be higher than that of the electrostatic interactions.

Fig. 10
figure 10

RMSD-P, RMSD-L, RMSF-P, SASA and H-bond plots for Mpro with compound (12)

For compound (21) in complexation with Mpro, the RMSD-P values in the range of 0.12–0.25 nm (average 0.16 nm) explained the stability of the protein while having compound (21) in the active site (Fig. 11A). Despite high flexibility of compound (21), the RMSD-L was observed in the range of 0.2–0.4 nm consistently except for the time duration from 0.8–1 and 2.2–2.8 ns where a sharp shoot in RMSD-L was observed up to 4.8 nm and it followed normalcy for the remaining period of simulation (Fig. 11B). Here, the overall RMSF-P value below 0.5 nm also supports the stability of the protein in the presence of the ligand in the active site (Fig. 11C). Initial MD analysis showed three hydrogen bonds in the complex, which then stabilized to two for the entire duration of simulation analysis consistently (Fig. 11E). Here again, the short-range electrostatic (Coul-SR, energy: − 74.09 ± 5 kJ/mol) and van der Waals/hydrophobic (LJ-SR, energy: − 146.23 ± 4.6 kJ/mol) interaction energies suggested promising interactions between the ligand and the protein. The contribution of hydrophobic interactions was found to be higher than that of the electrostatic.

Fig. 11
figure 11

RMSD-P, RMSD-L, RMSF-P, SASA and H-bond plots for Mpro with compound (21)

The RMSD-P value for Mpro-compound (25) complex was observed in the range of 0.1–0.24 nm (average 0.16 nm) suggesting the protein stability (Fig. 12A). The RMSD-L was in the range of 0.2–0.44 nm with an average value of 0.29 nm (Fig. 12B). Except for the terminal residues, the RMSF-P was below 0.25 nm and for the terminal residues, it was below 0.6 nm (Fig. 12C). This complex showed a maximum of three hydrogen bonds, whereas one-two hydrogen bonds were consistently present over the simulation period (Fig. 12E). Herein, the contribution of hydrophobic interactions was also observed to be more than the electrostatic interactions (observed values are Coul-SR, energy: − 78.84 ± 12 kJ/mol and LJ-SR, energy: − 161.74 ± 7.2 kJ/mol).

Fig. 12
figure 12

RMSD-P, RMSD-L, RMSF-P, SASA and H-bond plots for Mpro with compound (25)

Conclusion

To conclude, from the approved drug library, using a systematic screening approach, three anti-viral drugs ritonavir, nelfinavir, saquinavir, which are proven HIV protease inhibitors, were identified as promising drug candidates for the treatment of SARS-CoV-2. Among them, ritonavir is already in phase II/III and IV clinical trials for the treatment of COVID-19. Apart from ritonavir, nelfinavir and saquinavir can also be considered as potential drug candidates against the novel coronavirus. In addition to these, screening of approved drug library also suggested three diagnostic agents pralmorelin, iodixanol and iotrolan, interacting promisingly with Mpro protein. Pralmorelin is used as a diagnostic agent for the assessment of growth hormone deficiency, whereas iodixanol and iotrolan are iodine-containing radiocontrast agents. These three identified diagnostic agents carry many limitations, and thus structural modifications in these three candidates to overcome major limitations could lead to possible potential molecules. From the systematic virtual screening of Asinex BioDesign library, 20 molecules were identified interacting with viral Mpro protein; these are categorized as disubstituted pyrazoles, cyclic amides, pyrrolidine-based and miscellaneous derivatives. Results from Asinex BioDesign screening suggest high potential of these compounds as probable hits for use in the treatment of this devastating viral problem.

Experimental

Data collection and preparation

3D structure of Mpro was obtained from RCSB Protein Data Bank (PDB code: 6LU7) [24], prepared to ensure structural correctness for hydrogen consistency, bond orders, steric clashes and charges using protein preparation wizard [22] in Schrodinger Suite supported by OPLS3e force field. Thus, the prepared structures were used for pharmacophore model development as well as for receptor grid generation for the docking protocol. The receptor grid was generated considering the position of the co-crystallized ligand N3 in the active site.

The 3-dimensional approved drug and diagnostic agent library (approved by FDA and other world authorities) was obtained from SuperDRUG2 resource for approved drug [19, 20]. A total of 4369 molecules were prepared for computational study at physiological pH condition by using LigPrep module of Schrodinger [25]. Asinex BioDesign library [21], which consists of pharmacologically important structural features from natural products in the chemical scaffolds having synthetic feasibility, was obtained from Asinex database and 175,815 molecules (database updated in early 2020) were prepared using LigPrep module [25].

Pharmacophore development

Mpro protein structure is deposited in PDB (PDB code: 6LU7) with N3 peptide as the co-crystallized ligand having an IC50 value of 125 μM [14]. Being not so promising activity value, we planned to develop active site-based pharmacophore model. To identify the pharmacophoric features required for interaction, the co-crystallized ligand was knocked out and protein active site-based pharmacophore (e-pharmacophore) model was generated [26, 27]. This e-pharmacophore model was developed using Phase module of Schrodinger using the Glide [28] XP scoring for library of inbuilt fragments. These fragments were docked in the receptor grid generated on the active site. Various fragments with different pharmacophoric features were docked, and common features were chosen which satisfied the criteria of position and direction. On the basis of protein amino acids present, excluded volume was calculated to avoid false selection of ligands during screening of the database. The obtained seven feature (AADRRRR)-based pharmacophore model was used for virtual screening.

Virtual screening workflow

Two selected libraries (FDA-approved drug/diagnostic agents and Asinex BioDesign) were first screened using receptor-based pharmacophore model. A criterion for minimum four-feature match was set, and all the prepared molecules were screened on the obtained AADRRRR hypothesis. For each molecule, 50 conformers were allowed to be generated and energy minimized. The acceptor and negative as well as donor and positive features were considered equivalent, and scoring function was set to default Phase Screen Score. From the approved drug library, 2253 molecules qualified the pharmacophore-based screening criteria. To them, a considerably fast but raw screening was applied initially by using Glide HTVS and the top 25% molecules were singled out for standard precision (SP) docking in Glide, and from that top 50% were carried forward for extra-precision (XP) analysis. From the XP analysis, top 15% of the compounds were considered further and analyzed manually for their predicted binding modes and scores.

From the Asinex BioDesign library, 26,001 molecules qualified the screen initially, using the pharmacophore-based filter. These compounds were then first passed through HTVS screen and the top 10% molecules were chosen for SP docking, and from these compounds top 25% molecules were moved forward for XP docking studies. Finally, the top 15% molecules were selected for manual analysis of their predicted binding modes and scores (Fig. 13).

Fig. 13
figure 13

Schematic representation of virtual screening workflow

Molecular dynamics

The MD studies between the selected compounds and Mpro (PDB code: 6LU7) were performed for a period of 10 ns by using GROMACS 2020.1 software [29]. The ligand-enzyme complexes as such acquired from the docking studies were considered as the initial points for simulation. To establish the complex stability, CHARMM36 all-atom force field [30] was used to establish the complex stability and ligand-receptor parameters were calculated using GROMACS. CGenFF server was used to generate the ligand topology [31, 32], and the ligand–protein complexes were prepared and solvated using the tip3p/SPC216 water model [33]. Na+ or Cl ions were used to neutralize the system. All the complexes were primarily energy minimized with the steepest descent method [34] followed by two sequential equilibration simulations using canonical (NVT) and isobaric-isothermic (NPT) ensemble for 100 picoseconds (ps) each. Using the NPT group, final MD simulation was carried out and long-range electrostatic interactions were identified by using particle mesh Ewald (PME) method [35]. The MD calculations were carried out at 300 K temperature and 1 bar pressure using the GROMACS 2020.1 simulation package, and the resulting data were analyzed [36, 37].