Introduction

Therapeutic vaccination of human immunodeficiency virus-1 (HIV-1)-infected individuals is a promising new approach for preventing disease progression to AIDS (Fomsgaard et al. 2011). The aim of targeted therapeutic vaccination against HIV-1 is to redirect the immune system and induce broader, more efficient responses towards conserved viral epitopes than those induced by the infection itself (Fomsgaard et al. 2011; Karlsson et al. 2013). So far, several HIV-1 therapeutic vaccine candidates have been unsuccessful in clinical trials, mainly due to inefficient delivery systems or inadequate immunogen design (Fomsgaard 2015; Mothe et al. 2019; Kardani et al. 2020). One of the most promising approaches for developing an HIV-1 therapeutic vaccine candidate is to design and construct artificial multiepitope (mosaic) immunogens by selecting a wide range of highly conserved, immunostimulatory and protective T-cell epitopes from the main viral antigens that can induce potent immune responses against HIV-1 infection (Kulkarni et al. 2014; Murakoshi et al. 2015; Karpenko et al. 2018). For this purpose, bioinformatics tools can predict highly immunogenic epitopes with high specificity in a short time for designing and developing a beneficial vaccine immunogen (Khairkhah et al. 2018; Kardani et al. 2020). Among different vaccine platforms, DNA vaccines are cost-effective and easy to produce (Khan 2013). A recombinant fusion DNA vaccine encoding various epitopes from multiple antigens can induce immune responses against all of the epitopes and provide long-term immunogenicity (Kardani et al. 2020).

The ~9 kb RNA genome of HIV-1 contains nine genes (gag, pol, env, nef, vif, vpu, vpr, tat and rev) that encode fifteen proteins, including the major proteins (Gag, Pol & Env), accessory proteins (Nef, Vif, Vpu & Vpr), and regulatory proteins (Tat & Rev). HIV-1 comprises four groups (M, N, O & P), of which group M is responsible for the global HIV-1 pandemic. Group M is further subdivided into clades called subtypes (i.e., A, B, C, D, F, G, H, J and K) (Ng’uni et al. 2020). Gag, Pol, Env and Nef are the major proteins expressed during viral infection. Gag, Pol and Nef are potential targets of the CD8+-related immune response, and Env is a major target of humoral and cellular immunity (Kong et al. 2003). Vaccine candidates containing the conserved regions of Gag, Pol, Env and Nef proteins could induce high levels of specific CD8+ and CD4+ T cells and IFN-γ responses in different clinical trials (Korber and Fischer 2020; Ng’uni et al. 2020). Furthermore, the functional regions of Rev protein are relatively conserved among different HIV-1 subtypes. Rev and Nef proteins are frequent targets of cytotoxic T lymphocytes (CTLs) that have a negative effect on HIV-1 disease progression (Yu et al. 2005).

One important disadvantage of DNA vaccines is their poor immunogenicity; thus, we used immunogenic epitopes derived from human heat shock protein 70 (Hsp70) as a promising immunostimulatory agent. Heat shock proteins (HSPs) can promote antigen presentation of chaperoned peptides (multiple antigenic epitopes linked covalently to a single HSP) through interaction with receptors on antigen-presenting cells (APCs) (Krupka et al. 2015). Moreover, Hsp70 can activate and regulate innate and adaptive immunity, and mediate the stimulation of dendritic cells (DCs) to secrete proinflammatory cytokines and express costimulatory molecules for inducing strong CD8+ T cell responses (Krupka et al. 2015; Hwang et al. 2019). Many studies have indicated that not only can the full-length Hsp70 protein stimulate the proliferation and activation of natural killer (NK) cells and DCs, but epitopes derived from this protein are also more effective at stimulating the immune system, with higher immunostimulatory and adjuvant properties (Multhoff et al. 2001; Krupka et al. 2015; Li et al. 2016).

In this study, two multiepitope HIV-1 immunogens based on conserved T-cell epitopes derived from five HIV-1 proteins (i.e., Gag, Pol, Env, Nef & Rev) and Hsp70 were designed using bioinformatics prediction tools. The selected epitopes could theoretically bind to the HLA-I and HLA-II alleles most common worldwide as well as to HLA-I supertypes (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*24:02, HLA-A*26:01, HLA-B*07:02, HLA-B*08:01, HLA-B*15:01, HLA-B*27:05, HLA-B*39:01, HLA-B*40:01 and HLA-B*58:01), and induce both cytotoxic (CD8+ CTL) and helper (CD4+ Th) T-cell responses. The designed multiepitope DNA constructs encoding the gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev genes were then used for in vitro transfection. Finally, their expression was evaluated by fluorescence microscopy, flow cytometry and western blotting.

Materials and methods

Protein sequence retrieval

The reference protein sequences of standard HIV-1 (species 11676) from group M, including Gag-Pol (UniProtKB-P04585), Env (UniProtKB-P04578), Nef (UniProtKB-P04601) and Rev (UniProtKB-P04618), together with human heat shock 70 kDa protein (Hsp70) 1A (UniProtKB-P0DMV8), were retrieved from the UniProt database (www.uniprot.com) and used as input for epitope prediction by bioinformatics tools.

MHC-I and MHC-II binding epitope prediction

NetMHCpan 4.0 (http://www.cbs.dtu.dk/services/NetMHCpan/) was used to predict binding of peptides (8–11 amino acids) in linear form to MHC class I groove using Artificial Neural Networks (ANNs). Threshold for binding affinity of peptide-MHC-I (percentile rank) was set at 0.5% for strong binders and 2% for weak binders. In addition, NetMHCIIpan 3.2 (http://www.cbs.dtu.dk/services/NetMHCIIpan/) was used to predict binding of peptides (14–16 amino acids) in linear form to MHC class II groove using ANNs. Threshold (percentile rank) for strong and weak binders was set at 2% and 10%, respectively. Moreover, IEDB MHC-I (http://tools.iedb.org/mhci/) and MHC-II (http://tools.iedb.org/mhcii/) binding prediction tools were used to predict the ability of peptides in linear form for binding MHC class I and MHC class II grooves. IEDB recommended method was applied to estimate percentile ranks for predicted peptide-MHC complexes. Furthermore, in this study, all protein sequences were analyzed separately. Peptides derived from HIV-1 proteins and Hsp70 were assessed for binding to human HLA-I supertypes, HLAs-I and II predominant worldwide, HLA alleles which have 5% or more frequency in Iran, and mouse MHC-I and MHC-II alleles in both NetMHCpan and IEDB MHC binding prediction tools.
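The percentile-rank cut-offs described above can be expressed as a small classification rule. The following is a minimal sketch only: the function name and the treatment of values that fall exactly on a cut-off are assumptions, and the code does not reproduce the internal logic of NetMHCpan or NetMHCIIpan.

```python
# Percentile-rank cut-offs (in percent) as stated in the text.
# Each tuple is (strong-binder cut-off, weak-binder cut-off).
THRESHOLDS = {
    "MHC-I": (0.5, 2.0),    # NetMHCpan 4.0
    "MHC-II": (2.0, 10.0),  # NetMHCIIpan 3.2
}


def classify_binder(rank_percent: float, mhc_class: str) -> str:
    """Label a peptide-MHC pair from its percentile rank.

    Assumption: a rank exactly equal to a cut-off is counted as
    passing that cut-off; the text does not specify this detail.
    """
    strong, weak = THRESHOLDS[mhc_class]
    if rank_percent <= strong:
        return "strong"
    if rank_percent <= weak:
        return "weak"
    return "non-binder"


if __name__ == "__main__":
    # Hypothetical ranks, for illustration only.
    for rank, cls in [(0.3, "MHC-I"), (1.0, "MHC-I"), (5.0, "MHC-II")]:
        print(cls, rank, classify_binder(rank, cls))
```

Note that the same rank can be classified differently for the two classes: a rank of 1.0% is a weak binder under the MHC-I cut-offs but a strong binder under the MHC-II cut-offs.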

MHC-I processing prediction

The selected MHC-I peptides with the best binding ranks to different HLAs were used to estimate antigen processing through the antigen presentation pathway. The proteasomal cleavage and transporter associated with antigen processing (TAP) transport efficacy analyses were carried out using Proteasomal cleavage/TAP transport/MHC class I combined predictor from IEDB database (http://tools.iedb.org/processing/). The immunoproteasome option was applied for this study. Proteasomal processing, MHC-I binding and TAP transport efficacy are three main steps of the MHC-I antigen presentation pathway in IEDB combined predictor that estimates a total processing score for each epitope.

MHC-I immunogenicity prediction

The immunogenicity of all the selected MHC-I peptides was estimated by the IEDB Class I Immunogenicity tool (http://tools.iedb.org/immunogenicity/). This tool uses the properties and locations of amino acids to predict the peptide-MHC complex immunogenicity. Default parameters were applied for this server.

Population coverage and conservancy analysis

The population coverage percentage of each peptide was determined using IEDB population coverage tool (http://tools.immuneepitope.org/tools/population/iedb_input). Herein, the HLA-I and II alleles which bind to each predicted peptide were entered as inputs for population coverage analyses. Furthermore, the epitope conservancy analysis tool at IEDB web server (http://tools.immuneepitope.org/tools/conservancy/iedb_input) was used for assessing the identity of a given peptide sequence among different HIV-1 subtypes in group M to predict the conserved cross-reactive epitopes.
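The idea behind a conservancy analysis can be illustrated with a short sketch: for each protein sequence, find the ungapped window most similar to the epitope, then report the percentage of sequences whose best window meets an identity threshold. This is a simplified illustration under stated assumptions (ungapped comparison, toy sequences, hypothetical function names) and is not the IEDB implementation.

```python
from typing import Sequence


def best_window_identity(epitope: str, sequence: str) -> float:
    """Highest fraction of identical residues over all ungapped windows."""
    n = len(epitope)
    if n == 0 or len(sequence) < n:
        return 0.0
    best = 0.0
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        matches = sum(a == b for a, b in zip(epitope, window))
        best = max(best, matches / n)
    return best


def conservancy(epitope: str, sequences: Sequence[str],
                min_identity: float = 1.0) -> float:
    """Percentage of sequences containing the epitope at >= min_identity."""
    if not sequences:
        return 0.0
    hits = sum(best_window_identity(epitope, s) >= min_identity
               for s in sequences)
    return 100.0 * hits / len(sequences)


if __name__ == "__main__":
    # Toy sequences, not real HIV-1 subtype data.
    toy = ["XXACDEXX", "XXACDFXX", "GGGG"]
    print(conservancy("ACDE", toy, min_identity=1.0))
    print(conservancy("ACDE", toy, min_identity=0.75))
```

Lowering `min_identity` below 1.0 counts near-identical variants as conserved, which is how cross-reactive epitopes across subtypes could be flagged in such a scheme.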

Allergenicity analysis

Potential allergenicity of the selected epitopes was estimated by AllergenFP v.1.0 web server (https://ddg-pharmfac.net/AllergenFP/).

Toxicity and hemotoxicity analysis

Potential toxicity and hemotoxicity of the selected epitopes for the host were estimated by ToxinPred (https://webs.iiitd.edu.in/raghava/toxinpred/) and HemoPI (https://webs.iiitd.edu.in/raghava/hemopi/) web servers. Default parameters for both analyses were applied.

Prediction of cytokine production

To predict the ability of the selected HTL epitopes to induce cytokines including Interleukin (IL)-10, IL-4 and interferon-gamma (IFN-γ), IL10Pred (https://webs.iiitd.edu.in/raghava/il10pred/), IL4Pred (https://webs.iiitd.edu.in/raghava/il4pred/) and IFNepitope (https://webs.iiitd.edu.in/raghava/ifnepitopeas/) servers were used, respectively. Default parameters were applied for these analyses.

Peptide-protein flexible molecular docking analysis

GalaxyPepDock peptide-protein flexible docking server (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=PEPDOCK) was used to predict binding interaction of the selected peptides with MHC alleles and estimate the formation of MHC-peptide complexes. Peptide-protein flexible molecular docking analysis was performed separately, for each selected epitope with human and mouse MHC alleles. Additionally, the PDB files of MHC alleles were obtained from RCSB database (https://www.rcsb.org). The PDB IDs were 1OGT, 5HGA, 4UQ3, 3RL2, 3LKN, 1X7Q, 3SPV & 5EO1 for HLA-B27:05, HLA-A24:02, HLA-A02:01, HLA-A03:01, HLA-B35:01, HLA-A11:01, HLA-B08:01 & HLA-B07:02 alleles, respectively; and 4AH2, 2Q6W, 5LAX, 6CPL & 1FV for HLA-DRB1:0101, HLA-DRB1:0301, HLA-DRB1:0401, HLA-DRB1:1101 & HLA-DRB5:0101 HLA alleles, respectively. MHC alleles expressed by commonly used inbred mouse strains including H-2-Ld (PDB ID: 1LDP), H-2Kd (PDB ID: 5GSV), H-2-Dd (PDB ID:5IVX) and H-2-IAd (PDB ID: 2IAD) in BALB/c, H-2-Db (PDB ID: 1JUF), H-2-Kb (PDB ID:4PV9) and H-2-IAb (PDB ID: 4P23) in C57BL/6, and H-2-Ag7 (PDB ID: 1ESO) in NOD mice were used in this study for peptide-mouse MHC docking analyses.

Design of multiepitope peptide constructs

To design multiepitope peptide constructs (Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev), the predicted CTL and HTL epitopes with higher scores from in silico analyses were linked in tandem and the AAY proteolytic linker was used among epitopes to form fusion constructs.

Physicochemical features of the designed constructs

The physicochemical features of the designed constructs, such as molecular weight, negatively and positively charged residues and theoretical pI, were calculated by ProtParam tools (https://web.expasy.org/protparam/). Furthermore, Protein-Sol web server (https://protein-sol.manchester.ac.uk/) was used for prediction of construct solubility.

Secondary structure prediction

Secondary structures of the multiepitope vaccine constructs were predicted by RaptorX (http://raptorx2.uchicago.edu/StructurePropertyPred/predict/) and PSIPRED 4.0 (http://bioinf.cs.ucl.ac.uk/psipred/). PSIPRED 4.0 is a free prediction tool that was evaluated with a stringent cross-validation strategy and achieves an average Q3 score of 81.6% (Khatoon et al. 2017).

Tertiary structure prediction

To predict the tertiary structure of the multiepitope constructs, the Iterative Threading ASSEmbly Refinement (I-TASSER) web server (https://zhanglab.ccmb.med.umich.edu/I-TASSER/) was applied. The tertiary structure of a protein determines its biological function. I-TASSER is a hierarchical approach to protein structure prediction and structure-based function annotation. Starting from the amino acid sequence, it builds 3D atomic models from multiple threading alignments and iterative structural assembly simulations (Yang and Zhang 2015).

Refinement and model quality of tertiary structure

To refine the predicted tertiary structures, GalaxyRefine server (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE) was applied. This approach first rebuilds side chains and performs side-chain repacking and subsequent overall structure relaxation through molecular dynamics simulation. Moreover, to check quality of the predicted tertiary structures, the final refined models were evaluated by ERRAT web server (https://servicesn.mbi.ucla.edu/ERRAT). ERRAT web server was applied to calculate the overall quality factor (OQF) for non-bonded atomic interactions. Generally, OQF above 50% for any structure is considered as a high-quality model (Colovos and Yeates 1993).

B-cell epitope prediction

IEDB Bepipred linear epitope prediction tool (http://tools.iedb.org/bcell/) was used to predict B-cell epitopes of the Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev multiepitope constructs by default thresholds (Larsen et al. 2006).

Protein–protein docking between toll-like receptors and multiepitope constructs

Docking between toll-like receptors (TLRs) and the multiepitope constructs was performed using ClusPro 2.0 (https://cluspro.bu.edu). For this protein–protein docking, the final refined tertiary structures of the designed constructs were submitted as ligands for TLR-2, TLR-3, TLR-4, TLR-5, TLR-8 and TLR-9. The PDB files of the TLRs (TLR-2 PDB ID: 2Z7X, TLR-3 PDB ID: 1ZIW, TLR-4 PDB ID: 3FXI, TLR-5 PDB ID: 3J0A, TLR-8 PDB ID: 3W3G, and TLR-9 PDB ID: 3WPB) were extracted from the RCSB database (https://www.rcsb.org). The docking results were visualized with ChimeraX-1.1 software.

Antigenicity and allergenicity prediction of the designed constructs

Antigenicity of the designed constructs was estimated by VaxiJen v2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html). VaxiJen v2.0 predicts the antigenicity based on the physicochemical properties of the constructs with an alignment-independent algorithm. The allergenicity evaluation of the designed constructs was performed using AllergenFP v.1.0 web server (https://ddg-pharmfac.net/AllergenFP/).

Construction of pUC57-hsp70-gag-pol-env-nef-rev

The nucleotide sequence of hsp70-gag-pol-env-nef-rev was obtained using an amino acid reverse translation tool (http://www.bioinformatics.org/sms2/rev_trans.html), and the restriction enzyme sites were determined for the cloning procedure. The pUC57 cloning vector harboring the hsp70-gag-pol-env-nef-rev multiepitope DNA sequence (pUC57-hsp70-gag-pol-env-nef-rev) was prepared by Biomatik Corporation (Cambridge, Canada).
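Reverse translation and a check for restriction sites can be sketched in a few lines. The codon table below assigns one valid codon to each amino acid and is an illustrative choice, not the table used by the cited web tool. The enzyme recognition sequences are the standard ones for EcoRI, BglII, BamHI, XhoI and HindIII, the enzymes named in the Results. The input peptide and function names are hypothetical.

```python
# One valid codon per amino acid (illustrative choice, an assumption).
CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Standard recognition sequences; all five are palindromic, so
# scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
}


def reverse_translate(protein: str) -> str:
    """Return a DNA sequence encoding the protein, one codon per residue."""
    return "".join(CODON[aa] for aa in protein.upper())


def find_sites(dna: str) -> dict:
    """Map each enzyme to the 0-based start positions of its site in dna."""
    hits = {}
    for name, site in SITES.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            start = dna.find(site, start + 1)
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    dna = reverse_translate("MAAY")  # hypothetical peptide
    print(dna, find_sites(dna))
```

An enzyme intended for cloning should ideally have no recognition site inside the insert, so an empty result from `find_sites` for the coding region is the desirable outcome in this kind of check.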

Preparation of eukaryotic expression plasmids harboring multiepitope DNA constructs

For the generation of eukaryotic expression plasmids (pEGFP-N1-gag-pol-env-nef-rev and pEGFP-N1-hsp70-gag-pol-env-nef-rev), the gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev genes (without stop codon) were subcloned from pUC57-hsp70-gag-pol-env-nef-rev into the BglII/HindIII and XhoI/HindIII cloning sites of pEGFP-N1 expression vector (Clontech), respectively. The pEGFP-N1-gag-pol-env-nef-rev and pEGFP-N1-hsp70-gag-pol-env-nef-rev constructs were purified using plasmid extraction Mini-kit (Yekta Tajhiz Azma, Iran). The concentration and purity of DNA were determined by NanoDrop spectrophotometer. The presence of the inserted gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev fragments was confirmed by restriction enzyme digestion and detection on agarose gel electrophoresis.

In vitro expression of the designed DNA constructs in HEK-293 T cells

HEK-293 T cells (Pasteur Institute of Iran) were maintained in complete DMEM (Sigma) supplemented with 10% fetal bovine serum (FBS, Gibco) and pen/strep (100 U/ml penicillin and 0.1 mg streptomycin; Gibco) at 37 °C in a 5% CO2 atmosphere. About 5 × 10⁴ cells were then seeded into each well of a 24-well plate and transfected using Lipofectamine 2000 (cationic lipid, Invitrogen) as the in vitro transfection reagent. Lipofectamine/DNA complexes were generated by mixing 2 μl of Lipofectamine with 1 μg of either pEGFP-N1-gag-pol-env-nef-rev or pEGFP-N1-hsp70-gag-pol-env-nef-rev. The complexes were incubated at room temperature for 30 min and added to HEK-293 T cells in serum-free medium. Untransfected HEK-293 T cells served as a negative control, and HEK-293 T cells transfected with pEGFP-N1 served as a positive control. After 5 h of incubation at 37 °C, the medium was replaced with DMEM containing 5% FBS. The cells were harvested 48 h post-transfection, washed, and resuspended in PBS to determine the proportion of fluorescent cells expressing the hsp70-gag-pol-env-nef-rev and gag-pol-env-nef-rev genes by flow cytometry. The experiment was performed in duplicate and the results are shown as mean ± standard deviation (SD). The expression of the multiepitope peptide constructs was also detected by fluorescence microscopy and by western blot analysis using an anti-GFP antibody.

Results

CTL epitope prediction

The predicted 8 to 11-mer peptides of each HIV-1 protein (Gag, Pol, Env, Nef, Rev), and Hsp70 with high binding affinity scores for the highest number of HLA-I alleles were determined as high-potential CTL epitopes. Then, by adding one or more amino acids to each peptide and design of a longer epitope, HLA binding and population coverage were improved. Next, binding affinity of the selected epitope to mouse MHC-I alleles was also predicted. After that, the epitope candidates were screened based on their calculated MHC-I processing average scores, immunogenicity scores, and the population coverage among different geographic regions in the world. The top scoring epitopes were assessed for the degree of conservancy among different HIV-1 clades, allergenicity, toxicity and hemotoxicity. Two CTL epitopes derived from each HIV-1 protein and Hsp70 which had better results in the integrated in silico analyses were selected as final epitope candidates for construct design. The Gag (75–85), Gag (272–282), Pol (710–721), Pol (742–754), Env (167–178), Env (673–684), Nef (63–75), Nef (133–144), Rev (53–63), Rev (62–74), Hsp70(113–126), Hsp70(285–298) epitopes are the putative selected CTL epitopes. Human MHC-I binding affinity scores in NetMHCpan 4.0, and percentile ranks in IEDB servers for the selected CTL epitopes of HIV-1 proteins and Hsp70 were shown in Table 1. Also, binding prediction scores of putative HIV-1 proteins and Hsp70 CTL epitopes for mouse alleles were indicated in Table 2.

Table 1 Epitope binding prediction of putative HIV-1 proteins and Hsp70 CTL epitopes for human MHC-I alleles
Table 2 Epitope binding prediction of putative HIV-1 proteins and Hsp70 CTL and HTL epitopes for mouse alleles

HTL epitope prediction

The predicted 14 to 16-mer peptides of each HIV-1 protein (Gag, Pol, Env, Nef, Rev), and Hsp70 with high binding affinity scores for the highest number of HLA-II alleles were determined as high-potential HTL epitopes. The binding affinity of the selected epitopes to mouse MHC-II alleles was also predicted. Then, HTL epitope candidates were screened based on their calculated population coverage among different geographic regions in the world. Next, the selected epitopes with the best scores were assessed for the degree of conservancy among different HIV-1 clades, allergenicity, toxicity, hemotoxicity and ability of cytokines production. Based on the results obtained from the integrated analyses, two HTL epitopes from each Gag, Pol, Env, Hsp70 protein and one peptide from each Nef and Rev protein were selected and used for final multiepitope construct design. The Gag (263–276), Gag (270–283), Pol (1116–1129), Pol (1362–1375), Env (249–262), Env (744–757), Nef (183–197), Rev (11–24), Hsp70 (168–182), Hsp70(389–403) epitopes are the putative selected HTL epitopes. Human MHC-II binding affinity scores in NetMHCIIpan 3.2, and percentile ranks in IEDB servers for the selected HTL epitopes of HIV-1 proteins and Hsp70 were shown in Table 3. In addition, the binding prediction scores of putative HIV-1 proteins and Hsp70 HTL epitopes for mouse alleles were indicated in Table 2.
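The lengths of the selected epitopes follow directly from the residue ranges listed for the CTL and HTL epitopes. The sketch below tabulates them, assuming that the ranges are inclusive at both ends (an assumption, since the text does not state it); variable and function names are illustrative.

```python
# Residue ranges of the selected epitopes, copied from the text.
CTL_EPITOPES = {
    "Gag": [(75, 85), (272, 282)],
    "Pol": [(710, 721), (742, 754)],
    "Env": [(167, 178), (673, 684)],
    "Nef": [(63, 75), (133, 144)],
    "Rev": [(53, 63), (62, 74)],
    "Hsp70": [(113, 126), (285, 298)],
}
HTL_EPITOPES = {
    "Gag": [(263, 276), (270, 283)],
    "Pol": [(1116, 1129), (1362, 1375)],
    "Env": [(249, 262), (744, 757)],
    "Nef": [(183, 197)],
    "Rev": [(11, 24)],
    "Hsp70": [(168, 182), (389, 403)],
}


def span_length(start: int, end: int) -> int:
    """Length of an epitope, assuming inclusive start and end positions."""
    return end - start + 1


def length_range(epitopes: dict) -> tuple:
    """Return (shortest, longest) epitope length in a set of ranges."""
    lengths = [span_length(s, e)
               for spans in epitopes.values() for s, e in spans]
    return min(lengths), max(lengths)


if __name__ == "__main__":
    print("CTL length range:", length_range(CTL_EPITOPES))
    print("HTL length range:", length_range(HTL_EPITOPES))
```

Under the inclusive-range assumption, several CTL epitopes are longer than 11 residues, which is consistent with the statement that the predicted 8 to 11-mers were extended by one or more amino acids.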

Table 3 Epitope binding prediction of putative HIV-1 proteins and Hsp70 HTL epitopes for human MHC-II alleles

MHC-I processing and immunogenicity of CTL epitopes

The processing and immunogenicity scores of the selected MHC-I epitopes are shown in Table 4, which lists the median proteasome score, TAP score, processing score (proteasome and TAP scores), and total score (proteasome, TAP and MHC scores) for each selected epitope. The Gag (75–85), Gag (272–282), Pol (710–721), Pol (742–754), Env (167–178), Env (673–684), Nef (63–75), Nef (133–144), Rev (53–63), Rev (62–74), Hsp70(113–126), Hsp70(285–298) epitopes had the highest processing scores, indicating efficient proteasomal cleavage and TAP transport. The selected epitopes also had the highest immunogenicity scores in the IEDB immunogenicity predictor analysis.
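The relationship between the component scores can be sketched as follows, assuming that the processing score is the sum of the proteasome and TAP scores and that the total score adds the MHC score to it. The additive combination is an assumption made for illustration; the class name and the example values are hypothetical and are not taken from Table 4.

```python
from dataclasses import dataclass


@dataclass
class EpitopeScores:
    """Component scores for one epitope (hypothetical container)."""
    proteasome: float
    tap: float
    mhc: float

    @property
    def processing(self) -> float:
        # Assumption: proteasome and TAP scores combine additively.
        return self.proteasome + self.tap

    @property
    def total(self) -> float:
        # Assumption: the total adds the MHC score to the processing score.
        return self.processing + self.mhc


if __name__ == "__main__":
    # Illustrative values only.
    scores = EpitopeScores(proteasome=1.0, tap=0.5, mhc=-2.0)
    print(scores.processing, scores.total)
```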

Table 4 MHC-I processing prediction and immunogenicity scores of putative HIV-1 proteins and Hsp70 CTL epitopes

Population coverage and conservancy of CTL and HTL epitopes

Population coverage and conservancy results for the selected CTL and HTL epitopes are shown in Tables 5 and 6, respectively. All selected HTL and CTL epitopes had the highest population coverage among different geographic regions worldwide and the highest conservancy percentages.

Table 5 Population coverage and conservancy of putative HIV-1 proteins and Hsp70 CTL epitopes
Table 6 Population coverage and conservancy of putative HIV-1 proteins and Hsp70 HTL epitopes

Allergenicity, toxicity and hemotoxicity of CTL and HTL epitopes

All the selected HTL epitopes and the majority of the selected CTL epitopes were predicted to be non-allergenic. The Gag(75–85), Gag(272–282), Env(167–178) and Nef(133–144) CTL epitopes were predicted as allergens by AllergenFP v.1.0. Toxicity analysis indicated that none of the selected epitopes was toxic. The probability score in the hemotoxicity analysis is a normalized SVM score ranging between 0 and 1, where a score of 1 indicates a peptide very likely to be hemolytic and a score of 0 indicates a peptide very unlikely to be hemolytic. All of the selected CTL and HTL epitopes had probability scores of about 0.5 in the hemotoxicity analysis.

Cytokine production of HTL epitopes

The predicted ability of the selected HTL epitopes to induce IL-10, IL-4 and IFN-γ is shown in Table 7. The SVM thresholds for prediction of IL-10 and IL-4 induction were −0.3 and 0.2, respectively. Epitopes with scores above the corresponding threshold were regarded as IL-10 or IL-4 inducers. IFN-γ inducer epitopes had positive IFN-γ production SVM scores.
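The decision rules stated above can be written as a small function. This is a sketch only: the function name and the input scores in the example are hypothetical, and the thresholds are those quoted in the text.

```python
IL10_THRESHOLD = -0.3  # SVM threshold for IL-10 induction (from the text)
IL4_THRESHOLD = 0.2    # SVM threshold for IL-4 induction (from the text)


def cytokine_profile(il10_score: float, il4_score: float,
                     ifn_score: float) -> dict:
    """Return which cytokines an epitope is predicted to induce.

    IL-10 and IL-4: score must exceed the stated threshold.
    IFN-gamma: score must be positive.
    """
    return {
        "IL-10": il10_score > IL10_THRESHOLD,
        "IL-4": il4_score > IL4_THRESHOLD,
        "IFN-gamma": ifn_score > 0,
    }


if __name__ == "__main__":
    # Hypothetical scores for one epitope, not values from Table 7.
    print(cytokine_profile(il10_score=-0.1, il4_score=0.1, ifn_score=0.4))
```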

Table 7 Cytokine production of HIV-1 proteins and Hsp70 HTL epitopes

Peptide-protein molecular docking between the selected epitopes and MHC molecules

Top models that had the highest peptide-protein interaction similarity scores between each epitope and both human and mouse MHC-I and II molecules were selected as listed in Tables 8 and 9. Furthermore, Figs. 1 and 2 indicated the examples of successful peptide-protein docking between the selected epitopes and human and mouse MHC molecules.

Table 8 Interaction similarity scores of the selected putative HIV-1 proteins and Hsp70 CTL epitopes for human and mouse MHC-I using GalaxyPepDock flexible docking server
Table 9 Interaction similarity scores of the selected HIV-1 proteins and Hsp70 HTL epitopes for human and mouse MHC-II using GalaxyPepDock flexible docking server
Fig. 1

Molecular docking between CTL epitopes and human (a, b) and mouse (c, d) MHC class I alleles: a Successful peptide-protein docking between Nef 133–144 and HLA A2402 with interaction similarity score of 381.0; b Successful peptide-protein docking between Pol 710–721 and HLA B0801 with interaction similarity score of 268.0; c Successful peptide-protein docking between Hsp70 113–126 and H-2-Db with interaction similarity score of 344.0; d Successful peptide-protein docking between Nef 63–75 and H-2-Ld with interaction similarity score of 334.0

Fig. 2

Molecular docking between HTL epitopes and human (a–c) and mouse (d) MHC class II alleles: a Successful peptide-protein docking between Gag 263–276 and HLA-DRB1:0401 with interaction similarity score of 145.0; b Successful peptide-protein docking between Nef 183–197 and HLA DRB1:0301 with interaction similarity score of 184.0; c Successful peptide-protein docking between Hsp70 389–403 and HLA DRB1:0101 with interaction similarity score of 158.0; d Successful peptide-protein docking between Gag 270–283 and H-2-IAd with interaction similarity score of 125.0

Design of multiepitope constructs

High score predicted CTL epitopes derived from Hsp70, Gag, Pol, Env, Nef, and Rev proteins (two epitopes from each protein), and also high score predicted HTL epitopes derived from Hsp70, Gag, Pol, Env (two epitopes from each protein), Nef and Rev (one epitope from each protein) were constructed in tandem (22 epitopes in total). In addition, AAY linker was used as a proteasomal cleavage sequence between these peptides to improve processing. The methionine amino acid and histidine tag were added in the N- and C-terminal regions of both constructs, respectively. A schematic diagram of the multiepitope Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev constructs was shown in Supplementary Figs. 1 and 2, respectively.
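The tandem assembly described above (N-terminal methionine, epitopes joined by AAY linkers, C-terminal histidine tag) can be sketched as follows. The epitope strings in the example are placeholders rather than the actual selected sequences, and the six-residue length of the histidine tag is an assumption, since the text does not give it.

```python
from typing import Sequence

LINKER = "AAY"        # proteasomal cleavage linker named in the text
HIS_TAG = "H" * 6     # assumption: a 6xHis tag; length not stated in the text


def assemble_construct(epitopes: Sequence[str], linker: str = LINKER,
                       his_tag: str = HIS_TAG) -> str:
    """Join epitopes in tandem with a linker, then add Met and a His tag."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return "M" + linker.join(epitopes) + his_tag


def expected_length(epitope_lengths: Sequence[int], linker: str = LINKER,
                    his_tag: str = HIS_TAG) -> int:
    """Length of the assembled construct from epitope lengths alone."""
    n = len(epitope_lengths)
    return 1 + sum(epitope_lengths) + len(linker) * (n - 1) + len(his_tag)


if __name__ == "__main__":
    # Placeholder peptides, not the epitopes selected in this study.
    construct = assemble_construct(["KLT", "PEV"])
    print(construct, len(construct))
```

Because a linker sits only between neighbouring epitopes, a construct with n epitopes contains n − 1 linkers; for the 22-epitope construct this corresponds to 21 AAY linkers.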

Physicochemical properties and protein solubility of the designed constructs

The physicochemical properties of each designed construct obtained from ProtParam and Protein-Sol are summarized in Table 10. In the protein solubility analysis, a scaled solubility value greater than 0.45 is predicted to indicate higher solubility than the average soluble E. coli protein from the experimental solubility dataset, whereas a lower value is predicted to indicate lower solubility. The predicted solubility of the Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev constructs was 0.413 and 0.366, respectively; both values are below 0.45, so both constructs were predicted to be less soluble than the average soluble E. coli protein.

Table 10 Physicochemical properties of the designed constructs

Secondary structure prediction of the designed constructs

Prediction of the secondary structure of two multiepitope peptide constructs was performed by RaptorX and PSIPRED 4.0. The predicted Hsp70-Gag-Pol-Env-Nef-Rev structure was composed of 45% alpha-helix, 21% β-sheet, and 32% coil regions. Also, the obtained results from solvent accessibility prediction of residues indicated that 28% of residues are exposed, 31% of residues are medium exposed, and 39% of them are buried. Furthermore, 1% of positions were predicted as disordered. Meanwhile, the predicted Gag-Pol-Env-Nef-Rev structure was composed of 50% alpha-helix, 16% β-sheet, and 33% coil regions. The results of solvent accessibility prediction of residues determined that 36% of residues are exposed, 29% of residues are medium exposed, and 33% of amino acids are buried. Moreover, 2% of positions were predicted as disordered. The predicted secondary structures of two multiepitope peptide constructs by PSIPRED 4.0 were illustrated in Fig. 3.

Fig. 3

Predicted secondary structures of two multiepitope constructs by PSIPRED 4.0: a Sequence plot of Hsp70-Gag-Pol-Env-Nef-Rev; b Sequence plot of Gag-Pol-Env-Nef-Rev; c Secondary structure of Hsp70-Gag-Pol-Env-Nef-Rev; d Secondary structure of Gag-Pol-Env-Nef-Rev

Tertiary structure prediction of the designed constructs

The I-TASSER web server predicted five models for the tertiary structure. The confidence of each model was estimated by its C-score, a value that indicates the accuracy of the predicted model. The C-scores of the Hsp70-Gag-Pol-Env-Nef-Rev and Gag-Pol-Env-Nef-Rev multiepitope peptide constructs were −0.89 and −3.02, respectively. Figure 4 illustrates the predicted tertiary structures of the Hsp70-Gag-Pol-Env-Nef-Rev and Gag-Pol-Env-Nef-Rev multiepitope constructs.

Fig. 4

Predicted 3D structures of two multiepitope constructs by I-TASSER: a 3D structure of Hsp70-Gag-Pol-Env-Nef-Rev; and b 3D structure of Gag-Pol-Env-Nef-Rev

Refinement and model quality of tertiary structure

The top model of predicted tertiary structures for each construct was submitted to GalaxyRefine web server to refine the models. Then, the final refined models were subjected to ERRAT web server for checking the 3D structures as shown in Fig. 5. The overall quality factor predicted by the ERRAT web server was 96.1538, and 75.8741 for Hsp70-Gag-Pol-Env-Nef-Rev and Gag-Pol-Env-Nef-Rev constructs, respectively.

Fig. 5

Model quality of the 3D structures of two multiepitope peptide constructs: a Refined model of Hsp70-Gag-Pol-Env-Nef-Rev; b Refined model of Gag-Pol-Env-Nef-Rev; c The ERRAT plot of the Hsp70-Gag-Pol-Env-Nef-Rev construct 3D model, d The ERRAT plot of the Gag-Pol-Env-Nef-Rev construct 3D model

B-cell epitope prediction of the designed constructs

The sequence and position of the predicted B-cell epitopes for Hsp70-Gag-Pol-Env-Nef-Rev and Gag-Pol-Env-Nef-Rev constructs were shown in Table 11. Moreover, B-cell epitopes in two designed constructs were predicted by online server as indicated in Fig. 6a, b. The predicted B cell epitopes were indicated in the 3D structure of each construct, as well (Fig. 6c, d).

Table 11 B-cell epitopes predicted for the designed constructs
Fig. 6

Linear B-cell epitopes prediction of the designed constructs using BepiPred: a Predicted linear B-cell epitopes of Hsp70-Gag-Pol-Env-Nef-Rev; b Predicted linear B-cell epitopes of Gag-Pol-Env-Nef-Rev; c Illustrated linear B-cell epitopes of Hsp70-Gag-Pol-Env-Nef-Rev on 3D structure as shown in red color; d Illustrated linear B-cell epitopes of Gag-Pol-Env-Nef-Rev on 3D structure as shown in red color

Protein–protein docking between TLRs and multiepitope peptide constructs

Protein–protein docking between TLRs and the multiepitope peptide constructs was performed using ClusPro 2.0, and 30 models were built for each docking. Among them, we selected the models that properly occupied the receptor and had the lowest energy scores. The lowest energy scores achieved for docking of the Hsp70-Gag-Pol-Env-Nef-Rev construct with TLR-2, TLR-3, TLR-4, TLR-5, TLR-8 and TLR-9 were −1226.7, −1310.2, −1741.2, −1652.6, −1392.6 and −1253.3, respectively, as shown in Fig. 7. Those for docking of the Gag-Pol-Env-Nef-Rev construct with TLR-2, TLR-3, TLR-4, TLR-5, TLR-8 and TLR-9 were −1291.6, −1486, −1408.8, −1829, −1296.5 and −1387, respectively, as shown in Fig. 8. Lower energy scores indicated higher predicted binding affinity between the multiepitope peptide constructs and TLRs in the docked complexes.
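The docking energies reported above can be compared programmatically to identify, for each construct, the TLR with the lowest energy score. The values below are copied from the text; the variable and function names are illustrative.

```python
# Lowest ClusPro energy scores per TLR, as reported in the text.
HSP70_CONSTRUCT = {
    "TLR-2": -1226.7, "TLR-3": -1310.2, "TLR-4": -1741.2,
    "TLR-5": -1652.6, "TLR-8": -1392.6, "TLR-9": -1253.3,
}
GAG_CONSTRUCT = {
    "TLR-2": -1291.6, "TLR-3": -1486.0, "TLR-4": -1408.8,
    "TLR-5": -1829.0, "TLR-8": -1296.5, "TLR-9": -1387.0,
}


def lowest_energy_tlr(energies: dict) -> tuple:
    """Return (receptor, energy) for the most negative energy score."""
    receptor = min(energies, key=energies.get)
    return receptor, energies[receptor]


if __name__ == "__main__":
    print("Hsp70-Gag-Pol-Env-Nef-Rev:", lowest_energy_tlr(HSP70_CONSTRUCT))
    print("Gag-Pol-Env-Nef-Rev:", lowest_energy_tlr(GAG_CONSTRUCT))
```

With these values, the lowest energy is obtained with TLR-4 for the Hsp70-Gag-Pol-Env-Nef-Rev construct and with TLR-5 for the Gag-Pol-Env-Nef-Rev construct.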

Fig. 7

The protein–protein docking between Hsp70-Gag-Pol-Env-Nef-Rev multiepitope construct and TLRs: a Interaction of the multiepitope construct and TLR-2; b Interaction of the multiepitope construct and TLR-3; c Interaction of the multiepitope construct and TLR-4; d Interaction of the multiepitope construct and TLR-5; e Interaction of the multiepitope construct and TLR-8; f Interaction of the multiepitope construct and TLR-9; The multiepitope construct and TLRs were shown as colored ribbon and golden ribbon representation, respectively

Fig. 8

The protein–protein docking between Gag-Pol-Env-Nef-Rev multiepitope construct and TLRs: a Interaction of the multiepitope construct and TLR-2; b Interaction of the multiepitope construct and TLR-3; c Interaction of the multiepitope construct and TLR-4; d Interaction of the multiepitope construct and TLR-5; e Interaction of the multiepitope construct and TLR-8; f Interaction of the multiepitope construct and TLR-9; The multiepitope construct and TLRs were shown as colored ribbon and golden ribbon representation, respectively

Antigenicity and allergenicity of the designed constructs

The antigenicity and allergenicity predictions indicated that both multiepitope constructs were antigenic and non-allergenic. The threshold for antigenicity prediction was set at 0.4, and both the Hsp70-Gag-Pol-Env-Nef-Rev and Gag-Pol-Env-Nef-Rev constructs scored above this threshold.

Confirmation of the DNA constructs

The codon-optimized DNA sequence of the designed construct (hsp70-gag-pol-env-nef-rev) for E. coli was obtained with an amino acid reverse translation tool. EcoRI, BglII, BamHI, XhoI and HindIII restriction sites were included in the hsp70-gag-pol-env-nef-rev DNA construct. A schematic diagram of the multiepitope DNA construct is shown in Fig. 9. The gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev fragments were then subcloned from pUC57-hsp70-gag-pol-env-nef-rev into the pEGFP-N1 vector. Electrophoresis after digestion of the pEGFP-N1-gag-pol-env-nef-rev and pEGFP-N1-hsp70-gag-pol-env-nef-rev constructs with BglII/HindIII and XhoI/HindIII showed clear bands of ~888 and ~1122 bp corresponding to the gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev genes, respectively (Supplementary Fig. 3).

Fig. 9

The designed multiepitope Hsp70-Gag-Pol-Env-Nef-Rev DNA construct

In vitro expression of the DNA constructs in HEK-293 T cells

Transfection of pEGFP-N1-gag-pol-env-nef-rev and pEGFP-N1-hsp70-gag-pol-env-nef-rev into HEK-293 T cells was confirmed by fluorescence microscopy, flow cytometry and western blotting (Fig. 10 and Supplementary Fig. 4). The transfected cells appeared green under fluorescence microscopy. The percentages of cells expressing the gag-pol-env-nef-rev-gfp and hsp70-gag-pol-env-nef-rev-gfp genes were 56.95% ± 1.42 and 60.39% ± 0.55, respectively. The percentage of GFP-expressing cells after transfection with pEGFP-N1 (positive control) was 77.50% ± 2.93. Expression of Gag-Pol-Env-Nef-Rev-GFP, Hsp70-Gag-Pol-Env-Nef-Rev-GFP and GFP in the transfected cells was confirmed by western blot analysis as clear bands of ~63, ~72 and ~27 kDa, respectively. No band was observed in untreated cells.
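
As a rough plausibility check on the band sizes, the expected mass of each GFP fusion can be estimated from the insert length. The sketch below is a back-of-the-envelope calculation, not an analysis from the study: it assumes an average residue mass of about 0.110 kDa (a common rule of thumb), treats the whole ~888 or ~1122 bp fragment as coding sequence, and ignores any residues between the insert and GFP. All names are hypothetical.

```python
AVG_RESIDUE_MASS_KDA = 0.110  # assumed rule-of-thumb average mass per amino acid
GFP_MASS_KDA = 27.0           # GFP band size reported in the text


def estimated_fusion_mass_kda(insert_bp):
    """Estimate the mass of an insert-GFP fusion from the insert length in bp."""
    residues = insert_bp / 3  # three nucleotides per codon
    return residues * AVG_RESIDUE_MASS_KDA + GFP_MASS_KDA


# Approximate fragment sizes (bp) and observed western blot bands (kDa) from the text.
inserts_bp = {"gag-pol-env-nef-rev": 888, "hsp70-gag-pol-env-nef-rev": 1122}
observed_kda = {"gag-pol-env-nef-rev": 63, "hsp70-gag-pol-env-nef-rev": 72}

for name, bp in inserts_bp.items():
    estimate = estimated_fusion_mass_kda(bp)
    print(f"{name}-gfp: estimated ~{estimate:.1f} kDa, observed ~{observed_kda[name]} kDa")
```

Under these assumptions the estimates fall within a few kDa of the observed bands, which is about as much as such an approximation can show.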

Fig. 10

The transfection efficiency of pEGFP-N1-gag-pol-env-nef-rev and pEGFP-N1-hsp70-gag-pol-env-nef-rev using flow cytometry and fluorescent microscopy: a Untransfected HEK-293 T cells as negative control (~ 0.5%), b Transfected HEK-293 T cells with pEGFP-N1-hsp70-gag-pol-env-nef-rev (~ 60.39%), c Transfected HEK-293 T cells with pEGFP-N1-gag-pol-env-nef-rev (~ 56.95%), d Transfected HEK-293 T cells with pEGFP-N1 as positive control (~ 77.50%)

Discussion

An effective vaccine stimulates HIV-1-specific immune responses through its ability to enhance CTL and Th cell activities. CTL-mediated responses play a critical role in controlling virus infection. In addition, Th cell-mediated immunity is important for promoting a functional CD8+ CTL response and prolonging antibody responses, leading to protection and viral load reduction (Abdulla et al. 2019; Lopez Angel and Tomaras 2020). Development of new immunoinformatics tools for analyzing HIV-1 proteins and identifying their polyfunctional T-cell epitopes, especially when used in combination with in vivo analyses, can improve immunogen design for HIV-1 vaccines (Khairkhah et al. 2018). Among various vaccine strategies, a multiepitope DNA vaccine encoding conserved T-cell epitopes is an appropriate approach for therapeutic vaccine design (Milani et al. 2020). Because human immune responses are multi-specific and broad (recognizing several proteins from one pathogen and various epitopes from one antigen), candidate epitopes can be selected from multiple viral proteins to form a single immunogen construct (Kardani et al. 2020). Since epitope binding to MHC molecules is a critical step for antigen presentation to T cells, two MHC-peptide binding predictors were used simultaneously in the present study to predict top-scoring T-cell epitopes. We required each selected peptide to bind the highest possible number of MHC alleles with high binding affinity. Epitopes with the highest binding affinity for several MHC molecules and high immunogenicity and conservancy scores were selected as the most potent epitopes for vaccine design. For example, the selected Pol (742–754) epitope bound to the 16 most common HLA-I alleles, including five HLA supertypes (such as HLA-A*03:01, HLA-A*24:02, HLA-B*39:01, HLA-B*15:01, HLA-B*07:02 and HLA-B*08:01), with an average IEDB percentile rank of about 0.28. Also, molecular docking analysis for the Pol (742–754) epitope showed the best interaction similarity scores between this epitope and the HLA-B*27:05, HLA-A*24:02, HLA-A*02:01, HLA-A*03:01, HLA-B*35:01, HLA-A*11:01, HLA-B*08:01 and HLA-B*07:02 alleles. Vaccine candidates are usually tested in mouse models; thus, in the current study, the binding affinity of the predicted epitopes to mouse MHC-I and MHC-II molecules was also analyzed with two binding predictors and molecular docking. For example, the best binding affinity of the Pol (742–754) epitope among mouse MHC-I alleles and the highest interaction similarity score in molecular docking were both predicted for the mouse H-2-Ld allele.

The percentile rank (PR) in the IEDB MHC-I binding prediction method is generated by comparing the half-maximal inhibitory concentration (IC50) of a peptide against those of a set of random peptides from the Swiss-Prot database. A lower IC50 or PR indicates higher binding affinity (Vita et al. 2015). Because an IC50 value below 500 nM for an epitope was associated with about 90% immunogenicity (Fleri et al. 2017), selecting potential binders with IC50 values between 50 and 500 nM suggested that almost all of our predicted epitopes were potential immunogens. As the IEDB team recommends using the class I immunogenicity predictor to supplement binding predictions and reduce the number of candidate epitopes (Fleri et al. 2017), we also examined the immunogenicity of the candidate peptides, based on their amino acid composition, with the IEDB immunogenicity predictor. Effective antigen processing and appropriate presentation to the immune system are essential for inducing a potent CD8+ T-cell response (Khairkhah et al. 2018). Therefore, in the present study, we analyzed T-cell antigen processing with the IEDB combined predictor, which combines MHC binding with other steps of the MHC class I cellular pathway and significantly increases the accuracy of class I epitope prediction (Fleri et al. 2017). The processing score in this method predicts T-cell epitope candidates independently of MHC restriction by combining the proteasome and TAP scores. All of our selected MHC-I epitopes had positive scores in these prediction tools. A higher processing score indicates a better outcome of antigen processing (Tenzer et al. 2005; Fleri et al. 2017).
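
The IC50 window described above amounts to a simple filter over prediction records. The following is a minimal sketch of that filtering step, assuming predictions have already been exported from a binding predictor; the `Prediction` record, the function name and all peptide, allele and score values are hypothetical placeholders rather than data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    peptide: str            # placeholder identifier, not a real sequence
    allele: str             # placeholder allele label
    ic50_nm: float          # predicted IC50 in nM
    percentile_rank: float  # lower rank means stronger predicted binding


def select_binders(predictions, ic50_min=50.0, ic50_max=500.0):
    """Keep predictions inside the IC50 window, best percentile rank first."""
    kept = [p for p in predictions if ic50_min <= p.ic50_nm <= ic50_max]
    return sorted(kept, key=lambda p: p.percentile_rank)


# Made-up example records for illustration only.
predictions = [
    Prediction("peptide_1", "allele_A", 120.0, 0.40),
    Prediction("peptide_2", "allele_A", 950.0, 3.10),  # above 500 nM: dropped
    Prediction("peptide_3", "allele_B", 75.0, 0.25),
    Prediction("peptide_4", "allele_B", 20.0, 0.05),   # below 50 nM: outside the window
]

for hit in select_binders(predictions):
    print(hit.peptide, hit.allele, hit.ic50_nm, hit.percentile_rank)
```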

Vaccine immunogens should overcome both HIV-1 antigenic variability and the high diversity of HLA tissue types. In this study, long peptides containing different epitopes with multiple HLA binding specificities had higher population coverage than single short epitopes. For example, the Nef (133–144) epitope selected in our study (RYPLTFGWCYKL) has an extra arginine at the beginning of its sequence compared with the Nef (134–144) epitope (YPLTFGWCYKL) proposed by Khairkhah et al. Their Nef (134–144) epitope could bind to nine HLA-I alleles, with 49.38% population coverage worldwide and 70.24% in Iran (Khairkhah et al. 2018). In contrast, our selected Nef (133–144) epitope, with the additional arginine, could bind to 13 HLA-I alleles, with 86.54% and 89.34% population coverage worldwide and in Iran, respectively. Also, the multiepitope constructs had a high cumulative population coverage, especially in regions with a high prevalence of HIV-1. In our study, the epitope conservancy scores for the majority of the final selected epitopes were more than 80% within group M HIV-1 subtypes. High conservancy across HIV-1 subtypes provides broader immunity and reduces the risk of viral immune evasion. None of the selected epitopes was predicted to be toxic or strongly hemotoxic. Furthermore, most of the selected epitopes were not allergenic.
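
Why binding to more alleles raises population coverage can be illustrated with a simplified model. The sketch below is not the IEDB population coverage tool; it assumes Hardy–Weinberg proportions at each HLA locus and independence between loci, and the allele frequencies are invented for illustration.

```python
def locus_coverage(allele_freqs):
    """Fraction of people carrying at least one covered allele at one locus.

    Assumes Hardy-Weinberg proportions: with total covered allele frequency p,
    the chance that neither of the two chromosomes carries a covered allele
    is (1 - p) ** 2.
    """
    p = min(sum(allele_freqs), 1.0)
    return 1.0 - (1.0 - p) ** 2


def population_coverage(freqs_by_locus):
    """Combine per-locus coverage, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in freqs_by_locus.values():
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Invented allele frequencies: the second epitope binds one more allele per locus.
narrow_epitope = {"HLA-A": [0.10], "HLA-B": [0.05]}
broad_epitope = {"HLA-A": [0.10, 0.15], "HLA-B": [0.05, 0.08]}

print(f"narrow: {population_coverage(narrow_epitope):.1%}")
print(f"broad:  {population_coverage(broad_epitope):.1%}")
```

With these made-up frequencies, adding one allele per locus roughly doubles the covered fraction, which mirrors the direction of the difference reported for Nef (133–144) and Nef (134–144).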

It is important to assess the immune responses generated by epitopes for their rational selection in vaccine development. Because certain residues and motifs in an epitope are responsible for inducing a specific cytokine, in silico cytokine prediction tools give a general view of the capability of T-cell epitopes to induce multiple cytokines in a simple, fast and inexpensive manner compared with in vitro and in vivo immunological tests (Dhanda et al. 2013; Nagpal et al. 2017). IFN-γ is the signature cytokine of both adaptive and innate immunity, with antiviral, immunoregulatory and anti-tumor activities. IFN-γ secretion is the major arm of the Th1 response and is critical for the reduction of HIV-1 viral load (Cheng et al. 2017). IL-10 plays an important role in the balance between protective responses and immunopathology of infection (Brockman et al. 2009), and IL-4 is a well-known cytokine of the Th2 response (Dhanda et al. 2013). In the current study, we assessed the predicted ability of each selected HTL epitope to induce IFN-γ, IL-10 and IL-4 production. The majority of our HTL epitopes were IL-10 inducers based on their predicted SVM scores. Some reports suggested that IL-10 had anti-HIV activity by blocking the production of inflammatory cytokines (Weissman et al. 1994), and additional studies showed that IL-10-secreting T cells reduced HIV replication in pregnant women (Bento et al. 2009) and elderly patients (Andrade et al. 2007). Thus, our IL-10-inducing epitopes may have a potential anti-HIV effect based on in silico studies. In addition, the IFNepitope server was used to predict IFN-γ-inducing peptides among the MHC class II binders. The data showed that about half of our HTL epitopes were IFN-γ inducers. IFN-γ production is associated with HIV-specific T-cell immunogenicity and induction of Th1 responses (Sanou et al. 2012). Also, about half of our HTL epitopes were IL-4 inducers, which is associated with induction of Th2 responses (Dhanda et al. 2013). These data suggest that our selected HTL epitopes have the potential to induce both Th1 and Th2 responses in vivo.

For 3D modeling of the Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev constructs, the I-TASSER server was used. The C-score, which estimates the accuracy of a 3D structure, is normally in the range of −5 to 2, and a higher C-score indicates a better-quality prediction (Namvar et al. 2020). Our data indicated that the accuracy of the Hsp70-Gag-Pol-Env-Nef-Rev model was greater than that of the Gag-Pol-Env-Nef-Rev model, and the quality of the predicted 3D structures improved after refinement. The final Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev 3D models were used as input for B-cell linear epitope prediction and for protein–protein docking analysis between the TLRs and our designed constructs. TLRs play a principal role in activating innate immunity, as they recognize pathogens and then induce the adaptive immune system (Martinsen et al. 2020). For example, TLR2 and TLR4 recognize viral structural proteins and induce inflammatory cytokine production, and TLR3 triggers HIV-1-mediated activation of dendritic cells (Abdulla et al. 2019). In the current study, the in silico analysis of the interaction of Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev with TLR2, TLR3, TLR4, TLR5, TLR8 and TLR9 showed strong binding affinity with low energy scores. The results suggested that the designed constructs could stimulate TLRs and induce downstream pathways to produce pro-inflammatory cytokines against HIV infection. An efficient HIV-1 therapeutic vaccine should induce both humoral and cellular immune responses. Using the BepiPred linear epitope prediction tool, several B-cell epitopes that can induce humoral immune responses were identified in both constructs.

Although we identified novel HIV-1 epitopes in this study, some of our selected epitopes were similar to those reported in other studies. In the DALIA phase II trial, an HIV-1 therapeutic vaccine containing DCs loaded with five long peptides (Gag 17–35, Gag 253–284, Nef 66–97, Nef 116–145 and Pol 325–355) was applied to induce T-cell immune responses in healthy volunteers (Salmon-Céron et al. 2010). Our predicted CTL epitopes Gag (263–276) and Gag (270–283) were present in the Gag 253–284 long peptide sequence of the DALIA vaccine construct. The data from the DALIA phase II trial indicated that the Gag 253–284 peptide, with a large binding capacity to the 20 most common HLA-DRB1 alleles, is probably the most important HIV-1 region to include in a therapeutic vaccine (Surenaud et al. 2019).

Our selected Gag (263–276) and Gag (270–283) epitopes could bind to the 14 most frequent HLA-DRB1 alleles with high binding affinity. Also, the cumulative population coverage for these two epitopes was 82.34% worldwide, and their cross-clade conservancy was more than 90%. Furthermore, one of the five conserved Gag epitopes in the tHIV-1consvX therapeutic vaccine was RMYSPTSI, which could suppress replication of circulating HIV-1 in vaccinated individuals (Murakoshi et al. 2018). This sequence is contained in our selected CTL Gag epitope, Gag (270–283): LNKIVRMYSPTSIL. Mothe et al. identified the immunogenic epitopes of functional CD8+ T-cell regions obtained from more than 1000 HIV-1-infected individuals. These CD8+ T-cell regions were associated with low viral load in vaccinated individuals (Mothe et al. 2015). Our predicted Nef (63–75) CTL and Pol (1362–1375) HTL epitopes overlapped extensively with the Nef (WLEAQEEEEVGFPVRPQV) and Pol (TKIQNFRVYYRDSRDPLW) regions identified by Mothe et al. Another predicted CTL epitope in our study, Nef (133–144), was present in the TCI (T-cell immunogen) artificial polyepitope designed by Karpenko et al. based on the Los Alamos HIV-1 Molecular Immunology Database, which could induce both specific T-cell and antibody responses in the vaccinated group (Karpenko et al. 2018). These findings support the reliability and accuracy of our methods for predicting valuable epitopes from the HIV-1 Gag, Pol, Env, Nef and Rev proteins. In addition, the Hsp70 epitopes that bound effectively to all prevalent MHC-I and MHC-II alleles were determined by an in silico approach. Matsui et al. identified Hsp70-derived epitopes that bound to the HLA-A*24:02, A*02:01 and A*02:06 alleles (Matsui et al. 2019). Some of these epitopes, which induced potent immune responses, had partial sequence identity with our selected Hsp70 (113–126), Hsp70 (168–182) and Hsp70 (389–403) epitopes. Also, Faure et al. identified two Hsp70 epitopes (p391 and p393) that could induce an in vivo CTL response. Their results showed that humans and mice, which may not respond to the whole Hsp70 protein, could mount a CTL response against the p391 and p393 epitopes (Faure et al. 2004). Our selected Hsp70 (389–403) epitope, QDLLLLDVAPLSLGL, could bind strongly to 11 HLA-DRB1 alleles and covers both the p391 and p393 sequences (LLDVAPLSL and LLLLDVAPL epitopes). This indicates that our selected HTL epitopes, such as Hsp70 (389–403), could also bind to the most frequent human MHC-I allele and induce a CTL response.
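
The containment relationships described above can be checked directly on the sequences given in the text. The sketch below is a simple substring search; the helper name `locate` is hypothetical, and the residue numbering assumes that each selected epitope starts at the first position of its stated range.

```python
def locate(short, long_seq, long_start):
    """Return the (start, end) residue numbers of `short` within `long_seq`.

    `long_start` is the residue number of the first position of `long_seq`.
    Returns None when `short` does not occur in `long_seq`.
    """
    idx = long_seq.find(short)
    if idx < 0:
        return None
    start = long_start + idx
    return (start, start + len(short) - 1)


# (published epitope, selected epitope from this study, first residue number)
checks = [
    ("RMYSPTSI", "LNKIVRMYSPTSIL", 270),    # Gag (270-283)
    ("LLDVAPLSL", "QDLLLLDVAPLSLGL", 389),  # Hsp70 (389-403)
    ("LLLLDVAPL", "QDLLLLDVAPLSLGL", 389),  # Hsp70 (389-403)
]

for short, long_seq, first in checks:
    print(short, "->", locate(short, long_seq, first))
```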

Overall, our data indicated that the Gag-Pol-Env-Nef-Rev and Hsp70-Gag-Pol-Env-Nef-Rev constructs designed from the selected CTL and HTL epitopes were antigenic, stable, non-allergenic and non-toxic. Moreover, the AAY linker, which serves as a proteasomal cleavage site in mammalian cells, prevents the formation of ‘junctional epitopes’ and enhances epitope presentation (Yang et al. 2015). The designed multiepitope peptide constructs were then reverse-translated into their nucleotide sequences for development of the DNA vaccine candidates. The results of DNA transfection in mammalian cells indicated that both the gag-pol-env-nef-rev and hsp70-gag-pol-env-nef-rev gene constructs were successfully expressed in vitro. These constructs will be used to immunize animals in the near future.
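
How a linker-joined construct is assembled, and which peptide windows span a junction, can be sketched as follows. This is an illustration only: it joins just two of the epitope sequences quoted in the Discussion with the AAY linker, which is not the actual epitope order or content of the designed constructs, and the function names and the window length of nine residues are assumptions.

```python
LINKER = "AAY"  # linker named in the text


def build_construct(epitopes, linker=LINKER):
    """Concatenate epitopes with the linker between neighbours."""
    return linker.join(epitopes)


def junction_windows(epitopes, linker=LINKER, k=9):
    """List the k-mers of the construct that are not contained in one epitope.

    These windows overlap a linker or a junction, so they are the candidates
    one would screen for unintended 'junctional epitopes'.
    """
    construct = build_construct(epitopes, linker)
    spans, pos = [], 0
    for epitope in epitopes:
        spans.append((pos, pos + len(epitope)))
        pos += len(epitope) + len(linker)
    windows = []
    for start in range(len(construct) - k + 1):
        end = start + k
        if not any(s <= start and end <= e for s, e in spans):
            windows.append(construct[start:end])
    return windows


# Gag (270-283) and Nef (133-144) sequences as quoted in the text.
epitopes = ["LNKIVRMYSPTSIL", "RYPLTFGWCYKL"]
construct = build_construct(epitopes)
spanning = junction_windows(epitopes)

print(construct, len(construct))
print(len(spanning), "windows span the linker region")
```

In practice, each junction-spanning window would be passed through the same MHC binding predictors used for epitope selection to confirm that the linker does not create new binders.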

In conclusion, two novel multiepitope constructs were designed based on conserved and highly immunogenic MHC class I and II epitopes derived from HIV-1 proteins and Hsp70. The designed multiepitope DNA constructs were successfully expressed in mammalian cells. Although our bioinformatics approach proved effective for the rapid selection of potential epitopes, further studies are needed to assess the immunological effects of the designed constructs for development of a DNA-based vaccine candidate.