Introduction

Coronavirus Disease 2019 (COVID-19), which emerged in 2019 and spread as a pandemic, is caused by SARS-CoV-2, a novel strain of coronavirus. Other respiratory illnesses such as SARS and MERS are also caused by viruses of the coronavirus family (Poutanen 2018). These viruses cause mild to severe respiratory infections in mammals and birds. Between December 2019 and April 2020, COVID-19 spread rapidly as a pandemic, affecting 1,119,109 people in 181 countries (statistics up to manuscript submission). While the death toll exceeded 58,955 people across different countries, recoveries were more numerous, at more than 226,873 people (Johns Hopkins University 2020a, b; World Health Organization 2020).

Although the disease is highly contagious, only ~20% of infected individuals needed hospitalization for severe illness. However, the most severe cases demand intensive medical attention and consume major hospital resources, from patient beds and intensive care units (ICU) to respiratory support and ventilators. With limited availability of these healthcare resources, the situation became unmanageable (Grasselli et al. 2020; Rosenbaum 2020).

While age and symptoms matter in COVID-19 patient care, they are not necessarily accurate predictors of disease severity and deaths. Molecular biomarkers like C-reactive protein, interleukin-6 (IL-6), eosinophils and chest X-ray/computed tomography (CT) findings like lung consolidation, lung lesions and fibrosis have shown significant associations with COVID-19 disease severity as evidenced by several studies (Huang et al. 2020; Yun et al. 2021; Broman et al. 2020; Hashemian et al. 2020).

Since only a few of these markers are tested clinically in COVID-19 patients, a robust, high-throughput technique like AI-enabled multiomics can integrate them and assist in rapid screening of patients for efficient stratification. In this way, valuable information such as disease severity can be acquired quickly (Kumar et al. 2020).

We at iNDX.Ai, as AI innovators in precision oncology, are also endeavoring to repurpose our platforms, iCore (image analytics) and iCE (Integrated Correlation Engine, an AI-based multiomics investigation software), for COVID-19 investigation, so as to make rapid use of enormous patient data and achieve powerful patient stratification and feature prediction. Because knowledge of the molecular aspects of COVID-19 infection is still limited, an AI-enabled comprehensive evaluation of the host molecular pathways involved in SARS-CoV-2 replication could also facilitate a better understanding of the disease. This review article thus aims to address the combined benefits of molecular biomarkers and clinical findings in efficient COVID-19 patient screening and stratification.

Current status of COVID-19

Disease origin/progression mechanisms

The infection was first detected in Wuhan, Hubei province, China, and is believed to have spread from a wildlife wet market. The disease is caused by SARS-CoV-2, a strain that is phylogenetically related to severe acute respiratory syndrome-like (SARS-like) bat viruses, raising the possibility that bats are its primary hosts. After entering the human body, the virus incubates and replicates before infecting multiple cells, causing symptoms such as upper respiratory tract infection, fever, sore throat, myalgias, etc. However, some infected individuals are asymptomatic while the disease advances internally. A list of immediate and persistent symptoms is presented in Table 1.

Table 1 Table presents immediate and persistent symptoms of COVID-19 infection

With its rapid spread, COVID-19 has reached more than 100 countries across the world, with varying degrees of illness and fatality. After the World Health Organization (WHO) declared it a pandemic, infections started overwhelming hospitals, healthcare professionals and medical resources (World Health Organization 2020). There are no clinically approved drugs/therapies for COVID-19; however, broad-spectrum antibodies are under investigation in clinical trials. Vaccine investigations are also ongoing, with several efforts approved by the US Food and Drug Administration (FDA) (Shereen et al. 2020; The New York Times 2021).

Factors of predisposition and patient severity

While any healthy individual in a population is vulnerable to SARS-CoV-2 infection, several health factors and comorbidities raise the risk of both infection and death for some individuals. Smoking and alcohol consumption may be major lifestyle risk factors for COVID-19 infection. However, there is no large-scale statistical evidence to confirm this.

In a retrospective multicenter cohort study of affected patients in Wuhan, China, several comorbidities were explored. Hypertension was the most common, identified in ~30% of the affected individuals, making it the leading susceptibility factor. Diabetes was the second most common susceptibility factor (19%), followed by coronary heart disease (8%). Old age was also identified as a significant factor for infection severity and death (Zhou et al. 2020).

Global diagnostic tools

As the virus rapidly progresses to the respiratory tract over 5–7 days of infection, a nasopharyngeal (NP) swab and/or an oropharyngeal (OP) swab is recommended for screening and early diagnosis (Tang et al. 2020). For more severe symptoms, lower respiratory specimens such as sputum, endotracheal aspirate or bronchoalveolar lavage are examined. Blood and stool samples are also considered for additional examinations. Since coronaviruses are enveloped single-stranded ribonucleic acid (RNA) viruses, laboratory diagnosis is performed using real-time reverse transcription polymerase chain reaction (rRT-PCR) (Center for Infectious Disease Research and Policy 2020).

In addition, chest X-rays and CT scans are regularly captured to diagnose severe respiratory conditions like pneumonia in critically ill patients. Viral genome sequencing is also being developed to monitor evolving mutations in the virus; however, it has not been widely adopted for clinical investigation (World Health Organization 2020; ScienceDaily 2020).

Treatment/control methods

As per Centers for Disease Control and Prevention (CDC), there are no current preventive or treatment drugs approved by FDA. However, several drugs and strategies are being utilized for palliative care. As a measure for supportive care, oxygen supplementation and mechanical ventilation are being utilized in intensive care facilities for COVID-19 infection (Centers for Disease Control and Prevention 2020).

Several broad-spectrum antiviral drugs such as remdesivir are being extensively investigated for treating COVID-19 infected patients. Certain antimalarial and anti-inflammatory drugs such as hydroxychloroquine and chloroquine are also under clinical trials for use as prophylactic and therapeutic for COVID-19 infections (Centers for Disease Control and Prevention 2020). A list of commonly utilized medications for management of COVID-19 patients is presented in Table 2. Precautions, strict quarantining and patient isolation are also effective strategies followed for containing the viral spread and transmission (Centers for Disease Control and Prevention 2020).

Table 2 Table presents a list of COVID-19 primary management medications, drug class and treatment indications

Challenges in clinical settings during viral pandemics

Viral pandemics have claimed millions of lives across the world, a history that includes the well-known “Spanish flu pandemic” of 1918. Death rates recorded back then were extremely high across the globe, with contributing factors such as availability of and access to healthcare facilities, public health infrastructure, population density, nutritional factors, etc.

Apart from frontline medical resources, intense research on viral characteristics, drug/vaccine discovery in limited time span and large scale production of the discovered vaccines are all equally challenging. However, economic and technological developments are powerful resources against evolving global emergencies (Oshitani et al. 2008).

The recent COVID-19 pandemic has been a clinically challenging threat to more than 100 countries, with Europe, North America and China hit the hardest. Novel viruses such as SARS-CoV-2 display continuously evolving genomic patterns and sequences, which could also vary among different races or ethnicities. In such cases, treatment/management decisions cannot depend only upon clinical manifestations. This adds to the challenge, as the viral sequences and/or protein structures may not be available for deeper investigation (Li et al. 2020). Several retrospective analyses of integrated patient data, including deoxyribonucleic acid (DNA), RNA, host immune profile, cytokine profile, and blood and biochemical reports, yield direct correlations with patient recovery, disease severity and even death (Thevarajan et al. 2020). Several other technological developments such as virtual reality, AI, AI-assisted robots, AI-assisted chatbots, remote and virtual diagnosis and consultation, etc., also aim to assist healthcare professionals with more rapid and accurate solutions (BROOKINGS 2020; The British Broadcasting Corporation 2020).

This article is one such endeavor to emphasize existing clinical challenges and the routes through which high-throughput AI-enabled multiomics is beginning to assist healthcare during the COVID-19 pandemic crisis, from disease prediction and patient stratification to drug/vaccine discovery.

Role of AI-enabled multiomics in COVID-19 investigation

The aggressive nature of COVID-19 has led to its intractable spread across various countries rapidly increasing the death rates. Due to its unpredictable complexity and novelty, clinical diagnosis and treatment are burdensome for healthcare professionals.

Given the rapidly progressing nature of the virus, a high-throughput approach that detects physical and molecular characteristics more quickly is essential to maintain diagnostic accuracy and enable drug/vaccine development. Thus, several AI-enabled platforms are entering COVID-19 investigation, from diagnosis to discovery (UC San Diego Health 2020). With better algorithms and quicker analysis, faster patient care can be adopted clinically. In this section, we discuss some of the AI-enabled tools and their usefulness in COVID-19 interventions.

AI-enabled multiomics and patient stratification

SARS-CoV-2 has been found to cause infections irrespective of an individual’s current or previous medical conditions. However, multiple host-dependent molecular factors contribute to recovery or serious illness in addition to clinical manifestations. Clinically, most infected patients have been documented to present varying degrees of fever, cough, fatigue, myalgias and shortness of breath, to name a few. However, some are physically asymptomatic, with or without internal advancement of the disease, as mentioned earlier (Qiu et al. 2020; Pan et al. 2020; World Health Organization 2020). This suggests that clinical presentation alone cannot predict how far the virus has penetrated and progressed internally.

For instance, a case study published in 2020 by Thevarajan et al. used a multiomics approach to investigate the host immune responses of a 47-year-old female COVID-19 patient. Retrospective investigation at different time points throughout her infection showed that several factors, including antibody-secreting cells (ASCs), follicular helper T cells (TFH cells), activated CD4+ T cells, CD8+ T cells, and immunoglobulin M (IgM) and immunoglobulin G (IgG) antibodies, were elevated. Cytokine and chemokine profiling during the symptomatic phase of the disease showed decreased pro-inflammatory cytokines and chemokines. Genetic testing also revealed a risk genotype for influenza in the Interferon Induced Transmembrane Protein 3 (IFITM3) gene (rs12252). Interestingly, 26% of the Chinese population carried the risk genotype as per the 1000 Genomes Project, which was surmised to reflect the COVID-19 spectrum in China (Thevarajan et al. 2020). This case study illustrates the diverse spectrum of host-dependent molecular parameters that could contribute to patient recovery in COVID-19 disease. However, healthcare professionals need to stay focused on treating patients. Thus, rigorously trained AI-enabled multiomic models could efficiently stratify patients, promoting appropriate care, faster recovery and proper allocation of hospital resources.

Polymorphisms in genes such as Angiotensin I Converting Enzyme 2 (ACE2) also contribute to increased vulnerability to COVID-19 infection (Alimadadi et al. 2020). AI-based algorithms can be trained on large genomic datasets and used as powerful tools to screen large populations for polymorphisms at high throughput. Neural network-based algorithms can also be used for large-scale screening and prediction of COVID-19 infections from complex data such as patient respiratory patterns (Alimadadi et al. 2020).
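As a minimal illustration of what polymorphism screening involves at its simplest, the Python sketch below counts carriers of a risk allele in a small genotype table. The sample identifiers, the genotypes and the `risk_allele` argument are hypothetical placeholders, not data from any of the cited studies; an actual screening pipeline would read variant calls from sequencing output and would use trained models rather than a plain count.

```python
from collections import Counter

def screen_risk_allele(genotypes, risk_allele):
    """Summarize carriers of `risk_allele` in a {sample_id: genotype} mapping.

    Genotypes are two-letter strings such as "CT" (one letter per allele).
    """
    counts = Counter()
    carriers = []
    for sample_id, genotype in genotypes.items():
        # Number of copies of the risk allele carried by this sample (0, 1 or 2).
        copies = genotype.upper().count(risk_allele.upper())
        counts[copies] += 1
        if copies > 0:
            carriers.append(sample_id)
    total = len(genotypes)
    return {
        "carriers": sorted(carriers),
        "homozygous_risk": counts[2],
        "heterozygous": counts[1],
        "carrier_fraction": len(carriers) / total if total else 0.0,
    }

# Hypothetical genotypes at a single variant site (illustrative only).
cohort = {"P001": "TT", "P002": "CT", "P003": "CC", "P004": "TT", "P005": "TC"}
summary = screen_risk_allele(cohort, risk_allele="C")
print(summary)
```

The same summary could be computed per variant across many sites, which is where high-throughput automation becomes useful.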

Other significant molecular stratifiers of COVID-19 patients include immune markers, the metabolome (e.g., the liver enzyme alanine aminotransferase), cell surface receptors, and genes responsible for comorbidities such as cardiac diseases, diabetes and so on (Jiang et al. 2020). IL-6 is one such molecular marker: it has been found to increase significantly with disease severity among COVID-19 infected cohorts and is also strongly associated with the need for mechanical ventilation. Increasing levels of this marker also predicted respiratory failure with high accuracy (Herold et al. 2020). Figure 1 depicts the levels of IL-6 in severely affected patients as compared with non-severe cases among COVID-19 infected patients investigated from 9 studies.

Fig. 1

Relationship between IL-6 and COVID-19 disease severity. IL-6 levels are reported to be higher in severe COVID-19 infections than in non-severe ones in the studies cited below. Figure 1 is a compiled reanalysis of IL-6 data from 7 studies, which consistently showed higher levels of IL-6 in severe cases than in non-severe ones (Chen et al. 2020; Gao et al. 2020; Liu et al. 2020a, b; Yang et al. 2020; Qin et al. 2020; Fu et al. 2020; Thevarajan et al. 2020; Wu et al. 2020; Wang et al. 2020)

In addition to clinical and molecular biomarkers, researchers at the Massachusetts Institute of Technology (MIT) have used AI to recognize the cough patterns of asymptomatic COVID-19 patients. The study used thousands of cough recordings to develop an AI model that can discriminate the cough of an asymptomatic infected individual from that of an uninfected individual. If approved by the FDA, this AI-based model could be an efficient tool for separating asymptomatic patients from healthy individuals, thereby helping to reduce viral spread and infections (MIT News 2020).

Patient surveillance and tracking, virtual healthcare assistance and high-speed data capture and retrieval

Migration and community gatherings are major causes of the rapid multiplication of pandemic infections. Since SARS-CoV-2 can incubate asymptomatically for ~7–14 days inside the human body, early isolation before clinical presentation has been challenging. In such cases, AI is useful for monitoring migration in at-risk or affected populations so as to enable efficient tracking. AI-enabled algorithms can also accurately predict an individual's risk of infection given sufficient information on family history, lifestyle, social and travel history, etc. AI models are also trained as information bots that deliver reliable general information about infection, speed of transmission, recovery rates, etc., raising general awareness (Panch et al. 2019; Kuziemsky et al. 2019).

Multiple AI-based data dashboards are also functional and beneficial in tracking infected cases, recoveries and deaths across the globe. They also display interactive maps for better visualization. Some of such platforms include Johns Hopkins’ CSSE and Microsoft Bing’s COVID-19 tracker (Naudé 2020).

Computer based algorithms are greatly helpful in these situations accommodating not only immediate healthcare needs but also catering to general information, live tracking, disease screening, etc., for future disease-based interventions.

AI-enabled multiomics in disease severity, clinical trial recruitment and drug development studies

Age has been considered a major risk factor for COVID-19 infection based on worldwide statistics. A model-based investigation of case fatality ratio by age group estimated a ratio of 0.32% in individuals younger than 60 years versus 6.4% in those aged 60 years and older (Verity et al. 2020). However, infection and disease severity ratios may not necessarily follow age relationships, as infections have been recorded in the younger population as well (Verity et al. 2020).
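For orientation, a crude case fatality ratio is simply the number of deaths divided by the number of confirmed cases. The short Python sketch below applies that arithmetic to the global counts quoted in the Introduction; the function name `crude_ratio` is our own. Note that this is a crude ratio, not the model-based, age-stratified estimate of Verity et al., which adjusts for factors such as reporting delays and under-ascertainment.

```python
def crude_ratio(events, confirmed_cases):
    """Return events / confirmed_cases expressed as a percentage."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * events / confirmed_cases

# Global counts quoted in the Introduction (as of manuscript submission).
confirmed = 1_119_109
deaths = 58_955
recovered = 226_873

crude_cfr = crude_ratio(deaths, confirmed)
crude_recovery = crude_ratio(recovered, confirmed)
print(f"Crude case fatality ratio: {crude_cfr:.2f}%")
print(f"Crude recovery ratio: {crude_recovery:.2f}%")
```

Because many cases were still unresolved at the time these counts were taken, both ratios understate the eventual outcomes and should be read as snapshots only.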

Recently, an AI-based prediction of COVID-19 severity, developed by the NYU Grossman School of Medicine and the Courant Institute of Mathematical Sciences at New York University in partnership with Wenzhou Central Hospital and Cangnan People’s Hospital, both in Wenzhou, China, was able to predict the need for patient respiratory support based on molecular factors. The study relied on three major features, namely liver enzyme alanine aminotransferase (ALT) levels, myalgia and hemoglobin levels, and reached an accuracy of up to 80% (ScienceDaily 2020).

Similarly, another study by Tan et al. identified significantly lower lymphocyte percentages in patients with severe and fatal disease using a time-lymphocyte percentage model (TLM) (Tan et al. 2020). The study also found that patients who recovered from infection showed an increasing trend in lymphocyte percentage, making it a useful predictor of disease severity.
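The idea of reading a trend from serial lymphocyte percentages can be sketched in a few lines of Python. The example below fits an ordinary least-squares slope to a patient's measurements and labels the series as increasing, decreasing or flat. It is not the TLM of Tan et al.; the series values and the `flat_band` tolerance are hypothetical and have no clinical validation.

```python
def least_squares_slope(days, values):
    """Ordinary least-squares slope of values against days."""
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("days must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    return sxy / sxx

def lymphocyte_trend(days, lym_percent, flat_band=0.1):
    """Label a series as 'increasing', 'decreasing' or 'flat'.

    flat_band (percentage points per day) is an arbitrary illustrative
    tolerance, not a clinically validated cutoff.
    """
    slope = least_squares_slope(days, lym_percent)
    if slope > flat_band:
        return "increasing"
    if slope < -flat_band:
        return "decreasing"
    return "flat"

# Hypothetical series: days since symptom onset and lymphocyte percentage.
days = [3, 7, 11, 15, 19]
recovering = [12.0, 14.5, 18.0, 22.5, 26.0]
worsening = [15.0, 11.0, 8.0, 5.5, 4.0]
print(lymphocyte_trend(days, recovering))
print(lymphocyte_trend(days, worsening))
```

A slope is used here instead of a single reading so that one noisy measurement does not flip the label.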

Mortality also hits some patient cohorts who present extremely elevated cytokine profiles. From a retrospective study in China, it has been identified that some infected patients presented excessively high levels of ferritin and IL-6 regarded as “cytokine storm syndrome”, being interpreted as a cause of mortality. Virus induced hyperinflammation resulting in increased levels of interleukin-2 (IL-2), interleukin-7 (IL-7), granulocyte colony stimulating factor, interferon-γ-inducible protein-10, macrophage inflammatory protein (MIP)-1alpha and tumour necrosis factor alpha could all be responsible for elevating patient fatality. Application of AI-enabled multiomics analysis in these areas could accurately predict the need for immunosuppressive drugs which could otherwise be life-threatening in viral infections (Mehta et al. 2020). A compiled representation of multiple molecular markers from over 450 individuals accumulated and reanalyzed from different studies and case reports on COVID-19 is depicted in Fig. 2 (Huang et al. 2020; Broman et al. 2020; Hashemian et al. 2020). This shows the contribution of molecular representatives in determining potential candidates for clinical trial and/or drug development investigations.

Fig. 2

Multi-biomarker profiling of severe and non-severe COVID-19 patients. Figure represents the levels of multiple biomarkers as observed in severe and non-severe COVID-19 patients. Cells mediating immune reactions including stimulatory T cells, helper T cells, NK cells, T cells and B cells are higher in non-severe cases indicating events of immune reactions. On the other hand, biomarkers such as C-reactive proteins, serum ferritin, neutrophils are all higher in severe COVID-19 patients. These individual biomarkers alone may not be conclusive of the disease, but their initial screening may enable better understanding of the disease severity for efficient patient stratification

In addition to patient stratification and clinical trial recruitment, AI-based drug prediction models are evolving rapidly. With the growing availability of patient data and viral structural/sequence information, it is feasible to train AI algorithms to predict the efficacy of specific drugs or drug classes. For instance, an AI-based model developed by researchers in Korea and the United States has been used to predict targeted drugs that inhibit the replication of 2019-nCoV. Because 3 dimensional (3D) protein structures were unavailable, the model takes 1 dimensional (1D) strings of input sequences and predicts the inhibitory potencies of antiretroviral drugs like atazanavir, efavirenz, ritonavir and dolutegravir in terms of Ki, Kd and IC50 values. Molecular docking studies have also been used to investigate the accuracy of the predictions in more depth (Beck et al. 2020). Due to their flexibility, AI-enabled models show great potential in several drug development studies.
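To make the notion of a 1D string input concrete, the Python sketch below turns sequences into fixed-length lists of integer IDs, a common first step before feeding text-like data to a neural network. This character-level encoding is an assumption chosen for simplicity and is not necessarily the tokenization used by Beck et al.; the fragments are made-up strings, not real viral sequences.

```python
def build_vocabulary(sequences):
    """Map every character seen in `sequences` to a positive integer ID.

    ID 0 is reserved for padding.
    """
    symbols = sorted({ch for seq in sequences for ch in seq})
    return {ch: idx for idx, ch in enumerate(symbols, start=1)}

def encode(sequence, vocabulary, max_length):
    """Encode a string as a fixed-length list of integer IDs (0 = padding)."""
    # Truncate to max_length, then right-pad with zeros.
    ids = [vocabulary[ch] for ch in sequence[:max_length]]
    return ids + [0] * (max_length - len(ids))

# Hypothetical short amino-acid fragments (not real viral sequences).
fragments = ["MKTAYIAK", "MKLV", "GAVLK"]
vocab = build_vocabulary(fragments)
encoded = [encode(seq, vocab, max_length=8) for seq in fragments]
for seq, ids in zip(fragments, encoded):
    print(seq, ids)
```

Padding to a common length lets sequences of different sizes be batched together; characters absent from the vocabulary would raise a `KeyError` in this sketch and would need explicit handling in practice.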

AI-enabled high throughput prediction of clinical images for early signs of pneumonia

Pneumonia is a major cause of severity and death in COVID-19 infections. However, even expert radiologists are overwhelmed with huge radiomic data with increasing numbers of CT scans and chest X-rays of affected individuals. Thus, regular monitoring of images for early diagnosis is a challenge due to continuously increasing burden of infections.

UC San Diego Health has developed an AI-assisted model enabled by Amazon Web Services (AWS) to accurately predict, monitor and diagnose pneumonia with patient CT scans and chest X-ray images. This model has provided experts with new insights in about 2000 clinical images. Interestingly it has also predicted a patient’s risk of developing pneumonia with early signs in chest X-ray, which would have been missed otherwise (ScienceDaily 2020).

Several other AI algorithms are being investigated for image analytics in COVID-19 patients. These algorithms are usually trained with equal/sufficient data under 3 major categories including chest X-rays from healthy individuals, pneumonia and COVID-19 affected patients. Training is usually focused to maintain consistent prediction accuracy even with low quality data with appropriate preprocessing. Radiomic features are then extracted, balanced and trained under various data classifiers such as Random Forest (RF), Decision Trees (DT), etc. Such models have established a maximum accuracy of 0.973 using RF (Kumar et al. 2020).
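The three-class training setup described above can be sketched with a small decision tree written in plain Python. This is a single tree grown with the Gini criterion, not a Random Forest (which would aggregate many such trees trained on resampled data), and it is not the pipeline of Kumar et al. The two feature columns and all values are hypothetical stand-ins for extracted radiomic features.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) minimizing weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively grow a tree; leaves hold the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical features per image: [mean_intensity, texture_entropy].
train_rows = [
    [0.20, 1.1], [0.25, 1.3], [0.22, 1.0],   # healthy
    [0.55, 2.1], [0.60, 2.4], [0.58, 2.0],   # pneumonia
    [0.50, 3.4], [0.62, 3.8], [0.57, 3.1],   # COVID-19
]
train_labels = ["healthy"] * 3 + ["pneumonia"] * 3 + ["covid19"] * 3
tree = build_tree(train_rows, train_labels)

test_rows = [[0.21, 1.2], [0.59, 2.2], [0.56, 3.5]]
test_labels = ["healthy", "pneumonia", "covid19"]
predictions = [predict(tree, r) for r in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

The toy classes are deliberately well separated, so the perfect score here says nothing about real chest X-ray data; the equal class sizes mirror the balancing step mentioned in the text.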

AI-enabled multiomics studies, apart from the early prediction of pneumonia, can also be utilized to extract various other useful information for treatment surveillance and monitoring, which includes pulmonary consolidation, disease resolution and fibrosis from chest X-rays and CT scans. This information at a higher accuracy and faster pace can assist timely stratification and resource allocation (Castiglioni et al. 2020).

A compiled representation of chest CT findings from severe and non-severe COVID-19 patients is depicted in Fig. 3. The figure shows a significantly higher prevalence of ground glass opacities, lung consolidation, fibrosis and air bronchogram in severe COVID-19 patients as compared with their non-severe counterparts. Figure 3 also shows that the prevalence of these findings increased during disease progression and dropped during the recovery phase (Yun et al. 2021).

Fig. 3

A retrospective investigation of chest CT findings to assist in stratifying COVID-19 patients into severe and non-severe disease patterns. The chest CT findings presented in the figure show two key results. First, the prevalence of conditions like ground glass opacities, lung consolidation, fibrosis and air bronchogram among COVID-19 patients was higher during disease progression and dropped during the recovery phase of the disease. Second, all of these parameters (ground glass opacities, lung consolidation, fibrosis and air bronchogram) were more prevalent in patients with a severe disease pattern than in those with a non-severe disease pattern. On this evidence, initial screening of COVID-19 patients using chest CT could be a beneficial strategy for patient stratification

Though developers call for further investigation of AI-assisted prediction models, efforts like these are much needed for high-speed clinical intervention before serious health effects set in.

AI-enabled COVID-19 vaccine investigation

The predictive strength of AI has also shown potential as an aid in vaccine development for COVID-19, and many tools are being developed to support laboratory efforts. “AlphaFold”, a 3D protein structure prediction tool developed by Google DeepMind, has predicted the structure of SARS-CoV-2 components, including the spike, enabling scientists to better understand receptor binding and cellular entry. A researcher from the University of Texas at Austin has also created a 3D atomic-scale map of the spike protein. Many other efforts are underway in this field to assist in the development of antiviral components and vaccines (Bullock et al. 2020).

Another notable contribution comes from Inovio Pharmaceuticals, which generated a DNA-based vaccine called INO-4800. The technique used viral genetic sequences to train a machine learning algorithm (Thanh et al. 2020). A similar approach by Moderna Therapeutics generated a modified messenger RNA (mRNA)-based vaccine called mRNA-1273 (Thanh et al. 2020), which has been approved in Switzerland and authorized for emergency use in the U.S., U.K., E.U. and elsewhere (The New York Times 2021). Several other vaccines, such as Comirnaty (Pfizer-BioNTech) and AZD1222 (an Oxford-AstraZeneca effort), which are in phase 2 and 3 clinical trials, have been approved for use in several countries (information dated February 2021), as listed in Table 3 (The New York Times 2021). These approved vaccines were developed using a range of strategies such as mRNA, protein, inactivated viruses, etc. (The New York Times 2021). Many of these vaccines used AI-enabled systems for initial screening, speeding up laboratory investigations and clinical trials and thus delivering solutions faster than expected.

Table 3 Table presents a list of COVID-19 vaccines, method of development and clinical approval status

During epidemics or pandemics, viruses usually pass through multiple evolutionary stages in their genetic material, giving rise to newer strains. Sometimes these new strains may be even more infectious than the original ones (Johns Hopkins University 2020a, b). Such strains are a common pattern in any viral or bacterial outbreak before the invading organisms reach a state of non-infectiousness (Johns Hopkins University 2020a, b). Similarly, the 2019 pandemic also gave rise to newer strains of SARS-CoV-2 detected in many parts of the world (Johns Hopkins University 2020a, b). A particular strain called B.1.1.7 was detected in southeastern England in September 2020 and soon became common in the United Kingdom, accounting for more than 60% of new COVID-19 cases in December. Several other variants were also detected in areas including South Africa, Brazil, California and others. Unfortunately, several of these strains also alter the composition of the spike protein, making it “stickier” (Johns Hopkins University 2020a, b).

More evidence is still required to understand the severity of the disease caused by these new strains and, most importantly, the efficacy of the current vaccines against them (Johns Hopkins University 2020a, b). Many vaccine developers claim that some of the existing vaccines (e.g., mRNA vaccines) could still offer protection against the changing virus and its spike protein, but this remains to be confirmed (Johns Hopkins University 2020a, b).

Conclusions and future implications: AI-enabled multilayer data analysis to assist COVID-19 and other global pandemics in the future

Some significant molecular markers and AI-enabled patient tracking, screening and disease detection approaches were discussed in the study. Each of these markers individually and together contribute to efficient patient stratification, prediction of disease severity and therapeutic interventions of COVID-19 as discussed in various sections of the article. With the use of integrated AI-enabled multilayer models, it could be possible to generate high-throughput outcomes from various data sources like clinical, pathological, DNA, RNA, proteome, metabolome, etc. With large data being accumulated over time in different COVID-19 patients and conditions, effective set of training instructions can be delivered to render more accurate predictions of unknown future events (Grapov et al. 2018). Since AI-assisted models function with minimal manual interventions, handling large cohorts under stressful circumstances becomes highly feasible (Davenport and Kalakota 2019; Rong et al. 2020; Azuaje 2019). Additionally, AI-enabled multilayer models can relieve biologists of complex data interpretations with the use of simpler visualizations for effective medical decisions (Chakraborty et al. 2017; Chin-Yee and Upshur 2019).

We at iNDX.Ai are keenly endeavoring to investigate AI-enabled screening and prediction methods for COVID-19 infections, covering patient stratification, image analytics and drug discovery, using our AI platforms iCE and iCore.

We emphasize that several such multiomics efforts are collectively required to overcome the challenges of COVID-19 in treatment, long-term prevention, prognosis and response prediction, risk analysis, surveillance, etc., which are also presented in Fig. 4. This article thus highlights the capabilities of molecular biomarkers and AI-enabled multilayer models in COVID-19 pandemic interventions.

Fig. 4

Applications of multiomics in COVID-19 investigation. The figure depicts various applications of multiomics strategies (e.g., genomics, transcriptomics) such as disease susceptibility/risk prediction, disease diagnosis and prognosis, drug response and monitoring. Thus, such strategies could collectively render assistance in the overwhelming healthcare needs during pandemics and beyond