Nearly four score years ago, the advent of ENIAC provided one of the first examples of an electronic computational device that could be adapted (reprogrammed) to different tasks with human guidance. Shortly thereafter, the commercial development of integrated circuits ushered in the modern era of digital computing, substantiating a promise of “bigger is better” and setting off the race of seemingly exponential growth in microprocessor complexity forecast by Moore’s Law. The world now bears witness to an abundance of computational power every time a modern smartphone is awakened by a glance, an act that taps into only a minute fraction of the power the device possesses.

This wealth of excess computational power facilitated the modern realization of artificial intelligence (AI). The first signs of AI in the public domain existed long before ChatGPT dominated news headlines; over the past decade we have come a long way from the earlier, simpler iterations of chatbots, both online and via telephone trees. Voice recognition itself represents the culmination of iterative honing of algorithms in the development lab, and many systems still require “training sessions” with the end user to further improve the quality of output. At first, chatbots coupled with voice recognition seemed little different from replacing touch-tone input by finger with a vocal interface. Over time, however, the phone tree algorithms became able to offer options and answers, limited only by the complexity of the discrete programming invested by the customer service agency. Perhaps newer AI applications can satisfy customers more frequently, alleviating the need to “speak to an agent.”

In this issue of LUNG, Lew and colleagues provide a focused appraisal of how AI has been applied to study sarcoidosis [1]. As reflected in Fig. 1, AI represents the overarching goal of computational assistance, with machine learning (ML), deep learning (DL), and neural networks (NN) forming cascading subsets that reflect technological evolution, increasing depth of analytic capability, and decreasing dependence on human supervision.

As outlined in the featured manuscript [1], most published studies of AI in sarcoidosis represent ML, often evaluating how well an algorithm renders a diagnosis of sarcoidosis after training on a discrete data set (such as an imaging modality or functional testing). In many cases ML is a one-directional, feed-forward analysis, heavily dependent on human supervision to provide data sets, to define key characteristics of the disease (i.e., sarcoidosis experts), and to implement refinements to algorithms. Although the media have only recently been abuzz about AI, ML techniques such as cluster analysis have long been tools for clinical and scientific investigation [2, 3]. Using image processing as an example, more recent applications of DL employ NNs integrating specialized layers of analysis (one layer identifies edges/outlines, a second identifies different densities, and a third pools this information to identify target objects) that extract key defining characteristics to improve case identification. In its earliest implementations, the NN could be conceptualized as a refinement of ML mimicking the (still largely one-way) image processing performed by the brain (image → target identification), now translated into what is found in consumer electronics. The more recent development of recurrent NNs (RNNs) promises the ability of a machine to automatically re-process results from a first-pass analysis and adjust the performance of its algorithms without human intervention; however, the amount of computing power needed to support RNNs limits their widespread implementation.
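For readers less familiar with this layered design, the following minimal sketch (in Python with PyTorch) shows how convolutional layers and pooling stack into a simple feed-forward image classifier. The layer sizes, single-channel input, and two-class output are illustrative assumptions, not values drawn from the cited studies:

```python
import torch
import torch.nn as nn

# Minimal illustrative sketch of the layered analysis described above:
# early convolutions respond to edges/outlines, later ones to densities,
# and pooling aggregates these features ahead of a final classifier.
# All sizes are placeholders, not values from any sarcoidosis study.
class TinyImagingNet(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # layer 1: edges/outlines
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # layer 2: densities/texture
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                     # layer 3: pooling
        )
        self.classifier = nn.Linear(16, n_classes)       # e.g., sarcoidosis vs. not

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))

# One grayscale 128x128 image in, two class logits out
logits = TinyImagingNet()(torch.randn(1, 1, 128, 128))
```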

In this fashion, radiomics can be considered a focused application of DL integrating imaging data with other medical data to study pulmonary sarcoidosis. The precision of radiographic interpretation can be enhanced by RNN evaluation of features “not visible to the naked eye,” and diagnostic interpretation improved by including multi-modal data such as pulmonary function studies and blood tests. The number of features evaluated would be limited by the granularity of data gathered by the imaging study and by the available computing power. Recent applications of DL distinguish sarcoidosis from a limited set of alternatives, such as normal lung, tuberculosis, or simply “not sarcoidosis.” Future efforts will require richer data sets to train systems to distinguish sarcoidosis from a wider scope of lung diseases or from a variety of systemic illnesses. Nevertheless, radiomics applied to sarcoidosis promises to improve diagnostic accuracy in distinguishing pulmonary sarcoidosis from other lung diseases and to avoid the need for biopsy in higher-risk, medically complex patients.
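The multi-modal integration described here can be sketched simply: imaging-derived features are concatenated with tabular clinical values (e.g., pulmonary function results, blood tests) before classification. The sketch below (Python/PyTorch) is a hypothetical illustration; the feature counts and two-class output are assumptions for demonstration only:

```python
import torch
import torch.nn as nn

# Hedged sketch of multi-modal fusion: image features are pooled, flattened,
# and concatenated with tabular clinical data before a final classifier.
# Feature sizes are arbitrary placeholders, not from any published model.
class MultiModalNet(nn.Module):
    def __init__(self, img_feats: int = 16, tab_feats: int = 6, n_classes: int = 2):
        super().__init__()
        self.img_branch = nn.Sequential(
            nn.Conv2d(1, img_feats, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(img_feats + tab_feats, n_classes)

    def forward(self, image, tabular):
        fused = torch.cat([self.img_branch(image), tabular], dim=1)
        return self.head(fused)

# One image plus six tabular values (e.g., hypothetical PFT/blood measures)
logits = MultiModalNet()(torch.randn(1, 1, 128, 128), torch.randn(1, 6))
```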

While strategies implementing RNNs could iteratively refine DL algorithms with limited-size data sets, there are a few important considerations. The precision of an AI application will depend on the complexity of the code (e.g., the number of layers and the use of iterative RNNs) and on the size and quality of the data set. Approaches must be developed to account for the clinical heterogeneity of this disease yet limit the risk of overfitting, as voiced by the authors [4]. Other variables associated with sarcoidosis, including age, gender, socioeconomic factors, and geographic locality, may limit the generalizability of a single AI model. It is uncertain whether a single clinical center could amass a sarcoidosis data set of a size and granularity akin to what MIMIC-III has provided for the study of sepsis and critical care medicine. Conceivably, the development of a consensus data set could enable a crowdsourcing approach to data collection [5].
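One common guard against the overfitting risk noted above is to estimate performance on held-out folds rather than on the training data. The following sketch (Python with scikit-learn) illustrates the idea, using synthetic data as a stand-in for a real sarcoidosis cohort:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Illustrative only: with a small cohort, k-fold cross-validation yields a
# less optimistic performance estimate than training-set accuracy. The
# synthetic data below stands in for a real (and scarce) sarcoidosis cohort.
X, y = make_classification(n_samples=120, n_features=30, random_state=0)
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)  # accuracy on 5 held-out folds
print(f"cross-validated accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```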

Currently, most clinical data sets represent individual snapshots of a patient’s condition, often limited by the data available from a single time point. For AI to forecast outcome, there must be a deliberate effort to assemble longitudinal databases. There is growing recognition that the patient experience is not completely quantified by medical evaluation alone, and the proper integration of health-related quality of life must be determined [6]. Taken one step further, the ability to forecast outcome (i.e., identify endotypes) will require a rich data set that includes both serial evaluation (snapshots over time) and analytes not commonly gathered through routine medical care, such as biomarker measurements or gene expression analysis. By implementing AI in a different context, rich data sets can be mined to identify new biomarkers that are not apparent from efforts to date [7]. The potential danger of larger, richer data sets would be the identification of patient classifications (phenotypes, endotypes) that are not easily understood by humans, which may limit their wide acceptance.
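Phenotype or endotype discovery of this kind is typically unsupervised. As a purely hypothetical illustration (Python with scikit-learn; random numbers stand in for serial labs or biomarkers, and the choice of three clusters is arbitrary), such mining might look like:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Sketch of unsupervised phenotype discovery: standardized patient features
# (random placeholders here for serial labs/biomarkers) are grouped into
# candidate clusters. Whether such clusters are clinically interpretable is
# exactly the open question raised in the text.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 12))        # 200 patients x 12 analytes
scaled = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)
print(np.bincount(labels))                   # patients per candidate endotype
```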

Although a diagnosis of sarcoidosis is often defined by an appropriate clinical history, an appropriate biopsy, and an appropriate response to therapy, in many cases the diagnosis rests on expert interpretation [8]; how can this be encapsulated in an algorithm? These factors may introduce unintentional bias through data selection, which in turn affects how users (providers, patients) embrace the answers offered by AI models. Given the limited availability of sarcoidosis data sets for AI training, let alone AI validation, the responsibility will likely fall on humans to provide the clinical reasoning and inference necessary for validating AI applications for sarcoidosis. While humans will always lack the capacity to mine large volumes of data, accepting a hybrid human-AI approach to finding answers to sarcoidosis may promise the best of both worlds, exceeding the capacity of either alone [9].