Main

Computation has profoundly shaped the way we approach sensing. In the realm of biosensing, for example, signals are acquired—often at high cost—with various sources of noise, including the stochastic behaviour of molecular interactions, imperfections in fabrication, chemical and/or optical signal transduction mechanisms and human variation in sample handling, as well as physiological differences and natural variations inherent in large test populations1. With such noisy and sparse sensing landscapes, computational methods have evolved to help us garner meaningful information from raw sensor data. Naturally, as Moore’s law has progressed and computation has become more powerful, cheaper and more widely accessible, it has in many ways handled an ever-larger share of the noise burden when compared with the sensing hardware itself. For example, support vector machine-based algorithms have increasingly been employed for sensing-related analyses such as material characterization, hyperspectral geological and environmental mapping, and cross-reactive sensor arrays forming (for example) an ‘electronic/optoelectronic nose’ for the identification of trace amounts of explosives and toxins, as well as for diagnostics and genomics applications including pattern recognition of biological pathways for disease prediction1,2,3,4. More recently, deep learning and neural networks have shown immense promise in signal analysis, beyond the capabilities of traditional machine learning approaches, inferring complex nonlinear patterns from high-dimensional data5,6,7,8,9. Furthermore, neural networks provide a major advantage in computational prediction speed compared with other traditional signal recovery approaches based on, for example, compressive sampling and iterative signal reconstruction, and can be readily integrated into common processors on mobile phones and tablet PCs, paving the way for cost-effective yet powerful mobile sensing and diagnostic systems10,11,12.

Despite this progress in the field of sensing at large, there is an important opportunity that has not yet been extensively explored: computation and machine learning methods can fundamentally change the hardware designs of traditional sensors and can be used to holistically design intelligent sensor systems. In such designs, computation and statistical learning tools must be utilized in the design phase of the sensing instrument, taking into account various inherent sources of noise and variations in the signal generation and decoding schemes that are employed. However, in many current examples of sensing systems, this is not the case: the sensing framework is typically designed on the basis of a ‘sequential’ merger of the hardware output and the computational analysis applied to this output, following signal acquisition with traditional sensor hardware. This is a natural result of the analogue-to-digital transition, where existing sensor designs have later been empowered by computational analysis.

In this Perspective, we specifically focus on an emerging opportunity, namely, computational sensing systems that merge computation and machine learning-based statistical analysis of signals as a fundamental part of their hardware design to optimize sensing performance. An overview of this broad concept is presented in Fig. 1, illustrating an iterative workflow that begins with an initial design and creates a computational sensor based on the full integration of computational processing, feature optimization and statistical learning at the hardware level. We generally refer to this paradigm as machine learning-inspired instrument design, and from the specific point of view of this Perspective, machine learning-inspired sensor design (Fig. 1). This computational framework proposes answers to the following question: if today’s powerful computational resources and algorithms had existed before a well-known traditional sensor system or instrument was designed, how would the sensing system be fundamentally changed and improved? One could ask the same question for imaging systems and microscopes, for example, the answers to which fall under the broad category of the computational imaging field. There are various examples from the computational imaging field that highlight this emerging opportunity to use computational techniques and statistical learning for designing intelligent imaging systems13,14,15; however, here we will specifically focus on how machine learning-inspired sensors can manifest a transformation in the design and operation principles of next-generation intelligent sensors. We anticipate that these new computational sensors enabled by machine learning will foster a plethora of new applications by enabling unique sensing capabilities in different areas including environmental monitoring, medical diagnostics, the internet of things (IoT), autonomous vehicles and security.

Fig. 1: Overview of machine learning-enabled intelligent sensor design.

An initial design, located in the centre of the diagram, is produced through standard engineering practices or given a randomized initialization in terms of its transduction elements or sensing components (that is, features), which are denoted here by various shapes and colours. (1) The machine learning-enabled intelligent sensor design workflow begins with the acquisition of the sensing data, illustrated as a multiplexed ensemble of signals responding to the measurand(s). (2) These data, Xi, are then used to train a learning model outputting sensing results as predictions, y′i, illustrated here by a diagram of circular nodes and interconnections. (3) A cost function J(yi, y′i) is used to evaluate the learned model, using the ground-truth sensing information, yi, with the added purpose of scrutinizing the ensemble of transduction elements or features. (4) The sensing hardware is redesigned, symbolized here by the cog icons, on the basis of the statistical analysis in (3), eliminating non-informative or less useful features and/or replacing such features with adjustments thereof or alternative sensing elements. Here, the subscript i = 0, …, N is used to denote various iterations of this design workflow, as it can be repeated to improve sensor performance in terms of a user-defined cost function (lower left), concluding with a final machine learning-enabled intelligent sensor design (upper left).

In the next section, we begin with an overview of emerging computational sensing platforms with a discussion of some of the recent examples found in the literature. Following that, we will discuss machine learning-enabled sensor system design as a methodology with potential applications in point-of-care sensing and genomics, as well as in hyperspectral image sensors.

Overview of emerging computational sensing platforms

Recent advances in nano-engineering, three-dimensional (3D) fabrication and manufacturing methods, as well as flexible and compliant electronics, have led to various new sensors and transduction elements that serve as exciting testbeds for computational sensing design. For these emerging sensing platforms, computational tools and algorithms form a vital part of their functionality, enabling these platforms to realize meaningful performance advantages over previous generations of sensor technologies in terms of form-factor, cost and resolution/sensitivity, for example. One powerful mathematical framework that is often leveraged in such computational systems is compressive sensing16,17,18. In compressive sensing, the goal is to encode a given signal, x, with an a priori defined sparse sampling operator or encoder, θ. By solving an underdetermined linear equation representing this sparse sampling/sensing operation, that is, y = θx, the original x can be reconstructed from a smaller number of measurements (compared with what is dictated by the Nyquist sampling theorem) assuming that the signal can be represented as a sparse vector in some mathematical domain, such as the wavelet domain16.
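
As an illustrative, minimal sketch of this recovery principle, the following Python snippet reconstructs a synthetic sparse signal from compressed measurements using iterative soft thresholding (ISTA), one of the simplest L1-regularized solvers. The signal length, measurement count and sparsity level are arbitrary demonstration choices, and the signal is assumed sparse in the measurement basis itself for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 256, 64                    # signal length, number of measurements (m << n)
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = rng.normal(size=8)  # 8-sparse signal

theta = rng.normal(size=(m, n)) / np.sqrt(m)  # random sampling operator
y = theta @ x_true                            # compressed measurements

def ista(y, theta, lam=0.01, n_iter=500):
    """Iterative soft thresholding for min_x 0.5||y - theta x||^2 + lam ||x||_1."""
    x = np.zeros(theta.shape[1])
    step = 1.0 / np.linalg.norm(theta, 2) ** 2  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = theta.T @ (theta @ x - y)
        z = x - step * grad
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

x_hat = ista(y, theta)
print("relative reconstruction error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```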

The field of spectral sensing provides a rich set of examples of how this compressive signal recovery framework can transform traditional grating and line-scan CCD (charge-coupled device)-based spectrometer designs into much more compact computational spectroscopy tools. For example, a recent study demonstrated computational spectral measurements with a single nanowire (Fig. 2a)19. Similar computational spectroscopic sensing systems have also been realized with a wide range of spectral encoding elements, including quantum dots, Fabry–Perot cavities, liquid crystal displays and micro-electromechanical systems, among others20,21,22,23,24,25,26,27,28. In one implementation, tiled nanostructures were utilized as distinct engineered spectral filters, encoding the spectrum of the incident light, which was then reconstructed using compressive sensing-based algorithms (Fig. 2b; ref. 29). A similar spectral encoding/decoding strategy through tiled dielectric metasurfaces has also been used for ultrasensitive and compact biosensing of molecules (Fig. 2c; ref. 30). In addition to optical/photonic implementations, many computation-enabled sensing systems have been reported in other fields31,32,33,34,35, such as the recent demonstrations of vibration and motion detection, as highlighted in Fig. 3a.

Fig. 2: Emerging examples of computation-enabled hardware for optical sensing and spectroscopy.

a, A compositionally engineered nanowire for ultra-compact computational spectroscopy. A real-colour PL image with the corresponding spectra is shown (top left); scale bar, 20 µm. A fluorescence micrograph is also shown (bottom left); scale bar, 10 μm. In this system, spectral information is obtained through probing the photocurrent (denoted by I1, I2, …, In) across an array of nodes that define segments along the nanowire of varying semiconductor alloy compositions (top right), with each segment creating a distinct absorption spectrum. A compressive sensing algorithm using an L2-norm regularization term is then implemented to reconstruct the target spectrum (bottom right), denoted by F(λ), from the measured photocurrents using the known spectral response functions denoted by R(λ), where λ represents the wavelength within the spectral sensing range. PL, photoluminescence; ADC, analogue-to-digital converter. b, An on-chip spectral encoder for compact single-shot spectroscopy based on tiled photonic crystal (PC) slabs (top). The unknown spectrum of the incident light is reconstructed through the use of an L2-norm-based regularization method, using the spectrally encoded information measured through each photonic crystal slab with a complementary metal-oxide semiconductor (CMOS) image sensor. SEM, scanning electron microscope. c, Ultrasensitive and compact biosensing with tiled dielectric metasurfaces and a bar-code-based decoding scheme (left). The refractive index sensitivity is shown as a demonstration of the computation-enabled sensing scheme (right). Figure adapted with permission from: a, ref. 19, AAAS; b, ref. 29, Springer Nature Ltd; c, ref. 30, Springer Nature Ltd.

Fig. 3: Emerging examples of computation-enabled distributed sensing platforms.

a, A randomized, resonant metamaterial for the identification and localization of vibrations. Each node has a random effective mass (meff) and resonant frequency and is connected to other nodes by springs (k0) and dampers (c0), comprising a single randomized coupling network (RCN) (top left). When six different RCNs are assembled, the resulting system (top right) is used to computationally identify the location of vibrations with a single sensor, where colour denotes the randomized resonant frequencies of the various nodes. Vibrational modes of the combined RCN system are shown, excited at various terminals (black dots) at 600 Hz. b, Smart triboelectric flooring system for user recognition, made from a polyethylene terephthalate (PET) film friction layer, a silver electrode layer and a polyvinyl chloride (PVC) base layer. Each floor tile contains a randomized binary electrode design with differing fill factors (middle). Using two voltage readouts (V1 and V2), a neural network is able to identify individuals by their gaits, with the confusion matrix from 10 independent users labelled U1–U10 shown in the bottom right, reporting an accuracy of 96%. In parallel, a microcontroller unit (MCU) is used for analysing position-sensing data. c, Activity recognition using a magnetic induction-based wearable sensor network of transceivers (Txi, RX) (inset, left). Location markers on the body, denoted by Mi, define the torso and limb segments and are used to calculate the position of the transceiver coils. A recurrent neural network (RNN) uses the transceiver signals to classify user activity, with xt denoting the normalized input data at various time points defined by a sliding window of 1 s (that is, T = 1 s). The average prediction scores, denoted by ŷt, are output by the RNN before being converted into an average class probability denoted by Ôt. Figure adapted with permission from: a, ref. 31, Springer Nature Ltd; b, ref. 36, Springer Nature Ltd; c, ref. 37, Springer Nature Ltd.

However, it is important to note that these earlier examples of computational sensing testbeds were limited by their inability to learn and properly take into account statistical features of their input signals. Therefore, other approaches based on statistical learning (and data-centric training) must be invoked to advance the capabilities of these computational sensing platforms31,32,33. Such learning frameworks rely on statistically large sets of sensing data, properly matched to a verified target dataset acquired with, for example, gold standard systems or known outputs. A mathematical cost function, denoted J(y,y′) in Fig. 1, is then defined and evaluated during the training process on the basis of the difference between the verified target (gold standard, y) and the algorithmic output (y′) of the sensor. Such data-centric training methods are especially well suited for applications involving wearable sensing or engineered macroscopic environments, for example, as the sensing hardware is often first initialized as a distributed network before real-world interaction can elucidate the statistical mapping between the raw acquired signals and meaningful sensing information or output. Recent notable examples of this emerging opportunity have been reported for personal gait identification through the use of distributed contact electrodes in floor mats36 (Fig. 3b), as well as for human activity recognition with an ensemble of wearable magnetic induction sensors37 (Fig. 3c).

Despite the integration of data-driven inference algorithms with the sensing hardware of these earlier systems, the presented examples still do not holistically benefit from a learning-based framework to iteratively design and lock in to the desired sensing function. For example, the arrangement and abundance of contact electrodes within the sensing mats (Fig. 3b) or the position of the motion sensors on the human body (Fig. 3c) could be optimized in subsequent engineering iterations using the machine learning-enabled sensor design framework outlined in Fig. 1 (refs. 36,37). Similarly, many of the aforementioned examples also rely on intuition or quasi-random31 parameters to design their sensing hardware, with the tiled metamaterial geometries in particular (Fig. 2b,c) engineered on the basis of their approximate behaviour as optical filters29,30. This is of course a logical approach, given that the corresponding spectral reconstruction algorithms do not necessitate a mutually orthogonal basis for the encoding or sampling step. However, these design precedents underscore the suboptimal nature of the initial sensing hardware, and its associated features and parameter space, setting the stage for an exciting new opportunity in the computational sensing field: machine learning-enabled intelligent sensor design, which is discussed next9,28.

Machine learning-enabled intelligent sensor design

With machine learning-enabled intelligent sensor design, a desired performance target is first defined. A computational algorithm is then utilized to iteratively converge to a given set or subset of features/parameters, along with a corresponding statistical inference model, that most accurately yields this target (Fig. 1). Such a computational design methodology can drastically improve the overall performance of a given sensor system through the implementation of locally optimal, but perhaps non-intuitive, design choices. We will next discuss some of the emerging examples of machine learning-enabled sensor designs in the point-of-care diagnostics and synthetic biology fields.
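
To make this workflow concrete, the toy Python sketch below simulates the Fig. 1 loop on synthetic data: candidate transduction elements are modelled as noisy linear responses to a measurand, and each design iteration trains an inference model, evaluates a cost function and discards the least informative element. All models, dimensions and noise levels are hypothetical stand-ins for the physical fabrication and measurement steps:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_samples, n_features = 200, 16
measurand = rng.uniform(0, 1, n_samples)          # ground-truth y_i
sensitivity = rng.uniform(0, 1, n_features)       # per-element gain
noise_level = rng.uniform(0.01, 0.5, n_features)  # per-element noise
X = (np.outer(measurand, sensitivity)
     + rng.normal(size=(n_samples, n_features)) * noise_level)

features = list(range(n_features))
for iteration in range(8):                        # design iterations i = 0..N
    model = Ridge(alpha=1.0)
    score = cross_val_score(model, X[:, features], measurand, cv=5,
                            scoring="neg_mean_squared_error").mean()
    model.fit(X[:, features], measurand)
    worst = features[np.argmin(np.abs(model.coef_))]  # least useful element
    print(f"iteration {iteration}: {len(features)} features, cost J = {-score:.4f}")
    features.remove(worst)                            # 'redesign' the hardware
```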

Emerging examples of machine learning-enabled sensor designs

Standard inverse design methodologies typically employ a priori known analytical models to map the desired response to the final hardware design38,39,40; however, the use of data-driven and learning-based frameworks in this context has important implications for designing computational sensing hardware and subcomponents across a variety of applications. For example, such an approach has been demonstrated in the field of biomedical sensing to design a computational point-of-care sensor for rapid Lyme disease testing (Fig. 4)41. A multiplexed paper-based sensor and a mobile-phone-based reader were used to measure eight immunoglobulin-M (IgM) and eight immunoglobulin-G (IgG) antibodies associated with Lyme disease. A computational sensing workflow, using a neural network-based inference model, was leveraged to determine a ‘smart-panel’ of antibody measurements that improved the diagnostic sensitivity and specificity while also lowering the cost-per-test in a subsequent blind testing phase with human serum samples. Following the strategy outlined in Fig. 1, such a multiplexed computational point-of-care sensor could undergo additional design iterations, in which immunoreaction spots not selected in one round of optimization could be replaced with other redundancies of the selected chemistries or positive and negative control reaction spots, further strengthening the diagnostic predictive power of the computational biosensor.

Fig. 4: Overview of machine learning-based design of a point-of-care diagnostic sensor for Lyme disease.

This workflow, mirroring Fig. 1, begins with acquiring multiplexed sensing data with a paper-based biosensor and a mobile-phone reader (1). The multiplexed sensing membrane supports various Lyme antigens, which capture different Lyme-associated antibodies in patient serum and report their relative concentrations after the completion of the colorimetric assay using gold nanoparticles, with two separate assay cartridges used for capturing the IgM and IgG class of antibodies. These sensing data, matched with a gold standard clinical diagnosis, are then used to train a neural network-based diagnostic algorithm, which infers a Lyme positive or Lyme negative diagnosis from the multiplexed IgM and IgG antibody measurements (2), after which the optimal antigen panel, a subset of the full antigen measurement features, is determined through a feature analysis method called sequential forward selection with the area under the receiver operating characteristic (ROC) curve as the objective function to be optimized (3). The selected panel (that is, the reduced feature set) is then used to measure blinded patient samples and infer a diagnostic result using the trained neural network model, yielding a final validated computational biosensor design, with the ROC curve and confusion matrix of the blind testing result shown herein (4). Figure adapted with permission from ref. 41, American Chemical Society.
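
The sequential forward selection procedure described above can be sketched in a few lines of Python; here, synthetic data and a small neural network classifier stand in for the antibody measurements and the diagnostic model of ref. 41, with the cross-validated area under the ROC curve as the objective:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 16 'antigen' measurement features per sample.
X, y = make_classification(n_samples=300, n_features=16, n_informative=6,
                           random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_auc = 0.0
while remaining:
    # Try adding each remaining feature; keep the one that helps AUC most.
    scores = []
    for f in remaining:
        clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
        auc = cross_val_score(clf, X[:, selected + [f]], y,
                              cv=5, scoring="roc_auc").mean()
        scores.append(auc)
    best_idx = int(np.argmax(scores))
    if scores[best_idx] <= best_auc:   # stop when AUC no longer improves
        break
    best_auc = scores[best_idx]
    selected.append(remaining.pop(best_idx))
    print(f"panel = {selected}, cross-validated AUC = {best_auc:.3f}")
```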

Another recent example from the field of synthetic biology similarly illustrates the success of a machine learning-based sensor design workflow42,43. Here, a deep learning-based design framework was demonstrated to engineer RNA molecules (referred to as toehold switches) as programmable response elements to target proteins and small molecules, potentially enabling numerous next-generation biosensing technologies42. Designing the proper RNA sequences to synthesize and execute a specific molecular sensing task has been a major open-ended challenge. To address this design hurdle, a ‘sequence-to-function’ framework based on deep learning was used to predict the real-world function and response of RNA toehold switches, using their sequence as the input to a neural network (Fig. 5a)42. This approach was found to outperform the prediction accuracy resulting from standard thermodynamic and kinetic modelling, and even to indicate the underlying sequence motifs most relevant to executing a desired function. This work was then taken further, partially mirroring the workflow in Fig. 1, to re-engineer poorly performing toehold switches using the knowledge gained from iterative analysis of the RNA features with the trained neural network model (Fig. 5b)43. This application-specific workflow, termed the sequence-based toehold optimization and redesign model (STORM), therefore presents a powerful computational tool for the genomic sensing community to use to design and optimize toehold switches, and has in fact already been employed to engineer highly relevant SARS-CoV-2 viral RNA sensors43.

Fig. 5

a, Overview of a deep learning-based framework to design programmable RNA toehold switches. RNA tool selection is first performed using a library of sequences for synthesizing RNA toehold switches, with the toehold switch architecture shown (bottom left) containing a 12-nucleotide toehold (a/a′) and an 18-nucleotide stem (b/b′) unwound by trigger RNA. After synthesis, the toehold switches are characterized via a pooled sequential assay and analysed using various learning models (that is, multilayer perceptron (MLP), long short-term memory (LSTM), convolutional neural network (CNN)) (right) to predict their functionality in terms of ON/OFF signals from expression of a targeted gene. The deep learning model is in turn evaluated to reveal biological insights about key RNA components that yield the desired function. RBS, ribosome-binding site. b, STORM, an optimization pipeline devised to re-engineer poorly performing toehold switches using deep learning. Here, a traditional training procedure can be used to optimize the model weights starting from an initialization (that is, at time t = 0) in order to best predict the ON and OFF signals resulting from a given fixed sequence (top). However, this procedure can also be inverted by fixing the model weights and target ON and OFF signals in order to determine a locally optimum sequence (that is, by exploring the sequence space), exemplifying the principles of intelligent sensor design. PWM, position weight matrix. Figure adapted with permission from: a, ref. 42, Springer Nature Ltd; b, ref. 43, Springer Nature Ltd.
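
The model-inversion step at the heart of this redesign strategy can be illustrated with a short, schematic PyTorch sketch: the weights of a (here randomly initialized, hypothetical) sequence-to-function predictor are frozen, and gradient ascent is instead performed on a soft, position-weight-matrix-like representation of the input sequence to increase the predicted ON signal. This is a conceptual sketch of the inversion principle, not the STORM implementation itself:

```python
import torch

torch.manual_seed(0)
seq_len, n_bases = 59, 4                 # hypothetical switch length, A/C/G/U
predictor = torch.nn.Sequential(         # stand-in 'sequence-to-function' model
    torch.nn.Conv1d(n_bases, 16, kernel_size=5, padding=2),
    torch.nn.ReLU(),
    torch.nn.Flatten(),
    torch.nn.Linear(16 * seq_len, 1),
)
for p in predictor.parameters():
    p.requires_grad_(False)              # fix the trained model weights

logits = torch.randn(1, n_bases, seq_len, requires_grad=True)  # sequence to optimize
opt = torch.optim.Adam([logits], lr=0.05)
for step in range(200):
    pwm = torch.softmax(logits, dim=1)   # soft one-hot over bases (PWM-like)
    loss = -predictor(pwm).mean()        # maximize the predicted ON signal
    opt.zero_grad()
    loss.backward()
    opt.step()

best_sequence = torch.softmax(logits, dim=1).argmax(dim=1)  # discretized sequence
```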

These results demonstrate how data-driven and machine learning-based design approaches can overcome some of the shortcomings of analytical methods/modelling for the purpose of engineering sensors that operate in complex biochemical environments that include a wide array of confounding variables. The core teachings of these examples could also be extended to various other forms of sensor design problems and provide excellent evidence that data-driven sensor designs can in general outperform intuitive designs that are solely based on analytical/theoretical modelling, especially if well-characterized training data are available to engineer and select sensing features that can statistically separate out various inherent noise terms or artefacts from the target signals of interest. It is also important to note here that transfer learning, a common training technique for deep neural networks, can be leveraged to reduce the data burden upon subsequent iterations of the machine learning-based sensor design workflow. Following this strategy, the next iterations of the intelligent sensor design would have the sensing algorithm already initialized with the weights and hyperparameters determined in the previous iteration, streamlining the entire design process. Transfer learning has already had a profound impact on image classification via convolutional neural networks, for example, enabling new inference models with high accuracy to be trained from much smaller sets of image data44. This precedent could therefore similarly accelerate progress within specific sensing applications45,46, as research teams could share their effective sensing models and feature analyses to serve as the initial iterations of the computational sensor design workflow, partially alleviating the need to generate the large datasets that would be necessary to train a model from scratch43.
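
A minimal sketch of this strategy, with all architectures and data as illustrative placeholders, might initialize the next design iteration’s inference model from the previous iteration’s weights, freeze the shared feature layers and fine-tune only the output head on a small dataset acquired with the redesigned hardware:

```python
import torch

torch.manual_seed(0)
# Previous-iteration model (trained weights are simulated by random init here).
prev_model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                                 torch.nn.Linear(32, 2))

# The new iteration's model starts from the previous weights (transfer learning).
new_model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                                torch.nn.Linear(32, 2))
new_model.load_state_dict(prev_model.state_dict())

for p in new_model[0].parameters():      # freeze the shared feature layers
    p.requires_grad_(False)

# Fine-tune only the head on a small synthetic dataset from the new hardware.
X_small = torch.randn(64, 16)            # stand-in measurements
y_small = torch.randint(0, 2, (64,))     # stand-in labels
opt = torch.optim.Adam((p for p in new_model.parameters() if p.requires_grad),
                       lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(new_model(X_small), y_small)
    loss.backward()
    opt.step()
```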

Feature selection as a computational sensing design tool

While quite powerful in general, one must be careful when using a data-driven design approach: sometimes, the high-dimensional space of training data may drown out the meaningful correlations to the target sensing information. This phenomenon, termed the curse of dimensionality, presents design engineers with an ultimatum to either acquire much larger training datasets or reduce the dimensionality of their computational sensing systems. Machine learning-inspired sensor design, therefore, attempts to systematically prioritize or select potential measurement features in terms of their statistical value and contribution for accurately predicting the desired sensor output. This process, known as feature selection, can thus be thought of as a way to objectively determine an ‘elite democracy’ of measurement features47. In principle, this process is analogous to the formal ‘design of experiments’ methodology, which is often employed in the material science and chemical engineering fields48,49. In this approach, feature optimization is managed in a data-limited setting through statistical analysis encompassing simple linear or polynomial bases for the sensing features, for example, as opposed to more complex function approximators such as neural networks.

Computational sensing systems can in turn use these selected features as optimal building blocks in subsequent iterations of the sensing hardware, fully realizing the combination of computation, statistical learning and sensor hardware and readout design, as outlined in Fig. 1 and exemplified in Figs. 4 and 5. Optimization and engineering of this feature selection process can benefit sensing systems in a myriad of ways: by mitigating various noise sources, reducing the complexity, cost, footprint and weight of the sensing instrument and generally reducing the data acquisition burden that is increasingly becoming an issue with the proliferation of high-throughput sensor systems driven by the IoT and the related big data paradigm. Furthermore, large-scale manufacturability of sensing technologies can also benefit from feature selection and the computational sensing workflow. Sensors that are fabricated through high-volume manufacturing often deviate in unexpected ways from their intended response, especially for chip-scale technologies that may rely on multi-step fabrication protocols. Learning frameworks can therefore be used in an industrial context to computationally account for and even exploit such fabrication realities, including statistical variances within batches of production. For example, iterative computational sensor design could begin by creating different production batches with various pseudo-random fabrication conditions that impact sensor performance. After the performance is evaluated across these batches with a user-defined cost function, a new generation of sensors can be produced following the fabrication protocols with the specific conditions inherited from the best-performing sensors. This type of iterative sensor design, in certain ways, is analogous to the evolutionary algorithms commonly used for in silico design of nanoantennas and other on-chip photonic devices that have well-defined forward models40,50. However, our proposed iterative sensor design approach involves the physical production/fabrication of sensors and their activation in real-world settings, covering a wide range of random or unaccounted factors, all of which can be compared and inherently screened through a learning algorithm. Although this approach would require high-throughput methods for screening sensors made in each generation (that is, design iteration)51, it could ultimately help with mass customization of sensor response for application-specific settings/conditions and even uncover cost and time savings that scale with high-volume production.
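
The following toy sketch illustrates such a generation-based design loop: ‘fabrication conditions’ are represented by a parameter vector, each sensor’s performance is scored by a simulated (and noisy) cost function, and the best performers seed the next production batch with small perturbations. The cost model and all parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

def batch_cost(conditions):
    """Hypothetical cost: distance from unknown-optimal fabrication settings,
    plus random batch-to-batch variation standing in for real-world noise."""
    optimum = np.array([0.3, 0.7, 0.5])
    return np.linalg.norm(conditions - optimum) + rng.normal(0, 0.05)

population = rng.uniform(0, 1, size=(20, 3))       # 20 sensors, 3 conditions each
for generation in range(10):
    costs = np.array([batch_cost(c) for c in population])
    elite = population[np.argsort(costs)[:5]]      # best-performing sensors
    # The next batch inherits elite conditions with small fabrication perturbations.
    children = elite[rng.integers(0, 5, size=15)] + rng.normal(0, 0.05, (15, 3))
    population = np.vstack([elite, np.clip(children, 0, 1)])
    print(f"generation {generation}: best cost = {costs.min():.3f}")
```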

It is also important to emphasize that feature selection does not, by itself, guarantee improved computational sensor performance. In fact, utilizing a subset of the physically measurable features can sometimes, especially in low-dimensional feature spaces, lead to poorer sensor performance owing to the information discarded during the exclusive selection process. However, performance trade-offs such as this are inherent in most engineering applications and should be considered on a case-by-case basis, ultimately converging on user-defined design choices that embody the most appropriate sensor technology, given a set of performance, budget and cost-per-test constraints for the target sensing application. For example, some sensing systems, especially those used for environmental monitoring, are much more powerful and useful (providing much richer information) if they can operate in a widely distributed format. Therefore, the added benefit of the expanded spatiotemporal data collection capability in such a distributed sensing network might practically outweigh the decrease in performance for an individual sensing node. Furthermore, constraining system hardware to existing commercial electronics (such as photodiodes, CMOS image sensors and low-cost LEDs) allows sensors to take advantage of economies of scale and greatly reduce costs, improve accessibility and broadly benefit the overall sensing goal52,53. ‘Cost-aware’ feature selection, as exemplified in Fig. 4, therefore exists as a systematic way of making such design choices given an appropriately defined threshold of needed sensing accuracy, sampling rate, power requirement and so on54,55.

Such an approach is particularly well suited for point-of-care sensing platforms, for example, which are increasingly incorporating and benefiting from computational methods as a part of their function, especially those with multiplexing capabilities (Fig. 4)41,56,57. To have widespread impact, these platforms must remain affordable while achieving sufficient sensitivity and specificity. In fact, the rapid diagnostic technologies urgently needed to combat the COVID-19 pandemic form a highly relevant embodiment of these design challenges. For example, it has been shown that measuring multiple antibodies to SARS-CoV-2, specifically IgM, IgG and IgA, increases diagnostic sensitivity when compared with antibody tests that only make a single IgM or IgG measurement58,59,60. Moreover, the human immune response produces a variety of antibodies within the IgM, IgG and IgA classes that each bind to different pathogen-associated proteins, with the spike protein (S protein) and nucleocapsid protein (N protein) being the most dominant in the case of SARS-CoV-2 and other coronaviruses61,62. The human immune response is also highly dynamic, evolving over time, and is further complicated by variations across populations and pathogen strains. By acquiring rich multiplexed data on these unique immunoreactions over a diverse set of serum specimens, each matched to ground-truth diagnoses from direct detection methods (using reverse-transcription PCR, for example)63,64, learning-based point-of-care sensor platforms can provide an unbiased computational method of mitigating false negative and false positive results through a data-driven nonlinear discriminator. Feature selection and inference model optimization methods, similar to the workflow described in Fig. 4, can therefore be used in the design of these urgently needed COVID-19 rapid diagnostic tests. Iteratively converging to an optimal panel of capture antigens using this framework could maximize the diagnostic performance in terms of the area under the ROC curve (for example) while conforming to other test requirements such as the cost-per-test, throughput and total number of allowable parallel immunoreactions in a given testing format41,57. This methodology could even evaluate the relative effectiveness of candidate capture antigens or antibodies within a given smart-panel that differ in terms of their synthesis, and as a result may contain unseen structural differences that lead to stronger/weaker affinities and non-specific binding or issues of stability in the testing substrate and assay buffers.

In addition to point-of-care diagnostics, genomic sensing is another field that is very well suited to benefit from feature selection and machine learning-enabled sensor design. The advent of high-throughput DNA sequencers has provided a flood of genomic data necessary for understanding new pathways and possible correlations with disease, among other applications1,3,65,66. Given the rich history of machine learning approaches in genomics1,47,67,68, there is now an emerging opportunity to impact the various genomic sensing systems that have proliferated over the past decades. For example, base-calling algorithms that utilize neural networks have been implemented to reduce the error rate when inferring base sequences from the often noisy signals generated by nanopore sequencing hardware69,70. Therefore, machine learning-enabled co-design of the nanopore sequencer hardware as well as the assay protocol could potentially be pursued through an iterative learning process with respect to a cost function defined by a combination of the error rate, base-pair bias and/or sequencing cost per base-pair, which can lead to the joint optimization of the sequencing hardware and assay, together with the base-calling algorithm. Similar learning approaches can also be employed for inferring sequences from fluorescence image stacks generated by sequencing-by-synthesis methods, again setting the stage for iterative data-driven co-design of the inference algorithm with the imaging/sensing hardware71. At a higher level, feature selection could also play a role as a design tool to identify, for example, short gene sequences, motifs or mutations at the root of detection, or could lead to dedicated genomic sensing systems with more streamlined sensor hardware, unconcerned with the throughput of sequencing and the gigabytes of data needed to catalogue a metagenomic sequence. The reagent burden could also be reduced through implementing computationally designed DNA amplification primers or aptamers for diagnostic antigen binding assays or other DNA detection and quantification assays, similar to the previously discussed RNA toehold switches (Fig. 5)42,43,72.

The core principles of feature selection and machine learning-enabled sensor design could also have a major impact on the field of spectral sensing, where the relevant/target information may be distributed unevenly across a measured spectrum. Learning-based feature selection algorithms such as the least absolute shrinkage and selection operator (LASSO)73, along with genetic algorithms and wrapper methods based on standard statistical tests, support vector machines and neural networks, have been utilized in previous works to select an optimal subset of spectral bands for efficient and application-specific sensing74,75,76,77,78. These computational methodologies can naturally be extended to fluorescence-based systems comprising a complex mixture of spectrally overlapping, multiplexed exogenous fluorophores for sub-dermal, non-invasive and wearable biosensing applications, among others79. However, in these examples, feature selection empowers the corresponding inference algorithms without having an impact on the subsequent hardware designs, which could nonetheless benefit from simplifications such as fewer encoding elements. Feature selection methods would therefore serve as a useful tool for machine learning-based engineering of next-generation computational spectral sensor systems that iteratively converge on application-specific sensing tasks. As a concrete example of this opportunity, we believe that hyperspectral image sensors would significantly benefit from the machine learning-enabled intelligent sensor design framework depicted in Fig. 1, with the spectral encoding elements selected on the basis of their importance in a learning-based spectral reconstruction model (Fig. 6). These ‘elite’ encoding elements that result from iterative feature selection can then be combined into a metapixel that is subsequently patterned across the hyperspectral image sensor plane, similar to the common Bayer filters used in CMOS image sensors, for example. Such an approach, outlined in Fig. 6, could lead to highly specialized designs defined by a cost function that represents a target application of interest, such as environmental sensing, agriculture, biomedical sensing and so on. The relationship between the number of encoding elements (and thus the size of the metapixel) and the spectral resolution can also be revealed in the feature analysis, allowing computational engineering of application-appropriate trade-offs among spectral and spatial resolution, specificity and sensitivity.
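
As a minimal illustration of such spectral feature selection, the Python sketch below applies LASSO regression to synthetic spectra in which only a few (hypothetical) wavelength bands carry the target information; the sparse coefficients then indicate which encoding elements would be worth retaining in hardware:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n_samples, n_bands = 400, 120
spectra = rng.normal(size=(n_samples, n_bands))
informative = [18, 42, 95]                 # hypothetical informative bands
target = (spectra[:, informative] @ np.array([1.5, -2.0, 0.8])
          + rng.normal(0, 0.1, n_samples))

# The L1 penalty drives most band coefficients to exactly zero.
model = Lasso(alpha=0.05).fit(spectra, target)
selected_bands = np.flatnonzero(np.abs(model.coef_) > 1e-6)
print("selected spectral bands:", selected_bands)  # ideally recovers 18, 42, 95
```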

Fig. 6: Machine learning-enabled intelligent design methodology for a hyperspectral image sensor.

The design workflow, mirroring Fig. 1, begins with an array of distinct spectral encoding elements that are designed through approximate electromagnetic transmission models. (1) Encoded spectral information resulting from known incident spectra, Si(λ), is acquired and (2) used to train a reconstruction model (for example, a neural network) that outputs a predicted spectrum, S′. (3) Feature analysis is then performed to select the subset of spectral encoders that serve as the optimal basis for the encoding operation and reconstruction of the spectrum given a user-defined, application-specific cost function. (4) The optimal subset of encoders is used to form metapixels for a hyperspectral image sensor realized via the machine learning-inspired design.

Intelligent design through reconfigurable sensor systems or networks

It is worth noting the strong interest in computational sensing systems that can reconfigure their sensing architecture and/or path/position on demand, for example, to best suit a specific application. The basic principles of machine learning-enabled intelligent sensor design discussed in this Perspective can also be applied to dynamically reconfigure a computational sensing system13. For example, distributed sensing networks for environmental monitoring could autonomously decide where to sample and how to sense (that is, which modality to use). This adaptability could be governed by a supervised learning framework with a concrete sensing goal (for example, a cost function to accurately map hydrocarbons and other pollutants after anthropogenic disturbances) or an unsupervised framework for general discovery or surveillance. As an example, the vibration localization metamaterial sensor shown in Fig. 3a could be reconfigured on the basis of deviations from the ground-truth frequency and the origin of a sensed vibration. Similarly, wearable sensors could greatly benefit from reconfigurable computational sensing designs as a means to optimize signal acquisition for different body types, health states, motion artefacts/activity states and misalignments46,80. For instance, wearable devices (such as the activity monitor shown in Fig. 3c) or other sensor arrays for blood pressure monitoring could quickly be computationally reconfigured by optimizing the relative weights of different signals within the sensor array to converge to a reliable and accurate readout for photoplethysmogram or electrocardiogram signals. Furthermore, such computational sensing systems, if connected in a widely distributed and cost-effective manner as part of an IoT network, will have the major advantage of collectively learning ‘on-line’ from evidence-based sensing outcomes, thereby solving and converging to sensing solutions otherwise intractable with a single sensing unit.

Future outlook and conclusions

Computationally designed sensing systems will provide various exciting opportunities, as highlighted earlier. However, as with any emerging technology, there exist inherent challenges that must be understood and addressed. Specifically, sensing systems designed by statistical learning approaches inevitably share the well-known pitfalls of machine learning. For example, access to large amounts of rigorously vetted, well-characterized and diverse training data can sometimes be infeasible for a given sensing system. In the field of biomedical sensing, for instance, where the cost per test can be high, sensing outcomes can depend on a number of factors such as the shelf-life of reagents, ambient conditions (temperature, humidity and so on) and cross contamination, among others. Therefore, it becomes a central challenge to ensure that the training datasets are not biased or severely contaminated by noise sources characteristic of only the training set. Such scenarios would lead to overfitting, where learned sensing algorithms fail to generalize, sometimes catastrophically, upon the introduction of sensing inputs that deviate only slightly from what has already been explicitly learned81. Overfitting can also occur when training datasets are not appropriately diversified in terms of test specimens, dynamic range, resolution or sensor-to-sensor performance variability, for example.

Discussion of some of these challenges may lead the reader to believe that a properly executed learning-based computational sensing approach is prohibitively time- and resource-intensive for the design phase of a given intelligent sensing system. However, the computational sensor design methods highlighted in this Perspective can always be implemented in subsequent iterations of the design; that is, they need not be implemented as a first step of sensor prototyping. It is therefore important to emphasize that the presented methods can selectively be applied to individual components and/or subsystems, depending on the expected gains and the practicality of acquiring sufficient training data. In fact, as mentioned earlier, these statistical learning methods are ideal for iterative and adaptive design strategies (Figs. 1, 4, 5 and 6) as they converge on locally optimal, cost-effective solutions for application-specific sensing scenarios, without the need for a complete understanding and modelling of various noise contributions, complex interactions and governing physical laws. For example, a computational sensor that makes a catastrophic sensing error, such as missing a significant event or analyte, or simply missing ‘small data’-related outlier events, could be fixed by readjusting and optimizing its measurement features and their relative weights once such errors are identified. At the iterative design phase of a computational sensor, one can redefine or adjust the cost function of the sensor design to appropriately penalize certain classes of errors that might lead to catastrophic outcomes. The same is true for correcting the failures or errors introduced by new uses of a sensor in new settings or a new region of the world for which it was not initially designed, for example. For correcting and preventing such inference failures in machine learning-enabled computational sensors, the use of physical analytical models as regularizers (through additional terms within the application-specific cost function used for training data-driven sensor inference) could be another strategy to connect and constrain computational sensor designs with the governing physical laws of the core sensing principles and signal transduction mechanisms. All in all, we argue that data-driven computational sensor design approaches provide a scalable, cost-effective and dynamic framework that can be adjusted and improved on the go as new datasets are created, and such computational sensors can therefore learn, evolve and become more robust as they are used more and more.
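
Schematically, such a physics-regularized training objective could take the following form, where a hypothetical linear transduction law (response = gain × concentration) stands in for the true analytical model and the inference network maps the raw sensor response back to the measurand:

```python
import torch

torch.manual_seed(0)
# Inference model: raw sensor response -> predicted measurand (concentration).
model = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(),
                            torch.nn.Linear(16, 1))

concentration = torch.rand(128, 1)               # ground-truth measurand
gain = 2.0                                       # assumed transduction law
response = gain * concentration + 0.05 * torch.randn(128, 1)  # noisy sensor data

opt = torch.optim.Adam(model.parameters(), lr=1e-2)
lam = 0.1                                        # regularization weight
for _ in range(500):
    pred = model(response)
    data_loss = torch.mean((pred - concentration) ** 2)       # supervised term
    physics_loss = torch.mean((gain * pred - response) ** 2)  # forward-model term
    loss = data_loss + lam * physics_loss        # J = data term + physics term
    opt.zero_grad()
    loss.backward()
    opt.step()
```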

In conclusion, we envision these emerging computational sensor platforms and the ideas discussed in this Perspective being incorporated into future designs of next-generation sensing hardware. This will result in new sensing systems with various performance advantages, realized through potentially highly non-intuitive designs enabled by machine learning, contrasting with traditional sensor and readout schemes engineered through intuition-driven design choices and/or analytical modelling. This class of computational sensors can therefore enable new and widely distributed applications that are a direct result of the emerging trends in machine learning and the proliferation of big data, impacting various fields (that routinely need/utilize sensors) such as environmental sensing, biomedical diagnostics, global health and security/defence.