1 Introduction

The light microscope gave scientists an extension of the human eye, able to reveal details smaller than the roughly 0.1 mm limit set by the physiology and anatomy of the eye [1]. In 1625 Johannes Faber, a fellow of the Accademia dei Lincei, coined the term microscope in a letter to Federico Cesi, in a sentence that makes clear how much of a turning point the optical microscope was for science [2]. “Microscopium nominare libuit”, referring to Galileo Galilei’s “small eyeglass” (occhialino), conveys the ability of such an instrument to provide a direct view of details previously invisible to the naked eye, using only a properly shaped piece of glass and sunlight, the colours of the rainbow. Today the optical microscope makes it possible to design and perform experiments at the nanoscale and beyond, since the diffraction barrier is crumbling [3]. We can rephrase Faber’s sentence as “Nanoscopium nominare libuit”, as optical microscopy turns into optical nanoscopy. The road to nanoscopy passes through many stations. It is worth mentioning three of them, without diminishing the relevance of all the others [4]. The first “station” is the quantitative and fundamental turning point in optical microscopy that occurred in 1873, when Ernst Abbe (1840–1905) studied the physics of lens construction and its relationship with the performance of the lens in terms of spatial resolution [5]. The most popular relationship describing the spatial resolution of an optical microscope in the observation plane (x–y) is the so-called Abbe’s law, which weighs like a heavy stone on the shoulders of any microscopist, Fig. 1.
As recently argued by Sheppard [6], “the use of the word ‘limit’ implies a definite and sudden transition from being resolved to not being resolved”. Abbe’s original sentence reads “Die physikalische Unterscheidungsgrenze dagegen hängt allein vom Oeffnungswinkel ab und ist dem Sinus seines halben Betrages proportional.”, translated as “The physical limit of resolution, on the other hand, depends wholly on angular aperture, and is proportional to the sine of half the angular aperture.” [6, 7]. The lens is the key element of Abbe’s experiment. In fact, lenses are arguably the most fundamental components of any microscope, which is why the overall performance of an optical microscope is generally identified with that of the objective used to form images. Lenses are manufactured from specifically composed glass, and their behaviour is characterized by the shape, which determines the semi-angular aperture \(\alpha \), by the wavelength \(\lambda \) of the light being used, and by the refractive index n of the non-absorbing propagation medium, Fig. 2. Ernst Abbe coined the term “numerical aperture” for the quantity \(NA=n\sin \alpha \) [8]. To a first approximation, the spatial resolution d, defined as the minimum distance at which two objects can be recognized as separate entities in the x–y image plane, is given by:

$$\begin{aligned} d=d_{x}=d_{y}=\frac{\lambda }{2n\sin \alpha }=\frac{\lambda }{2NA} \end{aligned}$$
(1)

considering a far-field condition [9]. Along the optical axis, z, the resulting distance for separating them is given by

$$\begin{aligned} d_{z}=\frac{2\lambda }{n\sin ^{2}\alpha }=\frac{2n\lambda }{(NA)^{2}} \end{aligned}$$
(2)
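
As a numerical illustration of Eqs. (1) and (2), the following minimal sketch evaluates the lateral and axial resolution for the three propagation media most often encountered in practice. The wavelength of 500 nm and the semi-angular aperture of 67° are assumptions chosen only for illustration, and the function names are ours.

```python
import math


def numerical_aperture(n, alpha_deg):
    """NA = n * sin(alpha), with alpha the semi-angular aperture in degrees."""
    return n * math.sin(math.radians(alpha_deg))


def lateral_resolution_nm(wavelength_nm, n, alpha_deg):
    """Eq. (1): d = lambda / (2 * NA)."""
    return wavelength_nm / (2.0 * numerical_aperture(n, alpha_deg))


def axial_resolution_nm(wavelength_nm, n, alpha_deg):
    """Eq. (2): dz = 2 * lambda / (n * sin^2(alpha))."""
    sin_alpha = math.sin(math.radians(alpha_deg))
    return 2.0 * wavelength_nm / (n * sin_alpha ** 2)


# Assumed illustration values: green light and a 67 degree semi-aperture.
WAVELENGTH_NM = 500.0
ALPHA_DEG = 67.0

for medium, n in [("air", 1.00), ("water", 1.33), ("immersion oil", 1.52)]:
    na = numerical_aperture(n, ALPHA_DEG)
    d = lateral_resolution_nm(WAVELENGTH_NM, n, ALPHA_DEG)
    dz = axial_resolution_nm(WAVELENGTH_NM, n, ALPHA_DEG)
    print(f"{medium:13s} NA = {na:.2f}  d = {d:.0f} nm  dz = {dz:.0f} nm")
```

Under these assumptions the axial value is always several times larger than the lateral one, and both shrink as the refractive index of the medium increases.
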
Fig. 1
Picture of the monument dedicated to Ernst Abbe in Jena, where he worked with Carl Zeiss and Otto Schott at the Zeiss Optical Works company. Photo property of Paolo Bianchini

Fig. 2
Classical scheme of the numerical aperture as a function of the refractive index n of the medium and of \(\sin \alpha \), where \(\alpha \) is the half-angle of the angular aperture A of the lens. Image credit: Leica Microsystems, Science Lab webpage, M. Wilson, 2017

This simplified formulation is sufficient for our purposes and for the image formation properties we are interested in. The most common simplification is to consider light in the visible region, which for the human eye corresponds to wavelengths from about 380 to 740 nm. In terms of frequency, this corresponds to a band of roughly 405–790 THz. In practice, because of diffraction, light cannot be focused to an infinitely small spot. For visible light and transparent objects, like most biological molecules, imaged with a glass lens in air (n = 1), water (n = 1.33) or immersion oil (n = 1.52), the spot size amounts to 200–250 nm. As a consequence of this physical constraint, similar objects closer together than about 200 nm cannot be discriminated, and the “minutiae” of their spatial distribution can be seen in an image only to a limited degree. This means that in the visible region of the electromagnetic spectrum one can appreciate, for example, the adhesion of a biological cell to a surface, without being able to see the distribution of cellular proteins at the molecular scale, Fig. 3. Describing events taking place in biological systems has always been one of the main targets of the optical microscope, despite its limited spatial resolution. Robert Hooke (1635–1703), in his significant book “Micrographia: or Some Physiological Descriptions of Minute Bodies Made by Magnifying Glasses. With Observations and Inquiries Thereupon” [10], discovered and described small structural units that he decided to name “cells” or “cellulae”. The drawing he reported in a famous illustration of the book refers to “little rooms”, typically with a diameter in the 10–20 \(\upmu {\text{ m }}\) range, Fig. 4. Details inside them, like proteins or DNA compaction motifs, could not be seen.
“Micrographia” nevertheless became a scientific best-seller, inspiring wide public interest in the science of microscopy. Moreover, a new term, cell, was coined, becoming the name for the basic structural, functional, and biological unit of living organisms. The optical microscope made it possible to see biological cells, the “building blocks of life”. After this “flashback”, let us move to the second “station” of our roadmap. We are now in the 1950s, when Giuliano Toraldo di Francia (1916–2011) published seminal papers that deserve the credit for introducing the concept of super-resolution of images [11, 12]. In his 1955 paper [12] he defined super-resolution as the ability to image a detail finer than the Abbe resolution limit. Toraldo di Francia brilliantly discussed pupil plane filters as a method to increase the resolution of an imaging system beyond the diffraction limit, in line with his considerations in terms of information theory. He also demonstrated the existence of evanescent waves, which are used today in total internal reflection microscopy and were key to the nanoscale imaging approach in which the exponentially decaying evanescent field of the exciting light probes a very limited region of the specimen, typically the surface [13]. Toraldo di Francia pointed out that “resolving power is not a well-defined physical quantity” [12]. Moreover, he showed that enhanced spatial resolution, that is, resolution in visible light microscopy beyond Abbe’s limit (about 200 nm in the object plane and 600 nm along the optical axis), is possible given prior knowledge about the object being observed. Without prior knowledge about the object, there can be no resolution gain when forming an image.
In his 1952 paper, published in Italian, he showed the effects of super-resolving pupils, demonstrating that resolution can be enhanced in the central part of the field of radiation-matter interaction at the cost of losing resolution in its peripheral parts. The resolution limit, physically driven by diffraction, cannot be beaten. Nevertheless, it can be circumvented by adding information when forming an image [3]. In the mid-1960s, among other approaches, Charles W. McCutchen stated that in principle one could construct a super-resolving optical system [14], able to distinguish spatial details finer than the diffraction limit. He posed a simple question, namely: “can the diffraction limit for a lens of large numerical aperture be beaten?”, and the answer was “yes”, but only in specialized and limited applications like the ones implemented in the late 1990s for fluorescence microscopy. Various solutions have been proposed within the super-resolution concept. They include special illumination patterns [15, 16] and mathematical approaches [17]. In particular, structured illumination microscopy, SIM, allowed three-dimensional, 3D, imaging with a two-fold increase in spatial resolution [18]. Most of the techniques developed require rather complex hardware and software and do not “break” the diffraction barrier towards unlimited resolution, because they are still limited by the inevitable use of lenses and visible wavelengths. The majority of the new architectures and methodologies were only able to push diffraction to its very limits. Today resolution is no longer regarded as a fundamental limit; the fundamental limit is instead set by concepts of information theory [19, 20], although the concepts of super-resolution remain somewhat mixed [6].
As a consequence, image formation is a process that has to be analyzed by considering time together with localization precision, and the shaping of the illumination together with new detection modalities. We are now ready for the third “station”. In 2014 the sentinel of the diffraction barrier was tricked, and spatial resolution was shown to be theoretically unlimited despite the use of lenses and visible light [21]. The resolution revolution exploded by shifting attention from optics-based solutions to the probes and, more generally, to the sample. In fact, the 2014 Nobel laureates in Chemistry, Eric Betzig, Stefan W. Hell and William E. Moerner, awarded “for the development of super-resolution fluorescence microscopy”, pioneered the exploitation of transitions between states of fluorescent molecules with different emission properties, such as a dark and a bright fluorescent state. This controls the fluorescence emission in such a way that adjacent fluorescent molecules, located at distances shorter than the Abbe limit, are not allowed to emit simultaneously [22]. Over time, new approaches have been explored along a roadmap that made spatial resolution “unlimited”, passing through developments in optical lenses, information theory and probes for the specimen to be visualized. Fluorescence and spatial resolution were and are the distinctive keywords for the realization of, and the ongoing progress in, optical microscopy. Correlative approaches are the perfect complement, a “killer” application, for this new season of optical microscopy, which we can rename optical nanoscopy [21, 23]. The distinctive elements of the next “station”, the fourth, will be progress in the development of photodetectors and of artificial intelligence algorithms [4, 24].

Fig. 3
This cartoon relates the value of Abbe’s diffraction limit to what we are able to discern in the real world. With the naked eye, our ability is limited to about 500 times Abbe’s limit: we can see a hair, but viruses are invisible to us. Source: Nobelprize.org, Nobel Prize in Chemistry 2014, webpage

Fig. 4
“Casting the light on the cork texture with a deep plano-convex glass...I could perceive pores, or cells consisting of many little boxes...” is the freely adapted description written by Hooke [10] of the optical microscopy observation responsible for the use of the term “cell” ever since. Photo property of Alberto Diaspro, taken for the Folio edition of [1] using a Leica M10 Monochrom camera

Fig. 5
Fluorescence labelling makes it possible to specifically identify subcellular components and molecules. Here DNA is shown in blue, filaments of the cytoskeleton in green and mitochondrial energy storage molecules in red. The image size is \(28\,\times \, 22\,\upmu {\text{ m }}\)

2 Fluorescence

Fluorescence is the light emitted by molecules via spontaneous decay from an excited electronic state generated by light absorption [25]. Fluorescent molecules, called fluorochromes, are used for the specific labelling of biological macromolecules, which are typically transparent to visible light. The possibility of using fluorescence to visualize compartments of biological systems made optical microscopy a key element of research in biophysics and in cellular and molecular biology [26], Fig. 5. Fluorescent molecules allow both spatial and functional information to be obtained through specific absorption, emission, lifetime, anisotropy, photodecay, diffusion, and other contrast mechanisms [25]. We usually refer to one-photon excitation, 1PE, when a photon of energy \(h\nu \), where h is Planck’s constant and \(\nu \) the frequency of light, typically supplied by an external source, is absorbed. This brings a fluorochrome from its singlet electronic ground state to an unstable excited singlet state. Fast nonradiative processes reduce the energy of the molecule to the lowest vibrational level of the excited state, from which the fluorescent molecule can relax to the ground state emitting a photon of energy \(h\nu '\). This light, having \(\nu '<\nu \), is called fluorescence [27]. The molecule also has a non-zero probability of populating the triplet state by intersystem crossing from the singlet excited state. The emission that accompanies relaxation from the triplet to the singlet ground state is called phosphorescence and has a decay time orders of magnitude longer than fluorescence (from hundreds of \(\upmu \text {s}\) to ms). The molecular energy states and the transitions between levels of the fluorescent molecules can be represented by a so-called Jablonski–Perrin diagram, a kind of energy level plot of the photophysical processes in fluorophores, Fig. 6.
The energy spacing between electronic states, and between vibrational states, exceeds the molecular thermal energy at room temperature; as a result, it is the lowest vibrational level of the ground state \(S_{0}\) that is appreciably populated. For a large number of organic molecules, \(S_{0}\) is an electronic singlet with all electrons spin-paired, and excitation by light absorption occurs with no change in spin pairing: \(S_{0}+h\nu \rightarrow S_{1}\). Molecules can be excited to any of the many vibrational and rotational levels of the electronic state \(S_{1}\), which gives rise to a whole range of absorbed photon energies, or wavelengths, called the excitation spectrum. The excitation process takes place on the fs timescale. Following light absorption, the molecule undergoes a very fast nonradiative relaxation in which vibrational energy is converted to thermal motion; the excitation energy dissipated in this way shifts the emitted photons to lower frequencies. This internal conversion, I.C., or vibrational relaxation takes place on the ps timescale. Fluorescence emission therefore occurs from the lowest vibrational level of the first excited singlet state \(S_{1}\). This is known as Kasha’s rule, in honour of the American spectroscopist Michael Kasha (1920–2013): a principle in the photochemistry of electronically excited molecules stating that photon emission, fluorescence or phosphorescence, occurs in appreciable yield only from the lowest excited state of a given multiplicity [28]. Photon emission, in turn, allows the fluorophore to relax to any of the several vibrational levels of the electronic ground state. This last step, \(S_{1}\rightarrow S_{0}+h\nu '+I.C.\), takes place on the ns timescale and is the one that influences the detection process in fluorescence optical microscopy. The range of energies that makes up the fluorescence emission spectrum is due to the differences among the vibrational energies of the ground state.
It is worth noting that fluorescence emission from the lowest vibrational level of \(S_{1}\) results in the fact that the emission spectrum is independent of the excitation wavelength. Figure 7 shows the situation for a specific fluorescent molecule. The red shift of the emission spectrum, due to the loss of energy by internal conversion, results in a difference between the maxima of the excitation and emission spectra that is known as Stokes’ shift [29]. The fluorescence intensity decays exponentially according to:

$$\begin{aligned} I(t)=I_{0}e^{-t/\tau } \end{aligned}$$
(3)

where I(t) is the fluorescence intensity at time t, \(I_{0}\) is the fluorescence intensity at time zero, and \(\tau \) is the time when the fluorescence has dropped to \(I_{0}/e\), termed fluorescence lifetime [30]. Considering a fluorescent rate constant \(k_{f}\) , the fluorescence lifetime is given by:

$$\begin{aligned} \tau =\frac{1}{k_{f}} \end{aligned}$$
(4)

The fluorescence lifetime of a fluorophore is related to the probability of the molecule existing in \(S_{1}\), that is the average time spent in the excited state. However, there are relaxation processes from \(S_{1}\) that do not produce fluorescence emission. We have to consider a global rate constant at room temperature, \(k_{f}+ k_{nf}\) , where \(k_{nf}\) is related to all non-fluorescent relaxation pathways. We can rewrite Eq. (4) in the following way:

$$\begin{aligned} \tau =\frac{1}{(k_{f}+k_{nf})} \end{aligned}$$
(5)
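
A minimal sketch of Eqs. (3)–(5) follows, showing how additional non-fluorescent pathways shorten the lifetime and how the intensity drops to \(I_{0}/e\) after one lifetime. The rate constants are hypothetical values in ns\(^{-1}\), chosen only for illustration, and the function names are ours.

```python
import math


def lifetime(k_f, k_nf=0.0):
    """Eq. (5): tau = 1 / (k_f + k_nf); Eq. (4) is the case k_nf = 0."""
    return 1.0 / (k_f + k_nf)


def intensity(t, i0, tau):
    """Eq. (3): I(t) = I0 * exp(-t / tau)."""
    return i0 * math.exp(-t / tau)


# Hypothetical rate constants in ns^-1 (not taken from the text).
K_F = 0.25
K_NF = 0.15

tau_radiative = lifetime(K_F)        # lifetime with fluorescence only
tau_observed = lifetime(K_F, K_NF)   # lifetime with all decay pathways

print(f"tau (k_f only)   = {tau_radiative:.2f} ns")
print(f"tau (k_f + k_nf) = {tau_observed:.2f} ns")
for t in (0.0, tau_observed, 2 * tau_observed):
    print(f"I({t:.2f} ns) / I0 = {intensity(t, 1.0, tau_observed):.3f}")
```
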
Fig. 6
Perrin–Jablonski diagram: a schematic view of the molecular processes that follow energy absorption, with their timescales and pathways

Fig. 7
Specific absorption and emission spectra of different fluorescent molecules [25]

Fluorescence lifetime is sensitive to the environment and to perturbations in the proximity of the fluorophore [31]. Proteins, other fluorophores, molecules and ions within a few nanometers, as well as the local pH, can increase the contribution of non-radiative pathways, for instance through energy transfer. The result is a shorter lifetime, which depends on the concentration and composition of the medium surrounding the fluorophore. Figure 8 shows an image that maps the lifetime, point by point, in a fluorescent sample. In this case, although the endogenous fluorescence has a spectral emission very similar to that of enhanced green fluorescent protein (EGFP), its lifetime is different. EGFP is a very stable fluorescent protein [32]: protected by its beta-barrel structure, it is less sensitive to metabolic processes and keeps a fairly constant lifetime throughout the blood vessel network. The efficiency of fluorescence is defined through its quantum yield, QY, the ratio between the number of photons emitted as fluorescence and the number of photons absorbed by the molecule:

$$\begin{aligned} QY=\frac{N_{em}}{N_{abs}}=\frac{k_{f}}{\left( k_{f}+k_{nf}\right) } \end{aligned}$$
(6)
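
Equations (5) and (6) can be combined to recover the two rate constants from quantities that are accessible experimentally, since \(k_{f}=QY/\tau \) and \(k_{nf}=(1-QY)/\tau \). The sketch below illustrates this inversion; the quantum yield and lifetime used are hypothetical values, and the function names are ours.

```python
def quantum_yield(k_f, k_nf):
    """Eq. (6): QY = k_f / (k_f + k_nf)."""
    return k_f / (k_f + k_nf)


def rates_from_measurements(qy, tau):
    """Invert Eqs. (5) and (6): k_f = QY / tau, k_nf = (1 - QY) / tau."""
    return qy / tau, (1.0 - qy) / tau


# Hypothetical measured values: quantum yield and lifetime in ns.
QY_MEASURED = 0.6
TAU_NS = 2.5

k_f, k_nf = rates_from_measurements(QY_MEASURED, TAU_NS)
print(f"k_f  = {k_f:.3f} ns^-1")
print(f"k_nf = {k_nf:.3f} ns^-1")
print(f"QY recomputed from the rates = {quantum_yield(k_f, k_nf):.2f}")
```
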

Fluorophores normally used in fluorescence microscopy have QY values in the range of \(\sim 0.05 \, \mathrm{to} \, \sim 0.9\) [33]. A fluorophore is considered bright for \(QY>0.1\). EGFP, for instance, has \(QY\sim 0.6\), while fluorescein, one of the most widely used dyes, has \(QY\sim 0.93\) [34]. The fluorescence intensity \(I_{f}\) generated by molecules at a concentration C can be simply related to the fluorescence cross-section \(\sigma \), the quantum yield QY and the illumination intensity \(I_{0}\) as:

$$\begin{aligned} I_{f}\propto \sigma \cdot C\cdot QY\cdot I_{0} \end{aligned}$$
(7)

suggesting that the collectable fluorescence intensity scales linearly with the illumination intensity. While this is true at moderate intensities, it no longer holds at high intensities, when the internal dynamics of the fluorescent molecules have to be taken into account, as in the special non-linear case of two-photon excitation. The factor \(\sigma \), the cross-section of the fluorescent molecule, reports on the effective area of the molecule available for the absorption process. For one-photon excitation, 1PE, the absorption cross-section of a molecule can be estimated from its dipole transition length as \(\sigma \sim 10^{-16}-10^{-17}\,\mathrm{cm}^{2}\) for a transition length of approximately 1 Å, typical for aromatic rings [35]. In the next sections, we consider three aspects that, in different ways, played a role in the development of the advanced and super-resolved fluorescence methods that transformed microscopy into nanoscopy, namely: (1) photobleaching, a phenomenon linked to the fluorescence process; (2) two-photon excitation, 2PE, which has immediate consequences for the design of modern fluorescence optical microscopes; and (3) fluorescence resonance energy transfer, FRET, the most important attempt to obtain molecular information at the nm scale in optical microscopy before the advent of optical nanoscopy.

Fig. 8
Zebrafish embryo. Green: Blood vessels, EGFP / Magenta: Nuclei, Hoechst. Blood vessels labelling identified using TauContrast; the lifetime-derived information distinguishes the blood vessels signal from endogenous signal contributions. Courtesy Laia Ortiz Lopez and Julien Vermont, IGBMC, Strasbourg. Lifetime information on the right increases the image content of the “classical” image by using TauContrast on Leica STELLARIS 8

2.1 Photobleaching

Since fluorescence is a fast process, forming an image requires the excitation process to be repeated within an appropriate observation window of the related emission. Image formation based on fluorescence is therefore a cyclic process: after photon emission, the fluorophore, relaxed back to the ground state, can be excited again by light absorption. By their nature, fluorophores can sustain only a limited number of such cycles. Excited fluorescent molecules take part in chemical reactions that modify the fluorophore photochemically, which can lead to the irreversible loss of its ability to fluoresce. This phenomenon is called photobleaching [36]. Photobleaching is a natural occurrence that decreases the emitted fluorescence intensity over time. This has an immediate effect on image formation over time, affecting the whole image or individual elements, pixel by pixel. Under high-intensity illumination, the irreversible destruction of the excited fluorophore is the main factor limiting fluorescence detectability. The degree of photobleaching depends on both the duration and the intensity of exposure to the excitation light, Fig. 9. The very complex chemical mechanisms of the bleaching process are not well understood. Different photochemical reaction pathways seem to be responsible for photobleaching, including reactions between adjacent molecules. The main causes seem to involve photodynamic interactions between excited fluorophores and reactive oxygen species dissolved in the sample medium, particularly molecular oxygen in its triplet ground state. This means that exposing a fluorophore to a high illumination intensity in the presence of molecular oxygen causes irreversible changes that render the molecule no longer fluorescent.
Therefore, photobleaching rates depend both on the nature of the fluorophore itself and, to some extent, on the molecule’s environment. For example, fluorescein, one of the most widely used fluorescent molecules, excited at 488 nm with intensities of \(10^{23}-10^{24}\) photons cm\(^{-2}\) s\(^{-1}\), emits approximately \(3\,\times \, 10^{4}\) to \(4 \, \times \, 10^{4}\) photons before being irreversibly bleached. For linear excitation of fluorescein, the so-called one-photon excitation, 1PE, the photobleaching rate and the fluorescence intensity decay, measured as a function of the excitation power, increase with a slope of 1 [36]. Under constant absorption of light, the fluorescence intensity decays following an exponential law:

$$\begin{aligned} I(t)=I_{0}e^{-Kt} \end{aligned}$$
(8)

where \(I_{0}\) is the initial fluorescence intensity and K is the bleaching rate constant of the fluorophore. This fact has relevant consequences in quantitative optical microscopy [37]. Nevertheless, the photobleaching phenomenon can be exploited to understand diffusion mechanisms at the molecular level by evaluating the fluorescence recovery after photobleaching [38]. At the single-molecule level, one can take advantage of the ability to control the switching on and off of molecules [39]. Moreover, there are important families of fluorescent molecules that allow photoswitching, photoconversion and photoactivation [40]. This results in the ability to switch in a controlled way from a silent, dark state to a bright one, or to switch emission across a number of emission wavelengths [41].
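
The exponential law of Eq. (8) suggests a simple way to estimate the bleaching rate constant K from a time series of intensities: since \(\ln I(t)=\ln I_{0}-Kt\), a straight-line fit of the logarithm of the intensity against time gives K as the negative slope. The sketch below applies this to synthetic, noise-free data; the sampling times and the value of K are hypothetical, and a real measurement would also need background subtraction and a treatment of noise.

```python
import math


def bleaching_curve(i0, k, times):
    """Eq. (8): I(t) = I0 * exp(-K * t) sampled at the given times."""
    return [i0 * math.exp(-k * t) for t in times]


def fit_bleaching_rate(times, intensities):
    """Least-squares line through (t, ln I); returns (I0, K)."""
    logs = [math.log(v) for v in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# Hypothetical acquisition: one frame every 6 s, K = 0.05 s^-1.
times_s = [6.0 * i for i in range(10)]
K_TRUE = 0.05
I0_TRUE = 1000.0

frames = bleaching_curve(I0_TRUE, K_TRUE, times_s)
i0_fit, k_fit = fit_bleaching_rate(times_s, frames)
print(f"fitted I0 = {i0_fit:.1f}, fitted K = {k_fit:.4f} s^-1")
print(f"time to bleach to half intensity = {math.log(2.0) / k_fit:.1f} s")
```
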

Fig. 9
Photobleaching process of three probes bound to DNA (DAPI, blue), mitochondria (MitoTracker Red CMXRos, red), and actin filaments (BODIPY-FL phallacidin, green) in fixed bovine pulmonary artery endothelial (BPAE) cells. Two-photon excitation was provided by a Leica TCS SP2 AOBS confocal microscope coupled through the IR port to a Chameleon-XR (Coherent, Santa Clara, CA) ultrafast tunable titanium–sapphire laser. The pulse width was increased to picoseconds from 140 fs, at 760 nm wavelength and 20 mW average power at the entrance of the scanning head. The temporal series was acquired at a rate of 729 ms per image, every 6 s. Image dimensions are 512 \(\times \) 512 pixels over 80 \(\times \) 80 \(\upmu \mathrm{m}^{2}\). (Images collected by Paolo Bianchini, Department of Physics, University of Genoa.) Adapted from [36]

2.2 Two-photon excitation fluorescence

Another way of bringing a fluorescent molecule to the excited state is the two-photon excitation process. Two-photon excitation, 2PE, of fluorescent molecules, together with the higher orders summarised in the term multiphoton excitation, MPE, dates back to the end of the 1920s: it has its roots in the theory originally developed by Maria Göppert-Mayer in her Ph.D. thesis [42] and was experimentally demonstrated after the advent of laser sources in the 1960s [43]. Its possible use in microscopy was demonstrated in two seminal papers reporting the realization of a nonlinear scanning microscope. The one developed by Colin Sheppard in Oxford at the end of the 1970s [44] was followed at the beginning of the 1990s by the one developed in the Watt Webb laboratories by Winfried Denk, David Piston and colleagues [45]. The former anticipated the possibility of using 2PE in optical microscopy, while the latter reported for the first time a tangible and efficient use of 2PE to form images of biological specimens [46]. Two-photon excitation of fluorescent molecules is a nonlinear process related to the simultaneous absorption of two photons whose total energy equals the energy required for conventional single-photon excitation. Intuitively, as illustrated in Fig. 10, the “first” photon brings the molecule to an intermediate state K, and the “second” photon completes the excitation process, delivering the molecule to the final state. Because the intermediate state can be a superposition of molecular states rather than an eigenstate of the molecule, it is usually referred to as the virtual intermediate state. The two photons must therefore arrive simultaneously (within \(\sim 10^{-17}\,\mathrm{s}\)) for excitation to occur. The two photons can have wavelengths \(\lambda _{1}\) and \(\lambda _{2}\), subject to the constraint that

$$\begin{aligned} \lambda _{1p}\cong \left( \frac{1}{\lambda _{1}}+\frac{1}{\lambda _{2}}\right) ^{-1} \end{aligned}$$
(9)

where \(\lambda _{1p}\) is the wavelength needed to prime fluorescence emission in a one-photon absorption event. For practical reasons one uses \(\lambda _{1}=\lambda _{2}\), and the starting experimental choice is \(\lambda _{1}\cong 2\lambda _{1p}\) [47]. This means that a molecule that fluoresces at 420 nm, in the blue, when excited at 340 nm, in the ultraviolet region of the electromagnetic spectrum, can be excited at 680 nm, in the red, under two-photon excitation. These specific wavelengths are reported to emphasize the shift from the ultraviolet to the red region in the illumination of the specimen. For a biological sample, this means less overall phototoxicity and the possibility of penetrating deeper into tissues when forming an image [48].
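
Equation (9) is simply energy conservation written in terms of wavelengths, and a short numerical check may help. The sketch below reproduces the 340 nm and 680 nm example of the text and also evaluates a non-degenerate pair; the 800 nm and 1200 nm wavelengths are assumptions chosen only for illustration, and the function names are ours.

```python
def one_photon_equivalent_nm(lambda1_nm, lambda2_nm):
    """Eq. (9): lambda_1p = (1/lambda_1 + 1/lambda_2)^-1."""
    return 1.0 / (1.0 / lambda1_nm + 1.0 / lambda2_nm)


def degenerate_two_photon_nm(lambda_1p_nm):
    """Starting choice for lambda_1 = lambda_2: twice the 1PE wavelength."""
    return 2.0 * lambda_1p_nm


# Example from the text: a dye excited at 340 nm under 1PE.
print(degenerate_two_photon_nm(340.0))          # 2PE wavelength in nm
print(one_photon_equivalent_nm(680.0, 680.0))   # back to the 1PE wavelength

# Assumed non-degenerate pair, for illustration only.
print(one_photon_equivalent_nm(800.0, 1200.0))
```

In practice the optimal 2PE wavelength often departs from exactly twice the 1PE value, which is why the text refers to \(2\lambda _{1p}\) as a starting experimental choice.
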

Fig. 10
1PE (a) and 2PE (b) absorption and fluorescence processes using a Perrin–Jablonski diagram

Now, referring to Eq. (7) related to the 1PE case, we can introduce the 2PE molecular cross-section as the propensity of a fluorescent molecule to absorb simultaneously two photons having certain energy or wavelength. Moreover, as the collisions of two or more photons with the very same molecule can be considered as statistically independent events, 2PE is a process that has a quadratic dependence on the instantaneous intensity of the excitation.

Now, even if the 2PE transition consists of contributions from all eigenstates as possible intermediate states, the single intermediate state approximation can be used to give an estimate of the order of magnitude of the two-photon absorption cross-section \(\sigma _{2}.\)

$$\begin{aligned} \sigma _{2}=\sigma _{ij}\sigma _{jf}\tau _{j} \end{aligned}$$
(10)

where \(\sigma _{ij}\) and \(\sigma _{jf}\) are the appropriate one-photon cross-sections for the transitions between the initial, i, intermediate, j, and final, f, states, and \(\tau _{j}\) is the intermediate-state lifetime, which sets the timescale of the process.

As a first approximation one can use the values reported for the fluorescence 1PE process. The resulting two-photon absorption cross-section value is \(10^{-49}\,\mathrm{cm}^{4}\,\mathrm{s}/{\text {photon}}\). Such an estimate can be extended to the MPE case scaling Eq. (10) obtaining values of \(10^{-83}\) and \(10^{-115}\) for the 3PE and 4PE cases, respectively, in the scaled dimensions.
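
The order of magnitude quoted above can be checked directly from Eq. (10). The sketch below assumes that both one-photon cross-sections take the 1PE value of \(10^{-16}\,\mathrm{cm}^{2}\) and that the intermediate-state lifetime equals the \(\sim 10^{-17}\,\mathrm{s}\) simultaneity window mentioned earlier; these inputs are our reading of the estimate. The conversion to Göppert-Mayer units (1 GM = \(10^{-50}\,\mathrm{cm}^{4}\,\mathrm{s}/{\text {photon}}\)) is added for convenience and is not used in the text.

```python
def two_photon_cross_section(sigma_ij_cm2, sigma_jf_cm2, tau_j_s):
    """Eq. (10): sigma_2 = sigma_ij * sigma_jf * tau_j, in cm^4 s / photon."""
    return sigma_ij_cm2 * sigma_jf_cm2 * tau_j_s


GM = 1e-50  # 1 Goeppert-Mayer unit in cm^4 s / photon

# Assumed inputs: 1PE cross-sections from the text, 1e-17 s virtual lifetime.
TAU_J_S = 1e-17

for sigma in (1e-16, 1e-17):
    sigma_2 = two_photon_cross_section(sigma, sigma, TAU_J_S)
    print(f"sigma_1 = {sigma:.0e} cm^2 -> sigma_2 = {sigma_2:.0e} cm^4 s/photon"
          f" = {sigma_2 / GM:.1f} GM")
```
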

The fs timescale required for simultaneity lies within the temporal window of molecular energy fluctuations at photon energy scales. To get an idea of the rarity of the event, consider that in bright daylight a good 1PE or 2PE fluorescent molecule absorbs a photon through a one-photon interaction about once a second, and a photon pair through a simultaneous two-photon interaction once every 10 million years. In practice, this means that a very high density of photons has to be delivered to the biological specimen under low-perturbation conditions. Typical photon flux densities are of the order of \(>~10^{24}\) photons \(\mathrm{cm}^{-2} \, \mathrm{s}^{-1}\), which implies intensities in the MW–TW \(\mathrm{cm}^{-2}\) range. To reach the necessary photon fluxes without damaging the sample, the most widely used light sources are ultrafast pulsed lasers, e.g. an ultrafast Ti:sapphire laser with a typical repetition frequency of 80 MHz and a pulse width of about a hundred fs. Once the molecule reaches the excited state, its relaxation to the ground state follows the same rules discussed for 1PE. This non-linear way of bringing the molecule to the excited state has peculiar and extremely interesting consequences for the design and use of a 2PE microscope [49].
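
The advantage of pulsed sources can be made quantitative with a back-of-the-envelope sketch. Because 2PE depends on the square of the instantaneous intensity, concentrating a given average power into short pulses raises the time-averaged 2PE signal by roughly the inverse of the duty cycle. The code below assumes idealized rectangular pulses, a wavelength of 800 nm and an average power of 10 mW; these values and the function names are illustrative assumptions, while the 80 MHz repetition rate and 100 fs pulse width come from the text.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J s
C_LIGHT = 2.99792458e8      # speed of light, m / s


def photon_energy_j(wavelength_nm):
    """Energy of one photon, E = h * c / lambda."""
    return H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9)


def intensity_w_cm2(photon_flux_cm2_s, wavelength_nm):
    """Convert a photon flux density (photons cm^-2 s^-1) to W cm^-2."""
    return photon_flux_cm2_s * photon_energy_j(wavelength_nm)


def duty_cycle(pulse_width_s, repetition_rate_hz):
    """Fraction of time the laser is emitting (rectangular pulses)."""
    return pulse_width_s * repetition_rate_hz


def peak_power_w(average_power_w, pulse_width_s, repetition_rate_hz):
    """Peak power of a rectangular pulse train with the given average power."""
    return average_power_w / duty_cycle(pulse_width_s, repetition_rate_hz)


def two_photon_gain(pulse_width_s, repetition_rate_hz):
    """Time-averaged 2PE signal relative to a continuous beam of equal power."""
    return 1.0 / duty_cycle(pulse_width_s, repetition_rate_hz)


PULSE_WIDTH_S = 100e-15   # about a hundred fs, as in the text
REP_RATE_HZ = 80e6        # 80 MHz, as in the text
WAVELENGTH_NM = 800.0     # assumed Ti:sapphire wavelength
AVERAGE_POWER_W = 10e-3   # assumed average power at the sample

print(f"1e24 photons cm^-2 s^-1 = {intensity_w_cm2(1e24, WAVELENGTH_NM):.2e} W cm^-2")
print(f"duty cycle  = {duty_cycle(PULSE_WIDTH_S, REP_RATE_HZ):.1e}")
print(f"peak power  = {peak_power_w(AVERAGE_POWER_W, PULSE_WIDTH_S, REP_RATE_HZ):.0f} W")
print(f"2PE gain    = {two_photon_gain(PULSE_WIDTH_S, REP_RATE_HZ):.2e}")
```

Real pulses are not rectangular, so the gain is reduced by a shape-dependent factor of order one, but the order of magnitude explains why average powers of a few mW are sufficient.
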

2.3 Fluorescence resonant energy transfer (FRET)

The fundamental aspiration to obtain information at the molecular level is an engine, a real motor, for the development of optical microscopy approaches. When dealing with fluorescent molecules, it is comparatively easy and immediate to take advantage of the physical process of resonant energy transfer, RET, which is implemented as FRET in fluorescence, F, optical microscopy. This physical process is also known as “Förster resonance energy transfer” in honour of Theodor Förster (1910–1974), who first developed the quantum theory of singlet-singlet energy transfer [50].

The quantum-mechanical process of FRET indicates that the energy of a fluorophore in an excited state can be transferred to the ground state of a fluorophore or chromophore located in the close vicinity. The former fluorophore is called the donor and the latter acceptor. Now, this non-radiative energy transfer, due to dipole-dipole interactions, produces quenching of the donor fluorescence emission and shortening of the donor fluorescence lifetime [29].

The energy transfer is reasonably detectable when at least three conditions occur, namely: (1) the emission spectrum of the donor significantly overlaps the excitation spectrum of the acceptor; (2) donor and acceptor are within 1–10 nm of each other; (3) donor and acceptor transition dipole moments are properly aligned with respect to one another, Fig. 11. The FRET efficiency \(E_{F}\) can be expressed both in terms of the distance between donor and acceptor and in terms of the donor lifetime:

$$\begin{aligned} E_{F}=\frac{R_{0}^{6}}{(R_{0}^{6}+r^{6})} \end{aligned}$$
(11)

and

$$\begin{aligned} E_{F}=1-(\frac{\tau _{da}}{\tau _{d}}) \end{aligned}$$
(12)

where r is the distance between the donor and the acceptor, and \(\tau _{da}\) and \(\tau _{d}\) are the donor lifetimes in the presence and in the absence of the acceptor, respectively. The Förster radius, \(R_{0}\), is the distance at which half the excitation energy of the donor is transferred to the acceptor, i.e., the FRET efficiency is \(E_{F}=0.5\). For \(r= R_{0}\), the energy transfer rate equals the sum of the rates of the other relaxation processes of the donor. According to Förster theory [29],

$$\begin{aligned} R_{0}=0.211\left\{ K^{2}n^{-4}Q_{D}J\left( \lambda \right) \right\} ^{\frac{1}{6}} \end{aligned}$$
(13)

where \(Q_{D}\) is the quantum yield of the donor in the absence of the acceptor, \(K^{2}\) is the orientation factor describing the relative orientation between the donor and acceptor dipoles, n is the refractive index of the medium, and \(J\left( \lambda \right) \) is the overlap integral that accounts for the degree of spectral overlap between the donor emission and the acceptor absorption spectra.
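As a worked illustration of Eqs. (11) and (12), the following minimal sketch evaluates the FRET efficiency from a distance or from a pair of donor lifetimes, and inverts Eq. (11) to obtain a distance. The function names and the numerical values (a Förster radius of 5 nm, lifetimes of 1.5 and 3.0 ns) are hypothetical and chosen only for the example.

```python
def fret_efficiency_from_distance(r, r0):
    """Eq. (11): efficiency for donor-acceptor distance r and Foerster radius r0."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)


def fret_efficiency_from_lifetimes(tau_da, tau_d):
    """Eq. (12): efficiency from donor lifetimes with (tau_da) and without (tau_d) acceptor."""
    return 1.0 - tau_da / tau_d


def distance_from_efficiency(efficiency, r0):
    """Invert Eq. (11); valid for 0 < efficiency < 1."""
    return r0 * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)


r0_nm = 5.0                                            # hypothetical Foerster radius
e_lifetime = fret_efficiency_from_lifetimes(1.5, 3.0)  # hypothetical lifetimes in ns
r_nm = distance_from_efficiency(e_lifetime, r0_nm)

print(f"E_F from lifetimes: {e_lifetime:.2f}, inferred distance: {r_nm:.2f} nm")
for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm -> E_F = {fret_efficiency_from_distance(r, r0_nm):.3f}")
```

The sixth-power dependence makes the efficiency drop steeply around \(R_{0}\), which is what turns FRET into a sensitive ruler in the range of a few nanometres.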

As a result of the energy transfer process, the excited acceptor may relax to the ground state by fluorescence emission. As a consequence, a FRET efficiency map reports on distances between molecules of the order of \(R_{0}\) (a few nanometres), Fig. 12. Moreover, in a dynamic situation, the emission switches between donor and acceptor over time as their mutual distance changes. FRET imaging circumvents the physical limitations imposed by diffraction by introducing the additional data set provided by the knowledge of mutual fluorophore interactions.

Fig. 11

The efficiency of the FRET process (a) has a strong dependence on the molecular distance r and on the Förster radius \(R_{0}\). The overlap of the emission spectra (b), the mutual orientation of the fluorophores (c) and the value of the Förster radius (d) are peculiar properties of the fluorescent molecules that influence the resonant energy transfer phenomenon

Fig. 12

Optical map of molecular distances in situ for chromatin DNA using FRET. A pair of DNA binding dyes, Hoechst 33342 and Syto 13, can be used as a FRET system to map chromatin compaction within live cell nuclei. Colours make it possible to encode and decode distances using a corrected FRET ruler \(A_{0}^{*}\). We defined \(A_{0}=E_{0}/(1-E_{0})=A_{0}^{*}/(\beta _{D}/\beta _{A})\), where E is the FRET efficiency and \(\beta \) is the brightness of the respective fluorophores [51]

3 Spatial resolution

The term spatial resolution indicates the theoretical and experimental limitations, in this specific case, of the optical microscope in studying in detail the organization of matter using a lens and light to prime the light-matter interaction and to record the effects of such interaction. The process is analogous to the one used by our eyes to get information about the real world illuminated by sunlight. However, resolution is not a fundamental limit. The limit is set by concepts of information theory [20]. The well-known Abbe resolution limit is a sharp limit, Eq. (1), to the imaging of a periodic object such as a grating. Optical nanoscopy demonstrates far-field optical resolution beyond the Abbe limit in a fluorescence-based mechanism of image formation. This implies that the diffraction limit does not represent a fundamental limit to optical resolution. Today, it is a matter of fact that subwavelength information can be retrieved from the far field of an optical image, even though diffraction per se usually prevents the squeezing of light to a dimension smaller than the wavelength used. The modern approach to imaging resolution, since the seminal papers by Toraldo di Francia, Lukosz and Sheppard, combines the physics of wave phenomena and the methods of information theory [12, 16, 19]. The presence of noise in the image formation system, the number of degrees of freedom and the modelling of the information channel have an essentially information-theoretical nature, linked to Shannon’s theory of information (in linear systems). In the next paragraphs, we will discuss some popular implementations of the optical microscope that, due to the different ways they are designed, offer variations on the theme under the very same common denominator of using lenses and light. These are confocal microscopy, based on the rejection of some information using a point-by-point illumination and detection scanning strategy, and two-photon excitation microscopy, which obtains the confocal localization effect by exploiting a different interaction with the imaged probe. It is worth noting that such approaches, operating under point scanning conditions, answer the question posed by Lukosz [16]: what would you sacrifice to improve resolution? Point-by-point scanning approaches sacrifice the field of view during the interrogation of the sample. In the Toraldo di Francia view [12], the additional information is given by the position of the scanning beam, known with a precision higher than the diffraction limit. Another similar case is treated in the “correlative nanoscopy” section. The spatial resolution of the atomic force microscope is determined by the physical size of the probe and by the additional information on the position of the tip [52]. One can see all this in terms of degrees of freedom and information channel [19]. However, for our purposes here, as a kind of reference, we consider the classical wide-field optical microscope. In principle, it is the only one for which it is classically possible to use the term resolution, since the other methods create a sort of “delay” in the collection of the data that form the final image. Super-resolution refers to overcoming this resolution limit in the different ways we will discuss in the “optical nanoscopy” section. However, there are many ways of specifying resolution [53]. Figure 13 reports the differences among the most common criteria. Again, when one considers the performances of the microscope in terms of an information channel in the spatial frequency domain, the optical transfer function (OTF) of the system is used within the classical model of the microscope as a space-invariant linear system, Fig. 14.

Fig. 13

Cartoon representation of the most popular spatial resolution criteria. The top row shows the intensity profile of two pointlike light sources as collected by a lens. The bottom row shows the different criteria for the spatial resolution distance. The Rayleigh limit is more relaxed than the others. Differences are of the order of approximately 20\(\%\) of the collected intensity

Fig. 14

Top: high (green) and low (blue) spatial frequency wave. Bottom: corresponding Fourier transform of the low (1) and high (1/ \(\Delta \) min) frequency wave. Modified from Nobelprize.org, Nobel in Chemistry 2014, webpage

Here we summarize the results that one obtains using the main resolution criteria for the fluorescence case [54], namely:

(1) Cut-off of transfer function. This is the basis of Abbe’s theory. Disadvantage: the strength of the transfer function is also important. For confocal, the bandwidth is doubled, but the resolution is not doubled. Considering the ratio among resolution distances d as in Eqs. 1 and 2, which is the same for the transverse and axial directions, for the wide-field, confocal and two-photon cases one gets 1:0.5:1. The two-photon case assumes the same emission wavelength. The two-photon value is twice the confocal one, as the wavelength is usually twice as long.

(2) Two-point resolution. This is the basis of the Rayleigh resolution criterion. Disadvantage: difficult to calculate in many cases. The ratio of distances, the same for transverse or axial, gives 1:0.760:1.520, in the order given before.

(3) Full width half maximum (FWHM) of point spread function (PSF). This is easy to calculate. The transverse FWHM of the PSF results in the following approximate values for the ratio of distances: \(1{:}0.707=\frac{1}{\sqrt{2}}{:}1.414=\sqrt{2}\). This is only approximate, as the shape of the PSF is different for confocal; the exact figures for paraxial, scalar theory are the following for the different optical cases:

Conventional fluorescence:

$$\begin{aligned} d=\frac{0.514\lambda }{n\sin \alpha } \end{aligned}$$
(14)

Confocal fluorescence (small pinhole, no Stokes shift):

$$\begin{aligned} d=\frac{0.370\lambda }{n\sin \alpha } \end{aligned}$$
(15)

Two-photon fluorescence (same emission wavelength):

$$\begin{aligned} d=\frac{0.739\lambda }{n\sin \alpha } \end{aligned}$$
(16)

So, in terms of the ratio of distances, one gets 1:0.720:1.440. For \(\sin \alpha =1\), for the conventional case, \(d=0.514\lambda /n\), close to \(d=\lambda /2n\), as given by Fraunhofer in 1821. The axial case, FWHM for PSF, is a bit more complicated. In the transverse case, the cut off of the OTF, for fluorescence goes as \(4n\sin ^{2}\left( \alpha /2\right) \). For small \(\alpha \) this becomes \(n\sin ^{2}\alpha \) [55]. Now, for \(\alpha =\pi /2\), it becomes 2n. This complication doesn’t affect the ratio of the distances, only the dependence on \(\alpha \). The ratio of distances for the axial case turns out to be the same as for transverse, 1:0.720:1.440. For small numerical apertures, NAs, \(4\sin ^{2}(\alpha /2)\approx \sin ^{2}\alpha \):

Conventional fluorescence:

$$\begin{aligned} d=\frac{1.77\lambda }{n\left( 4\sin ^{2}\left( \alpha /2\right) \right) } \end{aligned}$$
(17)

Confocal fluorescence:

$$\begin{aligned} d=\frac{1.28\lambda }{n\left( 4\sin ^{2}\left( \alpha /2\right) \right) } \end{aligned}$$
(18)

Two-photon fluorescence:

$$\begin{aligned} d=\frac{2.55\lambda }{n\left( 4\sin ^{2}\left( \alpha /2\right) \right) } \end{aligned}$$
(19)

For high numerical apertures, \(2\sin ^{2}(\alpha /2)\approx 1\):

Conventional fluorescence:

$$\begin{aligned} d=\frac{0.866\lambda }{n\left( 2\sin ^{2}\left( \alpha /2\right) \right) }\approx \frac{0.866\lambda }{n} \end{aligned}$$
(20)

Confocal fluorescence:

$$\begin{aligned} d=\frac{0.638\lambda }{n\left( 2\sin ^{2}\left( \alpha /2\right) \right) }\approx \frac{0.638\lambda }{n} \end{aligned}$$
(21)

Two-photon fluorescence:

$$\begin{aligned} d=\frac{1.276\lambda }{n\left( 2\sin ^{2}\left( \alpha /2\right) \right) }\approx \frac{1.276\lambda }{n} \end{aligned}$$
(22)
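The FWHM relationships of Eqs. (14)–(19) can be evaluated numerically as in the following minimal sketch. The chosen wavelength (520 nm), refractive index (1.518) and numerical aperture (1.4) are hypothetical example values, and the helper names are ours; the coefficients are those reported in the equations above.

```python
import math

# FWHM coefficients taken from Eqs. (14)-(16) (transverse) and (17)-(19) (axial).
TRANSVERSE_COEFF = {"wide-field": 0.514, "confocal": 0.370, "two-photon": 0.739}
AXIAL_COEFF = {"wide-field": 1.77, "confocal": 1.28, "two-photon": 2.55}


def transverse_fwhm(mode, wavelength, n, alpha):
    """Transverse FWHM of the PSF, d = c * lambda / (n sin(alpha))."""
    return TRANSVERSE_COEFF[mode] * wavelength / (n * math.sin(alpha))


def axial_fwhm(mode, wavelength, n, alpha):
    """Axial FWHM of the PSF, d = c * lambda / (n * 4 sin^2(alpha / 2))."""
    return AXIAL_COEFF[mode] * wavelength / (n * 4.0 * math.sin(alpha / 2.0) ** 2)


# Hypothetical example: oil-immersion objective, green emission.
wavelength_nm = 520.0
n_medium = 1.518
numerical_aperture = 1.4
alpha = math.asin(numerical_aperture / n_medium)   # semi-angular aperture

for mode in ("wide-field", "confocal", "two-photon"):
    d_xy = transverse_fwhm(mode, wavelength_nm, n_medium, alpha)
    d_z = axial_fwhm(mode, wavelength_nm, n_medium, alpha)
    print(f"{mode:>10}: transverse {d_xy:6.1f} nm, axial {d_z:6.1f} nm")
```

Since the three modalities share the same functional dependence on \(\alpha \), the ratios between them do not depend on the objective, only on the coefficients.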

3.1 Fourier ring correlation

The quantitative evaluation of the spatial resolution achieved with the optical microscope is a fundamental step in deciphering what an image tells us. The relationships reported in the previous section provide an important starting point for designing an experiment that utilizes an optical microscope to answer a scientific question. At the experimental stage, the spatial resolution value is usually evaluated with the methods described below, each of which has some critical issues. An image formation model [56] or an analytical formulation [55] of the process that takes into account physical limitations, optical aberrations/distortions, noise and sample conditions is hard to derive, which makes the comparison of different images difficult [56]. As we have discussed in the previous section, there are different ways to consider the resolution issue.

One can consider the image formation process as a linear and space-invariant transformation whose impulse response, i.e. the response to a Dirac delta input, is the point spread function (PSF) of the system [57]. Within such a linear and space-invariant model, the knowledge of the point spread function completely characterizes the imaging system. Unfortunately, such an approach does not take into account possible misalignments that affect the formed image overall. More recently, the use of a calibration sample like DNA origami [58] has made available a tunable calibration ruler down to the nanoscale, which takes into account both optical aberrations/distortions and sample conditions related to the photophysical properties of the fluorescent probe being used. However, imaging such a ruler does not coincide with the imaging process of the real sample.

Another way of evaluating spatial resolution consists of measuring the intensity distribution along a line profile crossing sub-resolution structures, such as pointlike objects or thin filaments. From the intensity profile analysis, one can infer the spatial resolution. This is the most popular approach because of its simplicity, even if it does not match the definition of spatial resolution as the minimum distance between two point structures that can be resolved, and it is hampered by the not completely objective choice of the line profile to be examined.

Considering the previously discussed methods and their possible drawbacks, a robust and quantitative solution can be found using the Fourier ring correlation (FRC) method [59]. FRC analysis is based on the direct evaluation of the spatial cut-off frequency of the acquired images. Ideally, a sample, from coarse to fine features, contains an infinite number of spatial frequencies, and the microscope performs as a low-pass filter whose cut-off frequency is the ultimate resolution it can achieve. Therefore, FRC allows estimating the effective spatial resolution of an imaging system without the need for any a priori information, such as an analytical model, while taking into account all optical conditions (aberrations, distortions and misalignments), noise conditions and sample conditions. It does not require a calibration sample or the definition of an arbitrary line profile in specific regions of the image.

Fourier ring correlation measures the normalized cross-correlation coefficient between two images over corresponding rings \((q_{i})\) in the Fourier space. FRC is a function of the spatial frequency \((q_{i})\) and is calculated as follows:

$$\begin{aligned} FRC\left( q_{i}\right) =\frac{\sum _{f_{x},f_{y}\in q_{i}}G_{1}\left( f_{x},f_{y}\right) \cdot G_{2}^{*}\left( f_{x},f_{y}\right) }{\sqrt{\sum _{f_{x},f_{y}\in q_{i}}\vert G_{1}\left( f_{x},f_{y}\right) \vert ^{2}\sum _{f_{x},f_{y}\in q_{i}}\vert G_{2}\left( f_{x},f_{y}\right) \vert ^{2}}} \end{aligned}$$
(23)

where \(G_{1}\) and \(G_{2}\) are the discrete Fourier transforms of the two images \(g_{1}\) and \(g_{2}\), and \(f_{x}\) and \(f_{y}\) are the spatial frequency coordinates. This requires two “identical” but statistically independent images of the same sample and a careful analysis of the noise threshold. Figure 15 shows the FRC procedure to get a quantitative value of the cut-off frequency in four steps: (1) two identical and statistically independent images are collected, (2) their two-dimensional Fourier transforms are calculated, (3) correlation analysis is used for evaluating FRC(q), (4) the plot of FRC(q) allows the cut-off frequency to be evaluated by intercepting the noise threshold line. The inverse of the resulting value gives the effective spatial resolution attainable by the image formation system being used, including all the actors, from optics to sample characteristics. In order to obtain two identical images under the condition of independent noise realizations, some practical approaches can be carried out, namely [60]: (a) frame-based acquisition: two consecutive measurements are performed, and a drift correction is applied; (b) line-based acquisition: every line is raster-scanned twice, and the two different images are formed by considering even and odd lines; (c) pixel-based acquisition: the pixel dwell-time of every pixel is split into two temporal windows having the same duration to obtain two independent images at the end of a single scan. It is worth noting that in the case of time-dependent phenomena, such as diffusion, it is possible to isolate “fake” spatial frequencies. This aspect mainly applies to case (a). In general, the FRC approach is a robust and effective way to determine the frequency content of the collected data set used to form an image.
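The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration of Eq. (23) under stated assumptions, not the implementation used in [59, 60]: the ring width, the function names and, in particular, the fixed 1/7 threshold (a choice often found in the FRC literature, used here in place of a proper noise-threshold analysis) are our own assumptions.

```python
import numpy as np


def fourier_ring_correlation(g1, g2, n_rings=None):
    """Return ring-centre frequencies (cycles/pixel) and FRC values, Eq. (23)."""
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    if g1.ndim != 2 or g1.shape != g2.shape:
        raise ValueError("g1 and g2 must be 2D arrays of identical shape")
    ny, nx = g1.shape
    if n_rings is None:
        n_rings = min(ny, nx) // 2

    # Step 2: two-dimensional Fourier transforms of the two images.
    G1 = np.fft.fft2(g1)
    G2 = np.fft.fft2(g2)

    # Radial spatial frequency of every Fourier pixel, in cycles per pixel.
    fy = np.fft.fftfreq(ny)[:, None]
    fx = np.fft.fftfreq(nx)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2).ravel()

    # Assign each pixel to a ring; frequencies at or beyond 0.5 are discarded.
    edges = np.linspace(0.0, 0.5, n_rings + 1)
    ring = np.digitize(radius, edges) - 1
    keep = ring < n_rings

    # Step 3: ring-wise sums of the numerator and denominator terms.
    cross = (G1 * np.conj(G2)).real.ravel()
    power1 = (np.abs(G1) ** 2).ravel()
    power2 = (np.abs(G2) ** 2).ravel()
    num = np.bincount(ring[keep], weights=cross[keep], minlength=n_rings)
    den1 = np.bincount(ring[keep], weights=power1[keep], minlength=n_rings)
    den2 = np.bincount(ring[keep], weights=power2[keep], minlength=n_rings)
    den = np.sqrt(den1 * den2)
    frc = np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    q = 0.5 * (edges[:-1] + edges[1:])
    return q, frc


def frc_resolution(q, frc, threshold=1.0 / 7.0):
    """Step 4: resolution in pixels at the first ring below the (assumed) threshold."""
    below = np.nonzero(frc < threshold)[0]
    if below.size == 0:
        return None          # the curve never drops below the threshold
    return 1.0 / q[below[0]]
```

Calling `q, frc = fourier_ring_correlation(g1, g2)` followed by `frc_resolution(q, frc)` returns the effective resolution in pixels; multiplying by the pixel size gives the value in physical units.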

Fig. 15

Two statistically independent images to feed the Fourier ring correlation algorithm. The effective cut-off frequency is at the intercept with the threshold noise providing the global spatial resolution

4 Advanced optical microscopy

The modern optical microscope has to respond to an increasing demand for super-resolution and three-dimensional, 3D, access to the spatial information. 3D examination at the molecular level of biological structures in a hydrated state allows preserving physiological states while forming images that are distinctive and exclusive when compared with other high-resolution techniques using light or other probes [61]. This permits the study of the complex and delicate relationships existing between structure and function in biological systems. One relevant step, in terms of progress in 3-D optical microscopy, was the invention of the confocal microscope in its different solutions following the brilliant idea of confocality implemented by Naora and colleagues to reduce background noise in a spectroscopic set-up designed for fluorescence studies [62]. Later, Minsky, in 1957, invented and patented a confocal microscope identical in concept to the one extensively developed by Egger and Davidovits at Yale, by Sheppard and Wilson at Oxford, and by Brakenhoff et al. in Amsterdam [53]. It was in the mid-1970s, with the advent of affordable computers and lasers, and the development of digital image processing software, that confocal laser scanning microscopes entered scientific research, mainly applied to biological and material specimens [63]. However, two other steps were fundamental for the development of advanced optical microscopy approaches, namely: technological advances in scanning systems and the implementation of digital image processing algorithms towards computational optical sectioning microscopy and image restoration applications. The former has its seminal paper in a work published in 1974 by Quate and Lemons reporting on the development of a mechanically scanned acoustic microscope achieving \(10\,\upmu \mathrm{m}\) spatial resolution. Using a single-surface lens, an acoustic beam is focused with negligible spherical aberration in a water-embedded cell. The image is formed by mechanically scanning the specimen through this focused beam following a raster pattern. Transmitted power is detected by means of a piezoelectric transducer, and this signal modulates the synchronized raster formation of the image on a display. Piezoelectric detection is the key to getting a sensitivity of \(10^{-8}~\mathrm{W}/\mathrm{cm}^{2}\), yielding images of excellent clarity and contrast [64]. Such an approach is the core of confocal and two-photon laser scanning microscopy and has been exported to scanning probe microscopy methods like scanning tunnelling and atomic force microscopy. The latter, computational optical sectioning microscopy, is based on advances in digital image processing that had their “killer realization” in a paper by Agard and Sedat [65] dealing with the three-dimensional (3D) chromosome topography in an intact nucleus, determined using fluorescently stained Drosophila polytene chromosomes, optical fluorescence microscopy and a newly developed, generally applicable, cellular image reconstruction algorithm based on the best estimation of the plane-by-plane intensity distribution calculated from an optically sectioned series of images of the specimen [66].

4.1 Three-dimensional optical sectioning

It is a matter of fact that the possibility of a 3D reconstruction of an object, starting from the acquisition of two-dimensional, 2-D, datasets made of consecutive optical slices, is challenging and powerful for morphological analysis and volume rendering. The gentle procedure of optical sectioning, which gets information from different planes of the specimen without being invasive, allows preserving structures and functions. A wide-field microscope, Fig. 16, can be used to collect a series of 2D images of the specimen by moving the focal plane across the 3D specimen. A set of 2D images is acquired at various focus positions along the z-axis, Fig. 17. The result of such a collection can be arranged in separate views or in a single view that considers all the information collected in the 3D volume, Fig. 18. The blurred image is due to the fact that the fluorescent molecules of the specimen undergo full excitation at every instant, so that in-focus and out-of-focus contributions overlap, worsening axial resolution and producing the typical haze in the collected images that, together with light-diffraction effects, limits the instrument performances [67]. So far, the 3D shape of the specimen is buried in a kind of sea of defocused images that hamper the analysis of structural motifs. Under certain conditions, one can recover the 3D shape of the object [68]. The observed image o(x, y, z), produced by the true intensity distribution i(x, y, z), is corrupted by the characteristic point spread function of the image formation system, s(x, y, z), by defocused information coming from adjacent planes and, to a first approximation, by additive noise stemming from different sources, n(x, y, z). At a certain plane of focus \(z_{0}\) within the sample or, with optical sectioning along the z-axis, at a plane j over N, the observed image of a 3D specimen can be regarded as:

$$\begin{aligned} o\left( x,y,z\right) _{j}=i\left( x,y,z\right) _{j}\otimes s\left( x,y,z\right) _{j}+\sum _{k\ne j}i\left( x,y,z\right) _{k}\otimes s\left( x,y,z\right) _{k}+n\left( x,y,z\right) \end{aligned}$$
(24)

where the subscripts on i(x, y, z), o(x, y, z) and s(x, y, z) refer to the discretized z planes: j is the plane in focus, and the k planes are all the others, out of focus. Equation (24) is usually transferred to the Fourier frequency domain, where the convolution operator, \(\otimes \), becomes an algebraic multiplication. Image restoration algorithms, i.e. deconvolution, are used to calculate the best estimate of i(x, y, z) plane by plane, Fig. 19. Some further corrections are performed, accounting for axial distortion phenomena linked to the significant refractive index mismatches due to sample thickness, which become relevant for objects larger than \(50\,\upmu \mathrm{m}\) [69].
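To make the role of the Fourier domain concrete, the following minimal sketch simulates only the in-focus term of Eq. (24) plus additive noise for a single 2D plane, and then applies a simple regularized (Wiener-type) inverse filter. The out-of-focus sum is deliberately omitted, the Gaussian PSF is a stand-in for a real measured or computed PSF, and the regularization constant `eps` is an arbitrary assumption; this is not the restoration algorithm of [65, 66, 69].

```python
import numpy as np


def gaussian_psf(shape, sigma):
    """Normalized Gaussian PSF centred on pixel (0, 0) with wrap-around (assumed model)."""
    ny, nx = shape
    y = np.fft.fftfreq(ny) * ny          # pixel offsets 0, 1, ..., -2, -1
    x = np.fft.fftfreq(nx) * nx
    yy, xx = np.meshgrid(y, x, indexing="ij")
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()


def observe(obj, psf, noise_sigma=0.0, rng=None):
    """In-focus term of Eq. (24): circular convolution via FFT, plus additive noise."""
    blurred = np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(psf)))
    if noise_sigma > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        blurred = blurred + rng.normal(scale=noise_sigma, size=blurred.shape)
    return blurred


def wiener_restore(observed, psf, eps=1e-3):
    """Regularized inverse filter: I_hat = O * conj(S) / (|S|^2 + eps)."""
    S = np.fft.fft2(psf)
    O = np.fft.fft2(observed)
    return np.real(np.fft.ifft2(O * np.conj(S) / (np.abs(S) ** 2 + eps)))


# Hypothetical test object: two point emitters four pixels apart.
obj = np.zeros((32, 32))
obj[10, 10] = 1.0
obj[10, 14] = 1.0
psf = gaussian_psf(obj.shape, sigma=1.5)
observed = observe(obj, psf)
restored = wiener_restore(observed, psf)
print("blurred error :", np.sum((observed - obj) ** 2))
print("restored error:", np.sum((restored - obj) ** 2))
```

The regularization term prevents the division from amplifying frequencies where the transfer function is close to zero; with noisy data a larger `eps` would typically be needed.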

Fig. 16

Scheme of a classical wide-field fluorescence optical microscope

Fig. 17

Optical sectioning is referred to a starting in-focus image plane, j, within a set of out-of-focus planes, k. When the focus of the lens is placed in the plane j, blurred information can come from adjacent planes. Such a contamination can be removed by image processing, confocal imaging or two-photon excitation microscopy

Fig. 18

Images representative of the collection of optically sectioned data from two differently organized clusters of fluorescent molecules. Discerning details requires some image processing [63]. Image courtesy of Hans van der Voort, Huygens Software, Scientific Volume Imaging, NL

Fig. 19

Image restoration of the blurred clusters of Fig. 18 [69] enables the visualization of organizational motifs hidden before. Image courtesy of Hans van der Voort, Huygens Software, Scientific Volume Imaging, NL

The solution to the problem is simplified when using confocal laser scanning or two-photon excitation microscopy [49]. This is due to the cancellation of the defocused contributions arising from the adjacent k planes taken into account in Eq. (24). Such cancellations are achieved for completely different reasons. In the case of the confocal approach, there is a physical rejection of the out-of-focus signal before it can reach any detector. When using two-photon excitation, the out-of-focus signal is simply absent.

So Eq. (24) becomes:

$$\begin{aligned} o\left( x,y,z\right) _{j}=i\left( x,y,z\right) _{j}\otimes s\left( x,y,z\right) _{j}+n\left( x,y,z\right) \end{aligned}$$
(25)

The price to pay for such an improvement in the acquisition scheme is that both methods rely on a point scanning interrogation of the sample. This simply means that the image is formed only after the whole scan is completed, which results in a longer time needed to form the image with respect to the wide-field condition. Increasing the scanning speed has the drawback of reducing the number of photons collected, i.e. of worsening the signal-to-noise ratio.
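The trade-off between scanning speed and signal-to-noise ratio can be quantified with a minimal sketch, assuming a purely shot-noise-limited (Poisson) detection in which the signal-to-noise ratio equals the square root of the number of detected photons. The frame size, dwell times and detected photon rate below are hypothetical values chosen only for illustration.

```python
import math


def frame_time(n_pixels, dwell_time_s):
    """Time needed to scan one frame point by point (overheads neglected)."""
    return n_pixels * dwell_time_s


def shot_noise_snr(photon_rate_hz, dwell_time_s):
    """SNR of a pixel under the shot-noise assumption: sqrt(detected photons)."""
    return math.sqrt(photon_rate_hz * dwell_time_s)


n_pixels = 512 * 512            # hypothetical frame size
photon_rate = 1e6               # hypothetical detected photons per second

for dwell in (1e-4, 1e-5, 1e-6):
    t = frame_time(n_pixels, dwell)
    snr = shot_noise_snr(photon_rate, dwell)
    print(f"dwell {dwell:.0e} s -> frame time {t:7.3f} s, SNR {snr:5.2f}")
```

Under this assumption, a tenfold gain in speed costs a factor of \(\sqrt{10}\) in signal-to-noise ratio.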

4.2 Confocal laser scanning microscopy

In contrast with the situation described for a wide-field optical microscope, the image formation in a confocal laser scanning microscope is intrinsically different [70]. The key feature of confocal laser scanning microscopy is its capability of collecting the signal selectively from the plane where the focus of the lens lies, rejecting to different degrees the unwanted contributions from out-of-focus regions [71], Fig. 20. This result is achieved by realizing the following conditions:

  1. Illumination is focused to a spot much smaller than the usual field of view within the specimen through a very small aperture called pinhole. Such a pointlike illumination limits the overall excitation of fluorescent molecules in the sample, reducing background contributions.

  2. The light emitted from regions above and below the current plane of focus is physically blocked from reaching the detector by means of a second pinhole, or of the same one, depending on the architecture of the system.

Figure 21 shows a typical confocal laser scanning microscopy set-up, including the integrated image scanning microscopy [73] module [72], allowing unprecedented results in the temporal domain [74]. These mechanisms are often referred to as the “confocal principle”, Fig. 22. To acquire an image, the excitation light has to be delivered to each point of the sample, acting like a pointillist painter using a brush, and the emission signal is collected by a photodetector and displayed at the end of the scanning process, usually a raster scan. This is generally achieved by means of two different approaches [53].

Fig. 20

Conventional and confocal optical microscopy images of a plasmacytoma cell stained for the endoplasmic reticulum. Left: confocal image; right: wide-field image. The confocal sectioning effect allows structural details of the cellular reticulum to be seen. Such details are buried in the sea of fluorescence produced by a thick sample [71]

Fig. 21

A classical confocal laser scanning microscopy set-up extended to image scanning microscopy modality that is implemented by using a SPAD (single-photon avalanche detector) array as an alternative to the conventional PMT (photomultiplier tube) single point detection [72]. As depicted in the figure, the SPAD array can be placed in an image plane as it happens for the pinhole. Its sensitive area is comparable with typical pinhole size and the module can be easily adapted to fit in any confocal microscope.

The first is based on raster scanning of the sample in such a way that, over every fixed period of time, the information generated point by point from the focal plane is collected and the emitted light signal, usually detected through a photomultiplier tube or a hybrid detector, is displayed by mapping each single-point light emission [74]. Sometimes the use of a slit moving in one direction, rather than a single point, is preferred for speeding up the scanning rate, although this leads to an evident worsening of the spatial resolution and of the 3-D imaging capability.

A second approach to generating confocal images consists of employing a multi-pinhole Nipkow spinning disk, which is a disk containing multiple sets of spirally arranged pinholes placed in the image plane of the objective lens [75]. A large parallel beam of light is then pointed to a particular region of the disk, and the light passing through the illuminated pinholes is focused by the objective lens straight onto the specimen. When spinning the disk at a fast rate, the sample undergoes excitation several hundred times per second. The emitted light is typically collected and imaged by a high-resolution and high quantum efficiency CCD camera or by an array of sensors like modern SPAD array devices [76]. One meaningful advantage of this approach with respect to the previous one is an improvement of temporal resolution without compromising the spatial one.

Fig. 22

Different optical solutions for scanning microscopes towards the implementation of a confocal microscopy architecture. a Conventional microscope. b A microscope scanning a point detector through the image plane so that it detects light from a small region of the object at a time, thus building up a picture of the sample point by point. c It is the counter arrangement of b, having the same imaging properties. The point source illuminates one tiny region of the object, while the large area detector measures the intensity of the transmitted light. d it is a combination of those in b and c. The point light source illuminates one small region of the sample, and the point detector detects light from the very same area. An image is built up by synchronously scanning the source and detector. It is the confocal scanning microscope [70]

As far as optical sectioning is concerned, every architecture is built such that the sample is placed along the light path at a conjugate focal plane, and the movements along the optical axis, realized by moving the lens or the sample stage, maintain the focus at a fixed distance from the objective, making it possible to effectively scan different fields of view through the specimen and to collect a series of in-focus optical slices for 3-D reconstruction. The degree of confocality is a function of the pinhole size [77]. The use of smaller pinholes improves the discrimination of in-focus light from stray light, thus involving a thinner plane in the image formation process and improving resolution, at the cost of a lower light throughput, which makes things difficult when dealing with particularly dim samples. For confocal architectures, z-resolution and optical sectioning thickness mainly depend on the numerical aperture of the objective lens, the wavelength of the excitation/emission light, the pinhole size, the refractive index mismatch of components along the light path and, last but not least, the overall alignment of the instrument. The architecture reported in Fig. 21 and the image acquisition modalities related to raster scanning are shared with the super-resolved approach known as STimulated Emission Depletion (STED) microscopy, which we will discuss in the optical nanoscopy section and which, in terms of optical performances, differs only in the adoption of a second laser beam, sharing with the confocal microscope its main properties. An effective theoretical model for describing the properties of the confocal microscope requires some preliminary, realistic assumptions to simplify calculations. From this point of view, the use of a linear space-invariant model is appropriate and allows the development of suitable mathematical tools for the analysis of most concrete situations.

Let us consider that the confocal set-up, Fig. 21, can be described as two lenses focused on the very same focal position. A point-like light source is focused onto a sample focal plane “j” through a lens \(L_{1}\), the condenser, and the radiation emitted from the very same point of the specimen is collected through a second lens \(L_{2}\), the objective, by a point detector. Let \(h_{ex}\) and \(h_{em}\) be, respectively, the impulse responses of \(L_{1}\) and \(L_{2}\), i.e., the lens responses to an input point-like (Dirac delta) light source. \(h_{ex}\) and \(h_{em}\) coincide with \(s\left( x,y,z\right) _{j}\) at the focal plane, Eq. (25). Since an x–y–z scanning process is generally coupled to image formation, which collects point by point the intensity used to visualize the sample, for a general point \(\rho \left( x,y,z\right) \), and taking \(h_{ex}\approx h_{em}=h\), one can write that

$$\begin{aligned} I=h^{2}(x,y,z) \end{aligned}$$
(26)

which is the general expression of the so-called point spread function, PSF, of the optical system, i.e. the system impulse response. A mathematical expression for h(x,y,z) can be obtained through the scalar theory of electromagnetic waves [55]. The formulation, based on Fraunhofer diffraction and considering points in the focal plane and along the optical axis, respectively, leads to:

$$\begin{aligned}&h\left( 0,v\right) \propto \left[ \frac{2J_{1}\left( v\right) }{v}\right] ^{2} \qquad h\left( u,0\right) \propto \left[ \frac{\sin \left( u/4\right) }{u/4}\right] ^{2} \end{aligned}$$
(27)
$$\begin{aligned}&I\left( 0,v\right) \propto \left[ \frac{2J_{1}\left( v\right) }{v}\right] ^{4} \qquad I\left( u,0\right) \propto \left[ \frac{\sin \left( u/4\right) }{u/4}\right] ^{4} \end{aligned}$$
(28)

where \(u\propto z\) and \(v\propto \sqrt{x^{2}+y^{2}}.\) As for conventional microscopes, the resolution can be quantified by the FWHM. Its evaluation leads to a resolution improvement with respect to the wide-field case by a factor of about 1.4 along each direction, which turns into a factor of about 3 in terms of volume.
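The factor of about 1.4 can be checked numerically from the profiles in Eqs. (27) and (28). The following is a minimal sketch, assuming normalised optical coordinates u and v and using a power-series evaluation of \(J_{1}\); the helper names (`bessel_j1`, `half_width`) are illustrative and not taken from any reference implementation.

```python
import math

def bessel_j1(v, terms=30):
    """First-order Bessel function J1(v) from its power series (adequate for small |v|)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + 1))
                  * (v / 2) ** (2 * k + 1))
    return total

def airy(v):
    """Lateral wide-field profile [2 J1(v)/v]^2 of Eq. (27), equal to 1 at v = 0."""
    return 1.0 if v == 0 else (2 * bessel_j1(v) / v) ** 2

def axial(u):
    """Axial wide-field profile [sin(u/4)/(u/4)]^2 of Eq. (27), equal to 1 at u = 0."""
    x = u / 4
    return 1.0 if x == 0 else (math.sin(x) / x) ** 2

def half_width(profile, upper, power=1):
    """Bisection for the coordinate where profile**power falls to one half.

    power=1 gives the wide-field case, power=2 the confocal case of Eq. (28).
    The profile must decrease monotonically on [0, upper].
    """
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if profile(mid) ** power > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Ratio of wide-field to confocal half-widths (identical to the FWHM ratio).
lateral_ratio = half_width(airy, 3.0) / half_width(airy, 3.0, power=2)
axial_ratio = half_width(axial, 10.0) / half_width(axial, 10.0, power=2)
print(f"lateral FWHM ratio wide-field/confocal: {lateral_ratio:.2f}")
print(f"axial FWHM ratio wide-field/confocal:   {axial_ratio:.2f}")
print(f"volume ratio: {lateral_ratio ** 2 * axial_ratio:.2f}")
```

The sketch ignores the finite pinhole size and any Stokes shift between excitation and emission, so it describes the ideal case only.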

Confocal microscopy can be considered one of the most significant advances in optical microscopy of the last decades, and it has become a powerful and widespread investigation tool for the molecular, cellular, and developmental biologist; the materials scientist; the biophysicist; and the electronic engineer. It is entirely compatible with the range of “classic” light microscopic techniques and, at least in scanned beam instruments, can be applied to the same specimens on the same optical microscope stage [53]. Today the integration of confocal imaging with additional parameters related to the fluorescence process, such as lifetime, gives access to a high-content data set providing new details about the sample being studied. For example, TauSense is a recent addition to advanced commercial confocal systems. It is enabled by fast FPGA acquisition technology, which can substitute the widely used time-correlated single-photon counting (TCSPC) approach: pixel-by-pixel readout of the photon arrival time is achieved by a time-tagging method. Although the time resolution cannot reach the performance of TCSPC systems, it is sufficient for most microscopy applications and allows simultaneous imaging modalities, i.e., intensity detection, average arrival time (TauContrast), gating (up to 16 simultaneous, digitally tunable time gates, TauGating), and a lifetime-based component separation algorithm (TauSeparation). In summary, the technical key lies in the use of fast FPGA-based acquisition hardware, photon-counting detectors, hybrid detectors and a tunable white light laser illumination source, Fig. 23.

Fig. 23

These two images of the root-hypocotyl junction of Arabidopsis thaliana represent the current frontier of biological visualization. Life-Act Venus (Actin filaments; Era et al. Plant Cell Physiol., 2009), Propidium Iodide (Cell wall), Chlorophyll (Chloroplasts). a Fluorescence intensity. b TauContrast on Leica STELLARIS 8. Sample courtesy: Dr. Melanie Krebs, COS, University of Heidelberg

4.3 Two-photon excitation microscopy

Two-photon and multiphoton excitation microscopy are probably the most relevant advancement in fluorescence optical microscopy since the introduction of confocal laser scanning microscopy in the 1980s [78]. As a direct consequence of the non-linear excitation of fluorescent molecules, the 3D selection ability is coupled with at least five other interesting capabilities, namely: (1) overall reduction of photo-interactions enabling long term imaging of living samples; (2) high-sensitivity within a background-free acquisition scheme; (3) imaging of turbid and thick specimens down to a depth of > 1 mm; (4) simultaneous excitation of different fluorescent molecules reducing 3D colocalization errors; (5) access to photochemical reactions within a subfemtoliter volume inside solutions, cells, and tissues. The fluorescence intensity that one can collect under two-photon excitation can be evaluated as

$$\begin{aligned} I_{f}\propto \sigma _{2}P^{2}\left( \frac{\left( NA\right) ^{2}}{hc\lambda }\right) ^{2} \end{aligned}$$
(29)

where \(\sigma _{2}\) is the two-photon cross-section, P the laser power and NA the numerical aperture of the focusing objective lens. The last term of Eq. (29) accounts for the distribution of the photons in time and space, within a paraxial approximation in an ideal optical system. This leads to the most popular relationship [45], reported below, for the practical situation of a train of pulses of duration \(\tau _{p}\) and repetition rate \(f_{p}\) focused through a high numerical aperture objective. In this case, the probability \(n_{a}\) that a certain fluorophore simultaneously absorbs two photons during a single pulse, in the paraxial approximation, is given by

$$\begin{aligned} n_{a}\propto \frac{\sigma _{2}P_{ave}^{2}}{\tau _{p}f_{p}^{2}}\left( \frac{\left( NA\right) ^{2}}{hc\lambda }\right) ^{2} \end{aligned}$$
(30)

where \(P_{ave}\) is the time-averaged power of the beam and \(\lambda \) is the excitation wavelength. As a practical example, introduce 1 GM (Göppert-Mayer) \(=10^{-58}\,\left[ \mathrm{m}^{4}\,\mathrm{s}\right] \) and consider a \(\sigma _{2}\) of approximately 10 GM per photon, an objective of \(NA>1\), an average incident laser power of 1–50 mW, and a wavelength ranging from 680 to 1100 nm with 80–150 fs pulse width. A 100 MHz repetition rate means 10 ns between pulses, a time that is too short to allow a complete fluorescence decay, which is independent of the way the fluorophore has been excited, e.g. by one or two photons. Such a condition can quickly saturate the fluorescence output. Therefore, for optimal fluorescence generation, the desirable repetition time of pulses should be one order of magnitude longer than a typical excited-state lifetime, which is a few nanoseconds for commonly used fluorescent molecules. In terms of optical consequences, the two-photon effect limits the excitation region to a sub-femtoliter volume. Moreover, the 3D confinement of the 2PE volume can be understood on the basis of optical diffraction theory, referring to the very same optical sectioning scheme of the confocal microscope. Only in the focal region, Fig. 24, is the photon density high enough to prime 2PE. A 2PE microscope can be built from components or, as a very efficient compromise, by modifying an existing confocal laser scanning microscope, Fig. 25 left panel [79]. The latter option offers operational flexibility and a good quality-to-cost ratio. Two approaches are used to perform 2PE microscopy, namely descanned and non-descanned mode. The former uses the very same optical pathway and mechanism employed in confocal laser scanning microscopy. The latter optimizes the optical pathway by minimizing the number of optical elements encountered on the way from the sample to the detectors and by increasing the detector area.
The non-descanned mode is simple and effective. The scanning mirrors of the confocal module are used to prime fluorescence point by point within the sample, whereas on the way back the fluorescence emission is collected immediately after the objective lens and its intensity is attributed, without ambiguity, to the actual scanned point. Fig. 25, right panel, illustrates these two approaches. The 2PE non-descanned mode provides excellent performance, resulting in a superior signal-to-noise ratio inside strongly scattering samples. Such collection efficiency is also a key feature when dealing with thick, intrinsically fluorescent specimens, Fig. 26. Moreover, owing to the high laser intensity and architecture flexibility, such a set-up is also suitable for second harmonic generation [80], pump-probe [81] and Mueller matrix [82] microscopy. Figure 27 shows an example of combined images. Coupling 2PE with super-resolved fluorescence microscopy is a challenging issue under both the coordinate-stochastic [83] and the coordinate-targeted [84] scheme.
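To get a feeling for the orders of magnitude in Eq. (30), the right-hand side can be evaluated for the parameter ranges quoted above. This is only a sketch: the proportionality is treated as an equality, the function names are illustrative, and the chosen values (10 GM, 100 fs, 100 MHz, NA = 1.4, 800 nm) are assumptions picked from within the quoted ranges.

```python
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light in vacuum, m/s
GM = 1e-58           # 1 Goppert-Mayer expressed in m^4 s per photon

def two_photon_probability(sigma2_gm, p_ave, tau_p, f_p, na, wavelength):
    """Right-hand side of Eq. (30), with the proportionality taken as an equality.

    sigma2_gm: two-photon cross-section in GM; p_ave: average power in W;
    tau_p: pulse duration in s; f_p: repetition rate in Hz; wavelength in m.
    """
    return (sigma2_gm * GM * p_ave ** 2 / (tau_p * f_p ** 2)
            * (na ** 2 / (H * C * wavelength)) ** 2)

def pulse_interval(f_p):
    """Time between consecutive pulses in seconds."""
    return 1.0 / f_p

# Assumed example values, chosen inside the ranges quoted in the text.
params = dict(sigma2_gm=10, tau_p=100e-15, f_p=100e6, na=1.4, wavelength=800e-9)

print(f"pulse interval: {pulse_interval(params['f_p']) * 1e9:.1f} ns")
for p_mw in (1, 10, 50):
    n_a = two_photon_probability(p_ave=p_mw * 1e-3, **params)
    print(f"P_ave = {p_mw:2d} mW -> n_a ~ {n_a:.2e}")
```

The quadratic dependence on the average power is what makes the output approach saturation (\(n_{a}\) approaching 1) within the quoted power range.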

Fig. 24

Probability of 2PE event within the double cone of illumination controlled by the lens parameters

Fig. 25

Left panel: illustration of the elements of an inverted 2PE microscope also allowing second harmonic generation and pump-probe imaging, including SPAD array detection. Right panel: descanned and non-descanned detection schemes in an upright 2PE microscope. Since 2PE selects the fluorescence emission volume, the use of scanning mirrors for localized detection can be abandoned. The pinhole in the descanned path can be opened, as in the left panel, but relevant photons can still be lost with respect to the non-descanned path

Fig. 26

Example of optical slicing in 3D using the autofluorescence in membranes that form cysts in Colpoda (courtesy of Paola Ramoino, University of Genoa). The optical sections are taken step by step through the sample

Fig. 27

Collagen fibre arrangement and functional crimping pattern of the medial collateral ligament in the rat knee visualized by multimodal imaging: 2PE autofluorescence (blue), forward SHG (green) and backward SHG (red). Exc. wavelength 860 nm; 63 \(\times \) 0.9 NA objective. Mosaic made with ImageJ software (NIH, Bethesda, US) from 14 different images. Sample courtesy of L. Leonardi, University of Bologna, Italy

5 Optical nanoscopy

Far-field optical microscopy using a glass lens and visible light is unique in offering the possibility of studying living matter, living cells and organisms at a low level of perturbation. This key aspect makes optical microscopy a powerful instrumental method for studying biological systems, despite its intrinsically limited spatial resolution [85]. Cellular DNA and protein distributions can be imaged only within a certain spatial window in terms of details, of mutual interactions and of the dynamic structural relationship with cell functioning. In the last thirty years, the route to spatial super-resolution became evident with the development of super-resolved far-field fluorescence optical techniques that turned microscopy into nanoscopy. In a comparatively short time, the acronyms of the main emerging approaches entered research laboratories. Techniques such as stimulated emission depletion (STED), reversible saturable optical fluorescence transitions (RESOLFT), photoactivation localization microscopy (PALM), stochastic optical reconstruction microscopy (STORM) or structured illumination microscopy (SIM) made it possible to circumvent the limitations caused by diffraction. While SIM achieves a significant improvement in spatial resolution compared to conventional optical microscopy, STED, RESOLFT and PALM/STORM pushed optical image resolution to the nanoscale. The limit commemorated by the Ernst Abbe memorial in Jena no longer holds. It is a paradigm shift in optical microscopy, since molecular states and the transitions between them are the enabling elements of optical nanoscopy. There is a roadmap for optical nanoscopy that is developing fast. Let us explore the basics [86].

5.1 Coordinate stochastic super-resolved fluorescence microscopy

This class of super-resolved methods is based on the individual localization of single fluorescent molecules sparsely distributed in space [87]. The key to the process lies in localizing the centroids of the single emitters belonging to a stochastically activated subset of fluorescent molecules. The localization precision sets the highest lateral resolution that can potentially be achieved. The main drawbacks are that methods of this kind mostly work with fixed samples and depend on labelling density. However, to address most of these limitations, a new microscope has been developed using ultrathin light sheets derived from two-dimensional optical lattices which, among a large number of other biological applications, enables 3D super-resolution photoactivated localization microscopy [88].

Now, the width of a diffraction-limited spot image d, governed by Eq. (1), is around 200–300 nm, assuming an emission wavelength \(\lambda \) \(\approx \) 550 nm, in the green region, and an objective with numerical aperture NA \(\approx \) 1.4. This width can be turned into a localization precision 10 to 100 times better than the Abbe limit.

Fig. 28

Sparse distribution of low concentration clusters of fluorescent molecules. The intensity distribution of the spotted clusters allows identifying single molecules and multiple aggregates

Consider a sparse distribution of single fluorescent molecules, obtained experimentally by spin coating a low-concentration solution of fluorescent molecules on a glass slide. The intensity distribution of the fluorescent spots and the temporal behaviour of photobleaching allow individual molecules to be located. The spot intensities are multiples of the lowest spot intensity, which is related to single molecules, and reflect molecules captured individually or in small clusters [89]. To confirm this assumption, it is necessary to investigate the temporal behaviour. In fact, the intensity of an individual molecule is characterized by a single-step decay towards the photobleached condition, whereas clusters of aggregated fluorescent molecules exhibit a multiple-step decay before reaching the dark state. Figure 28 shows the imaging of individual molecules and of clusters of a few molecules in a sparse sample.

Fig. 29

Knowing that one is dealing with a single molecule, a unique photon emitter at a certain position, allows the localization precision to be improved by a factor that depends on the number of collected photons, N

Now we assume that every spot is a single emitter and thus a pointlike light source. If such emitters have no other emitters in their proximity, i.e. at distances < 200 nm, their position can be localized much more accurately than the Abbe limit prescribes. Considering a circular symmetry, the centre of the fluorescent pointlike emitter, coinciding with its centroid, can be determined with a precision better than 200 nm. The localization precision, i.e. the standard error of the mean position, is inversely proportional to the square root of the total number of photons collected. The more photons collected, the smaller the circle of uncertainty of the position, Fig. 29. Its relationship to the diffraction width (the wider the mountain, the more difficult it is to localize the mountain top) is given by:

$$\begin{aligned} \delta \approx \frac{d}{\sqrt{N}} \end{aligned}$$
(31)

where N is the number of collected photons per molecule, and d is derived from Eq. (1). In a real experiment, N will follow a single-peak Gaussian distribution of counts. The mean value is a characteristic of each fluorescent species and can be influenced by environmental conditions. Considering d = 250 nm and 10,000 photons, the localization of the centroid of the pointlike emitter scales to \(\delta \) \(\approx \) 2.5 nm. The necessary condition is that one is dealing with a single fluorescent molecule as emitter [90]. More specifically, the localization precision, typically ranging from 5 nm to 30 nm, can be estimated by the following relationship, which takes into account some parameters related to the acquisition and digitization process [91]:

$$\begin{aligned} \sigma _{x,y}^{2}\approx \frac{s^{2}+\frac{a^{2}}{12}}{N}+f\left( b,N^{2}\right) \end{aligned}$$
(32)

where s and a are related to the width of the point spread function and to the pixel size, respectively, and the parameter b is the background noise. Under some general conditions, the function f can be approximated by [92]:

$$\begin{aligned} f\left( b,N^{2}\right) \approx \frac{4\sqrt{\pi }}{aN^{2}}s^{3}b^{2} \end{aligned}$$
(33)

This relation shows that the uncertainty falls as \(1/N\), the inverse of the number of photons, for the background noise and as \(1/\sqrt{N}\) for the photon-counting noise. The role of the background clearly increases with the thickness of the specimen being imaged.
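The relationships in Eqs. (31) to (33) are simple enough to evaluate directly. The sketch below assumes that Eq. (1) has the usual Abbe form \(d=\lambda /(2\,NA)\), that s is the standard deviation of the point spread function, a the pixel size and b the background noise in photons per pixel; these interpretations and all function names are assumptions made for illustration.

```python
import math

def abbe_width(wavelength_nm, na):
    """Diffraction-limited width, assuming Eq. (1) reads d = lambda / (2 NA)."""
    return wavelength_nm / (2 * na)

def simple_precision(d_nm, n_photons):
    """Eq. (31): delta ~ d / sqrt(N)."""
    return d_nm / math.sqrt(n_photons)

def background_term(s_nm, a_nm, b, n_photons):
    """Eq. (33): background contribution to the localization variance (nm^2)."""
    return 4 * math.sqrt(math.pi) * s_nm ** 3 * b ** 2 / (a_nm * n_photons ** 2)

def localization_precision(s_nm, a_nm, b, n_photons):
    """Square root of Eq. (32), with f(b, N^2) taken from Eq. (33)."""
    variance = ((s_nm ** 2 + a_nm ** 2 / 12) / n_photons
                + background_term(s_nm, a_nm, b, n_photons))
    return math.sqrt(variance)

print(f"d for 550 nm, NA 1.4: {abbe_width(550, 1.4):.0f} nm")
print(f"Eq. (31), d = 250 nm, N = 10000: {simple_precision(250, 10_000):.1f} nm")

# Assumed example: s = 110 nm, 100 nm pixels, background noise of 0 or 10 photons.
for n in (100, 1_000, 10_000):
    clean = localization_precision(110, 100, 0, n)
    noisy = localization_precision(110, 100, 10, n)
    print(f"N = {n:6d}: sigma = {clean:6.2f} nm (b = 0), {noisy:6.2f} nm (b = 10)")
```

Comparing the two columns shows how the background term dominates at low photon counts and becomes negligible as N grows, in line with its faster \(1/N\) decay.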

The effective spatial resolution is affected by the molecular density and also by the distance between contiguous molecules. For this reason, the overall estimated resolution of the system should take into account the localization precision and the molecules sparseness. The spatial resolution can be estimated considering:

$$\begin{aligned} d=\sqrt{\sigma _{x,y}^{2}+r_{NN}^{2}} \end{aligned}$$
(34)

where \(r_{NN}\) represents the nearest-neighbour distance between the molecules and \(\sigma _{x,y}\) is the localization precision [90]. The background signal increases significantly in the case of large, scattering biological samples. Several factors limit the resolution, mainly related to scattering and aberration effects. To account for additional errors in the localization process that degrade the effective localization precision, the precision can be redefined by also considering the standard deviation \(\sigma _{inst}\) of the instabilities of the system:

$$\begin{aligned} \sigma _{eff}=\sqrt{2\sigma _{x,y}^{2}+\sigma _{inst}^{2}} \end{aligned}$$
(35)

where the factor 2 takes into account, for example, the excess noise introduced by the electron-multiplying process of electron-multiplying charge-coupled devices, EMCCD. The basic idea behind this class of techniques relies on the possibility of switching a fluorophore or a fluorescent protein between a “dark” and a “bright” state. To obtain a sparse subset of emitters that are distinguishable in the image, molecules are driven into the bright state by photoactivation, using a dose of light able to switch on only a low density of photoactivatable fluorescent molecules. Readout ends when photobleaching occurs. The positions of the active molecules are localized with nanometer precision, and new molecules are turned on. The cycle is repeated until enough molecules have been acquired, and the final image is reconstructed as the sum of all the localized events.
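The \(1/\sqrt{N}\) scaling of the localization step can be illustrated with a small Monte Carlo experiment. This is a deliberately simplified sketch: it is one-dimensional, it models the point spread function as a Gaussian of standard deviation `s_nm`, and it ignores pixelation, background and camera noise; all names and values are illustrative assumptions.

```python
import math
import random
import statistics

def localize_once(n_photons, s_nm, rng, true_x=0.0):
    """Centroid of n_photons detection positions drawn from a Gaussian PSF."""
    photons = [rng.gauss(true_x, s_nm) for _ in range(n_photons)]
    return sum(photons) / n_photons

def empirical_precision(n_photons, s_nm, trials=300, seed=1):
    """Standard deviation of the centroid estimate over repeated localizations."""
    rng = random.Random(seed)
    estimates = [localize_once(n_photons, s_nm, rng) for _ in range(trials)]
    return statistics.stdev(estimates)

s_nm = 100.0  # assumed PSF standard deviation
for n in (100, 400, 1600):
    measured = empirical_precision(n, s_nm)
    predicted = s_nm / math.sqrt(n)
    print(f"N = {n:5d}: simulated {measured:5.2f} nm, s/sqrt(N) = {predicted:5.2f} nm")
```

In a real acquisition each localization would come from a two-dimensional fit to a camera image of one activated molecule, repeated over many activation cycles.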

Figure 30 shows the general set-up. A number of approaches have demonstrated the capability of such a strategy to localize molecules with nanometer accuracy [93], among them: fluorescence imaging with one-nanometer accuracy (FIONA) [94], photoactivation localization microscopy (PALM) [92], fluorescence photoactivation localization microscopy (FPALM) [95], stochastic optical reconstruction microscopy (STORM) [96], and ground-state depletion imaging (GSDIM) [97]. The differences among the various methods lie in the algorithm used for localizing molecules, e.g. tracking single molecules for FIONA; in the type of fluorescent molecules, e.g. fluorescent proteins for PALM and FPALM and dyes for STORM and GSDIM; in the microscopy technique used, e.g. wide-field microscopy for FPALM and total internal reflection fluorescence microscopy (TIRFM) for PALM; or in the photo-switching principle, e.g. ground-state depletion for GSDIM.

Along the z-axis, astigmatism induced in the optical system can be used to discriminate single molecules located in different optical planes. In fact, the perfectly round shape of the emission from pointlike sources is broken by astigmatism, giving rise to an individual single-molecule signature with an elliptical shape. The preferential elongation of the axes of the ellipse provides information about the position of the emitter along the optical axis. Calibration of the system allows a precise position along the z-axis to be obtained. Figure 31 shows an example of the effect of spatial localization in terms of resolution improvement along the x–y–z coordinates. Quantitative approaches for individual molecule localization are the key element for forming an image at the nanoscale. The approaches developed range from molecular counting based on photobleaching [98] to cluster analysis [99]. Cluster identification is supported by pair correlation and the Ripley function. The analysis is sometimes complicated by the use of immunofluorescence labelling techniques, since the uncontrollable stoichiometry can increase over-counting, thus making an accurate estimation even more challenging. Solutions based on controlled DNA nanostructures [100] allow the labelling density and the fluorophore photophysics to be accounted for. In particular, a 12-helix programmable DNA origami scaffold [101] has been demonstrated to be a powerful tool for quantitative protein copy number estimation [58]. Since their early applications at the molecular level [102], localization-based techniques have demonstrated vast applicability to a wide range of biological contexts, from imaging of membrane protein distribution to imaging of nuclear complexes.
The number of localizations per molecule, the fluorescence intensity, photobleaching, the blinking rate, and the distances among proteins attached at specific anchors allow precise molecular quantification in super-resolution microscopy [103]. In fact, the precise knowledge of the localization events for a controlled number of molecules permits the estimation of the number of proteins organized in clusters in a given specimen. The calibration ruler made by DNA origami technology provides a calibration function, \(f_{n}=f_{n-1}\otimes f_{1}\), obtained by recursive convolution with a log-normal distribution \(f_{1}\left( x\right) =\frac{1}{x\sigma \sqrt{2\pi }}e^{-\frac{\left( \ln x-\mu \right) ^{2}}{2\sigma ^{2}}}\).
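The recursive construction \(f_{n}=f_{n-1}\otimes f_{1}\) can be sketched numerically on a regular grid. The parameters \(\mu \) and \(\sigma \) below are hypothetical placeholders rather than measured calibration values, and the plain Riemann-sum convolution is just one simple way to approximate the integral.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Log-normal density f_1(x); zero for non-positive x."""
    if x <= 0:
        return 0.0
    return (math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2))
            / (x * sigma * math.sqrt(2 * math.pi)))

def convolve(f, g, dx):
    """Riemann-sum convolution of two densities sampled at x_k = k * dx.

    The result is truncated to the length of the input grid.
    """
    n = len(f)
    out = [0.0] * n
    for i in range(n):
        if f[i] == 0.0:
            continue
        for j in range(n - i):
            out[i + j] += f[i] * g[j] * dx
    return out

def calibration_functions(n_max, mu, sigma, dx=0.5, x_max=300.0):
    """Return the grid, its spacing and the list [f_1, ..., f_n_max]."""
    grid = [k * dx for k in range(int(x_max / dx))]
    f1 = [lognormal_pdf(x, mu, sigma) for x in grid]
    functions = [f1]
    for _ in range(n_max - 1):
        functions.append(convolve(functions[-1], f1, dx))
    return grid, dx, functions

def mean(grid, f, dx):
    """Mean number of localizations under the sampled density f."""
    return sum(x * fx for x, fx in zip(grid, f)) * dx

mu, sigma = math.log(10.0), 0.4  # hypothetical calibration parameters
grid, dx, f = calibration_functions(3, mu, sigma)
for n, fn in enumerate(f, start=1):
    print(f"n = {n}: area = {sum(fn) * dx:.3f}, mean localizations = {mean(grid, fn, dx):.1f}")
```

Each convolution adds the localization count of one more molecule, so the mean of \(f_{n}\) grows linearly with n while the area stays close to one.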

Fig. 30

Single molecule localization microscopy is based on a wide-field scheme allowing photoactivation, illumination and detection, and equipped with a high-sensitivity camera

Fig. 31

Image of cytoskeletal structures (microtubules labelled with Alexa 647) in HeLa cells. a 3D STORM image of microtubules. The colours represent the distribution of molecules along the optical axis, where 0 is the centre of the objective focal position. b Corresponding wide-field image. Localization precision: lateral 20 nm, axial 65 nm. Scale bar \(5\,\upmu \mathrm{m}\). Image taken using a Nikon N-STORM system by F. Cella Zanacchi at Nikon Imaging Center@IIT

\(f_{n}\) is the distribution of the number of localizations for a structure composed of n molecules. The localization distribution function \(g\left( x\right) \), extracted from cluster identification, can then be expressed as a linear combination of the “calibration” distributions \(f_{n}\) weighted by \(\alpha _{n}\):

$$\begin{aligned} g\left( x\right) =\sum _{n=1}^{N}\alpha _{n}\cdot f_{n}\left( x\right) \end{aligned}$$
(36)

where the summation over n of \(\alpha _{n}\) is 1 [104]. Since molecular interactions also take place between cells, deciphering such mechanisms in thick cell aggregates such as tumour spheroids is a fundamental issue [105]. In a thick specimen, the specific fluorescence of an individual molecule is immersed in a sea of fluorescence coming from adjacent three-dimensional regions or due to some intrinsic unwanted fluorescence. This is a significant part of the background parameter, b, affecting localization precision, see Eq. (33). To reduce the background signal, one solution is to move to a different illumination scheme, specifically designed for detecting individual fluorescent molecules embedded in thick specimens. This kind of problem was faced and solved by Richard Zsigmondy, who introduced the concept behind selective plane illumination microscopy, SPIM, to study the behaviour of gold particles in colloidal solutions. His work earned him the Nobel Prize in 1926. Gold particles, too small to be visible in an ordinary light microscope and usually suspended in a liquid, were illuminated with a light beam perpendicular to the optical axis of the microscope, developed with H. Siedentopf [106]. This optical set-up was named ultramicroscope, Fig. 32, for its ability to detect “invisible” gold particles through light flashes against a dark background. Later, the Orthogonal Plane Fluorescence Optical Sectioning (OPFOS) microscope was implemented [107]. The same concept and optical arrangement were adapted in a very effective way at Stelzer’s lab, developing the Selective Plane Illumination Microscope (SPIM) [108], Fig. 33. Figure 34 shows the combination of SPIM with individual molecule detection, enabling localization at different depths within a thick tumour spheroid. Such an approach has been named Individual Molecule Localization-Selective-Plane-Illumination Microscopy (IML-SPIM).
It demonstrated three-dimensional super-resolution live-cell imaging through thick specimens, 50–150 \(\upmu \)m [105]. IML-SPIM has also been implemented under a two-photon photoactivation regime, particularly suitable for maintaining a homogeneous light-sheet thickness under the increased scattering that usually affects thick samples [109].

Fig. 32

Original drawings of the ultramicroscope developed in the early 1900s

Fig. 33

The selective plane illumination or light-sheet microscope is the modern evolution of the ultramicroscope, allowing background reduction in a wide-field detection scheme

Fig. 34

IML-SPIM allows single molecule localization in thick specimens. Here it is applied to a cellular spheroid, navigating over a range of hundreds of microns

5.2 Coordinate-targeted super-resolved fluorescence microscopy

Reversible saturable optical fluorescence transition, RESOLFT, is the essence of those optical fluorescence approaches that turned into super-resolved fluorescence optical microscopy [110]. The central concept is based on the existence of two distinguishable and controllable states, A and B, whose transition can be optically driven and is saturable. Thinking about a fluorescent molecule, we can consider a dark state A and a bright state B that can be switched in space and time. The diffraction limit does not need to be broken. It is simply part of an image formation process that takes advantage of a set of additional information leading to super-resolution, as predicted by Giuliano Toraldo di Francia. What matters is that two molecules simultaneously emitting photons at the very same wavelength are not located at distances closer than the diffraction limit. Let us consider a fluorescent label that can be reversibly switched between a bright and a dark state, \(B\) \(\leftrightarrow \) \(A\). The key experiment is comparatively simple. A cluster of fluorescent molecules is excited by a focused illumination beam, producing a fluorescence intensity \(I_{f}\). A second beam focused on the very same cluster is able to cancel part of \(I_{f}\) as a function of its intensity, \(I_{b}\), saturating the process towards \(I_{f}=0\). Increasing \(I_{b}\) decreases \(I_{f}\) and vice versa. This means that the second beam controls, in a reversible way, the number of fluorescent molecules that emit at a certain wavelength. By shaping the second beam, one can control the behaviour of fluorescent molecules in time and space. Typically, a parameter named saturation intensity, \(I_{sat}\), is introduced, defined as the intensity of the switching beam needed to convert 50\(\%\) of the emitted signal from B to A. \(I_{sat}\) is a characteristic parameter of the specific molecule being used.
The result is that one can define an effective focal region containing emitting molecules whose size can be made significantly smaller than the diffraction limit. For such a process there is no theoretical limit to the resolution, since the final dimension of the effective bright region is ruled by the efficiency of the switching. Figure 35 sketches the RESOLFT concept. Experimentally, RESOLFT has been implemented in different ways using ground-state depletion [112], photoswitchable proteins [113], sub-diffraction direct laser writing lithography [114], and stimulated emission depletion, STED [115], which will be discussed in detail. Stimulated emission depletion [116] is the fundamental process used to engineer, in a coordinate-targeted way, a point spread function of arbitrarily small size, Fig. 36. This implies that during the point-by-point beam scanning process, a second beam is superimposed to define the bright and dark regions at the focus position. This method has an immediate physical effect on the image formation, like the confocal pinhole or the two-photon excitation mode. Due to its immediacy, it is suitable for live-cell imaging [117].
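How the switching beam shrinks the bright region can be illustrated with a toy model. The sketch below assumes a steady-state two-state rate model, in which the fraction of labels remaining in the bright state B is \(1/(1+I_{b}/I_{sat})\), and a switching beam with a parabolic intensity zero, \(I_{b}(x)=I_{0}(x/w)^{2}\). Both the model and the names are assumptions made for illustration, not the formulation of the cited works.

```python
import math

def bright_fraction(i_b, i_sat):
    """Steady-state fraction of labels left in the bright state B.

    Assumed two-state rate model; equals 0.5 when i_b == i_sat, matching
    the definition of the saturation intensity given in the text.
    """
    return 1.0 / (1.0 + i_b / i_sat)

def switching_profile(x, i0, w):
    """Switching-beam intensity near its zero, approximated as a parabola."""
    return i0 * (x / w) ** 2

def bright_region_fwhm(i0, i_sat, w):
    """Full width of the region where more than half of the labels stay bright.

    Follows from i0 * (x / w)**2 == i_sat, i.e. x = w * sqrt(i_sat / i0).
    """
    return 2.0 * w * math.sqrt(i_sat / i0)

w = 250.0      # assumed width scale of the intensity zero, in nm
i_sat = 1.0    # intensities are expressed in units of I_sat
for ratio in (10, 50, 100, 500):
    fwhm = bright_region_fwhm(ratio, i_sat, w)
    check = bright_fraction(switching_profile(fwhm / 2, ratio, w), i_sat)
    print(f"I0/Isat = {ratio:3d}: bright region FWHM ~ {fwhm:6.1f} nm "
          f"(bright fraction at the edge = {check:.2f})")
```

In this toy model the width scales as \(1/\sqrt{I_{0}/I_{sat}}\), so it has no lower bound other than the available switching intensity, which is the qualitative point made above.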

Fig. 35

The RESOLFT concept at the basis of super-resolved fluorescence microscopy. a RESOLFT requires two states A and B of a label that are distinct in their optical properties. The optical transition from A to B takes place at a rate \(k_{AB}=\sigma I\) that is proportional to the light intensity I applied. The reverse transition from B to A, of rate \(k_{BA}\), brings the label back to its initial state. b The profiles 1–4 show the spatial region in which the label is allowed to be in state A if the region is subject to a standing wave of light with peak intensities \(I_{0}=10,50,100\), and 500 times \(I_{sat}\) and with a zero at \(x_{i}\). Increasing \(I_{0}\) ensures that the region in which the label may reside in A is squeezed down, in principle, indefinitely. If A is the fluorescent state of the label, this ultrasharp region functions as the effective fluorescent spot of the microscope and \(D_{x}\) is its FWHM. The creation of a fluorescence image requires scanning, that is, moving the zero along the x-axis with subsequent storage of the recorded fluorescence. If B is the fluorescent state, then the ultrasharp regions of state A are dark. In this case, a sort of negative image is recorded. Nevertheless, with suitable mathematical postprocessing, similar optical resolution can be obtained. In any case, the resolution is no longer limited by diffraction, but only determined by the value of \(I_{0}/I_{sat}\). c The simplified energy diagram of a fluorophore depicts possible schemes for implementing saturable optical transitions. Reproduced with permission from [111]

Fig. 36

STED implementation of the RESOLFT concept. Doughnut-shaped beam light is used to shrink the fluorescence emission, point by point

Fig. 37

Optical architecture for STED and RESOLFT, integrating both a single-point and an array detection unit. The two detectors are interchangeable depending on the final spatial resolution the user wants to achieve. In general, the SPAD array is more flexible. Regardless of the category, more than one detector can be used, depending on the number of colour channels needed for the experiment

In the STED microscope, Fig. 37, a lens-focused excitation beam is overlapped with a doughnut-shaped second beam that is typically red-shifted with respect to the fluorophore’s emission peak. The two most relevant technical requirements for the second beam, named STED beam, are the following: (1) the STED beam wavelength should be set in a spectral region having a low probability of absorption; (2) the intensity distribution of the STED beam has to drop to zero at least in a controlled region within the illumination focal volume, typically the centre of the beam. The most used doughnut-shaped beam features a zero-intensity point in the centre of the beam, with an intensity profile determined by the diffraction-limited point spread function of the focusing lens and by the power of the STED beam. The STED beam depletes the fluorescent molecular state by means of stimulated emission everywhere within the diffraction-limited excitation volume, except at the centre and in its neighbourhood, where the probability of stimulated emission is very low. In principle, any fluorescent molecule is suitable for stimulated emission depletion. Recently, optimized fluorescent molecules have been designed and used without loss of generality for the approach [118]. Figure 38 shows the excitation-emission-depletion scheme. The extent of depletion, i.e. the fraction of excited fluorophores that undergo stimulated emission, determines the final achievable spatial resolution of the STED microscope. The probability of stimulated emission depends on the power of the depletion beam and on other factors related to the fluorophore, such as the emission spectrum and the orientational distribution and rotational behaviour of the dye molecules, and to the properties of the depletion light beam.

Fig. 38

Stimulated emission depletion of fluorescence. Following the excitation, \(S_{1}\) is forced to be depopulated. This reduces the fluorescence emission in a targeted region.

An efficient way to take these factors into account is to evaluate the saturation intensity \(I_{s}\). As anticipated in the example given before, saturation here drives the fluorescence towards zero, and \(I_{s}\) is the intensity at which 50\(\%\) of the excited state is forcibly depopulated. For the fluorescence emission F, one has [119]:

$$\begin{aligned} F(x)=e^{-\ln \left( 2\right) \frac{I_{STED}\left( x\right) }{I_{s}}} \end{aligned}$$
(37)

considering the diffraction-limited point spread function in terms of waist w as

$$\begin{aligned} h_{CONF}\left( r\right) =e^{-\frac{2r^{2}}{w^{2}}} \end{aligned}$$
(38)

and the STED beam profile in the proximity of the centre approximated by

$$\begin{aligned} I_{STED}\left( r\right) =I_{STED}\frac{r^{2}}{w^{2}} \end{aligned}$$
(39)

This shows how the point spread function is engineered, since the effective STED point spread function has the following form:

$$\begin{aligned} h_{STED}=e^{-\frac{2r^{2}}{w^{2}}}e^{-\ln \left( 2\right) \frac{I_{STED}\left( r\right) }{I_{S}}}=e^{-\frac{2r^{2}}{w^{2}}}e^{-\ln \left( 2\right) \frac{I_{STED}r^{2}}{w^{2}I_{s}}}=e^{-\frac{2r^{2}}{w_{eff}^{2}}} \end{aligned}$$
(40)

where

$$\begin{aligned} \frac{1}{w_{eff}^{2}}=\frac{1}{w^{2}}\left( 1+c\frac{I_{STED}}{I_{s}}\right) \end{aligned}$$
(41)

with \(c=\frac{\ln 2}{2}\). Thus the Abbe formula can be corrected, in analogy with what we have done in Eq. (31) and considering the reduction factor applicable to \(w_{eff}\), as

$$\begin{aligned} d'=\frac{d}{\sqrt{1+c\frac{I_{STED}}{I_{s}}}} \end{aligned}$$
(42)
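
As a numerical illustration of Eq. (42), the following minimal sketch evaluates the corrected resolution for a few values of the ratio \(I_{STED}/I_{s}\). The function name and the 250 nm starting resolution are hypothetical choices made only for this example.

```python
import math

C = math.log(2) / 2  # c = ln(2)/2, the constant appearing in Eq. (41)


def sted_resolution(d_confocal_nm, saturation_factor):
    """Eq. (42): d' = d / sqrt(1 + c * I_STED / I_s).

    saturation_factor is the dimensionless ratio I_STED / I_s.
    """
    if saturation_factor < 0:
        raise ValueError("I_STED / I_s must be non-negative")
    return d_confocal_nm / math.sqrt(1.0 + C * saturation_factor)


if __name__ == "__main__":
    d = 250.0  # hypothetical diffraction-limited resolution, in nm
    for ratio in (0, 10, 100, 1000):
        print(f"I_STED/I_s = {ratio:5d} -> d' = {sted_resolution(d, ratio):6.1f} nm")
```

With the ratio set to zero the function returns the diffraction-limited value, and the gain grows only as the square root of the ratio, which is why high STED intensities or low saturation intensities are needed for large improvements.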

Equation (42) reduces to Eq. (1) when the STED beam is switched off. Conversely, if one were able to design a fluorescent molecule with a saturation intensity tending to zero, \(d'\) would tend to zero for any finite STED intensity, i.e. the resolution would in principle be unlimited. A critical aspect, since the early days of STED microscopy development, is that the intensity of the STED beam cannot always be increased to the level required for a given resolution because of possible photodamage of the fluorophores and phototoxicity affecting the specimens. As mentioned, in STED microscopy excitation and depletion are linear processes, so they do not require ultra-short pulses as 2PE does. However, super-resolution is enabled by saturation of the depletion process. Since saturation depends on the photon density, a fully continuous-wave (CW) laser implementation is possible and has been demonstrated [120]. Nevertheless, one of the best configurations is the pulsed one, with long depletion pulses (\(\sim 1\) ns) and time-gated detection [121]. For the best performing implementations, optimization of the optical set-up requires a precise synchronization of the illumination beams [122]. At the same time, high STED beam intensities require high-power, high-cost laser sources and prevent parallelization of the process when acquisition speed is an issue. However, this limitation can be overcome by analysing the photophysics of the process. The STED depletion beam induces a local perturbation of the fluorescent molecules in the surroundings of the point spread function. Unwanted fluorescence is still emitted by those molecules that escaped stimulated emission within the geometry defined by the depletion beam. In the peripheral regions of the STED beam, the excited fluorescent molecules have an additional pathway to the ground state, i.e. another relaxation rate to be added to Eq. (5). This results in a shorter fluorescence lifetime with respect to the unperturbed region at the centre of the doughnut beam, which preserves the original lifetime of the fluorescent molecule, Fig. 39.

Fig. 39

The goal of reducing the fluorescence signal in a specific region introduces a perturbation around the centre of the doughnut beam. Such a perturbation, introducing an additional decay rate, results in a lifetime change that is a function of the distance from the centre

This effect allows the spatial resolution to be maintained in the super-resolved domain while reducing the STED power, by considering the time of arrival of photons [123]. This is possible with a STED implementation that employs pulsed excitation and a continuous-wave (CW) depletion beam. The photon arrival times sort the information to be used when forming an image: short lifetimes belong to the unwanted photons emitted by molecules that interact with the periphery of the STED beam without being depleted, while long lifetimes can be ascribed to fluorescence emitted in the region of the zero of the STED beam. To clarify this, let us reconsider the saturation intensity as the STED intensity at which the lifetime \(\tau \) is reduced to half of its unperturbed value \(\tau _{0}\), such that

$$\begin{aligned} \frac{1}{\tau }=\frac{1}{\tau _{0}}\left( 1+\frac{I_{STED}}{I_{s}}\right) \end{aligned}$$
(43)

The point spread function is turned into

$$\begin{aligned} h_{conf}\left( r,t\right) =e^{-\frac{2r^{2}}{w^{2}}}e^{-\frac{t}{\tau _{0}}} \end{aligned}$$
(44)

This leads to an interesting new formulation of the STED point spread function that takes into account the temporal aspects of the fluorescence process:

$$\begin{aligned} h_{STED}\left( r,t\right) =e^{-\frac{2r^{2}}{w^{2}}}e^{-\frac{t}{\tau _{0}} \left( 1+\frac{I_{STED}r^{2}}{w^{2}I_{s}}\right) }=e^{-\frac{2r^{2}}{w_{eff}^{2}}}e^{-\frac{t}{\tau \left( r\right) }} \end{aligned}$$
(45)
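
To see how the arrival time of photons can be traded for STED power, one can integrate the middle expression of Eq. (45) over times later than a gate \(T_{g}\), which gives a profile proportional to \(\tau (r)\,e^{-2r^{2}/w^{2}}e^{-T_{g}/\tau (r)}\), with \(1/\tau (r)\) read off from the exponent of Eq. (45). The sketch below is only an illustration under these assumptions: noise and background are ignored, `k_s` stands for the ratio \(I_{STED}/I_{s}\), and the function names and bisection bounds are hypothetical choices.

```python
import math


def gated_profile(u, k_s, gate_over_tau0):
    """Gated signal at u = r^2 / w^2, normalized to its value at r = 0.

    Integrating exp(-2u) * exp(-t / tau(r)) from T_g to infinity, with
    1 / tau(r) = (1 + k_s * u) / tau_0, and dividing by the result at u = 0.
    """
    rate = 1.0 + k_s * u  # decay rate in units of 1 / tau_0
    return math.exp(-2.0 * u) * math.exp(-gate_over_tau0 * k_s * u) / rate


def gated_fwhm(w, k_s, gate_over_tau0):
    """Full width at half maximum of the gated profile (bisection on u)."""
    lo, hi = 0.0, 10.0  # the profile is far below 0.5 at u = 10
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gated_profile(mid, k_s, gate_over_tau0) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * w * math.sqrt(0.5 * (lo + hi))


if __name__ == "__main__":
    w = 1.0  # waist, arbitrary units
    for gate in (0.0, 0.5, 1.0, 2.0):
        print(f"T_g/tau_0 = {gate:3.1f} -> FWHM = {gated_fwhm(w, 10.0, gate):.3f} w")
```

At fixed `k_s`, a later gate gives a narrower profile, at the price of discarding the early photons.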

The gradient of fluorescence lifetimes between centre and periphery of the STED beam point spread function is given by:

$$\begin{aligned} \frac{1}{\tau \left( r\right) }=\frac{1}{\tau _{0}}\left( 1+\frac{k_{s}r^{2}}{w^{2}}\right) \end{aligned}$$
(46)

where \(k_{s}=I_{STED}/I_{s}\) quantifies the relative variation of decay rate values within the point spread function region. A new concept, the separation of photons by lifetime tuning (SPLIT), takes effective advantage of the measured photon arrival times. In fact, the gradient of fluorescence lifetimes can be exploited to tune the spatial resolution by STED strategies such as gated-STED [124], modulated STED [125] and STED with SPLIT [126], Fig. 40. It should be evident that a typical STED nanoscope is quite similar to a beam scanning microscope, Fig. 40, like the confocal or the two-photon excitation laser scanning microscopes. Its performance can be exploited in 3D, time-lapse and multicolour imaging of living samples. New fluorophores have been designed and realized with optimal performance for each pair of beams [127]. A convenient and promising new class of fluorescent molecules is that of large-Stokes-shift dyes, such as Abberior STAR 470SX (Abberior GmbH, Göttingen, Germany) or ATTO 490LS (ATTO-TEC GmbH, Siegen, Germany). In this case, one can get two- or three-colour STED without excessive loss of spatial resolution. A further aspect to take into account is that the reduction of the effective collection volume is accompanied by a reduction of the overall signal which forms the image [128]. The resulting reduction of the signal-to-noise and signal-to-background ratios calls for attention to the development of new sensors and, for example in the case of gated detection, of efficient data processing algorithms capable of reassigning the early photons instead of simply rejecting them [129]. The SPLIT-STED approach becomes relevant, Fig.
41, considering that collected photons are classified in terms of arrival time, and the related distance from the centre of the STED beam is used, along with the uncorrelated original background signal, to get an improved result with respect to the mere application of gated-STED, Fig. 42. Any STED configuration can also be implemented under the two-photon excitation regime [84]. Among the 2PE-STED implementations, a novel pulsed one has been demonstrated that uses the very same wavelength for two-photon excitation and for the depletion STED beam [130]. In terms of fluorescent molecules, the main requirement is a fluorophore that can be non-linearly excited and linearly depleted at exactly the same wavelength, e.g. ATTO 647N (ATTO-TEC GmbH, Siegen, Germany), one of the most stable dyes, hard to photobleach and often used in STED applications. The illumination is split into two branches delivering the very same wavelength at different pulse widths. The two-photon excitation beam is delivered with a 100 fs pulse width, while the STED depletion beam operates at 200 ps in order to avoid 2PE. Such a single wavelength (SW) 2PE-STED approach has the advantage that no other lasers are necessary for STED and that, when penetrating thick samples, the possible beam distortions share the very same wavelength, with potential advantages

Fig. 40

Lifetime changes induced by the doughnut beam add a further means of controlling spatial resolution, which can be coupled with the doughnut beam power

Fig. 41

Lifetime changes can be used to decode position as a function of the arrival time of photons

Fig. 42

Classification of detected photons in terms of arrival time allows to improve the signal-to-noise and to remove uncorrelated background signal. a, b Microtubules in fixed HeLa cells labelled by immunocytochemistry with the organic dyes Alexa Fluor 488 (a) and Oregon Green 488 (b). Shown are the confocal image, the SPLIT (\(n=2\), first component) image, the time-gated image (\(T_{g}=1\) ns) and the intensity profile along the dashed line. The colourmap represents the fluorescence intensity normalized to the maximum value of each image. (c) The time-gated image is compared with the full SPLIT series (\(n=2\)) for the region highlighted in b. The colourmap represents the fluorescence intensity expressed in counts per 0.1 ms. The STED beam power (measured at the back aperture of the objective lens) was \(P_{STED}=40\) mW. Scale bars, \(2\,\upmu \)m. Reproduced with permission from [126]

for aberration correction [131]. Another interesting aspect of STED is the opportunity to use super-resolved fluorescence for the study of molecular dynamics in living cells. To this end, Fluorescence Correlation Spectroscopy (FCS) [132] is an established technique to recover single-molecule diffusion and binding properties in cells. Moreover, scanning microscopy imaging has been applied to add a spatial dimension to the classic FCS modality: spatiotemporal FCS (stFCS) provides details about the routes followed by the diffusing particles or molecules in the specimen [133]. STED nanoscopy has recently been combined with all these FCS techniques, i.e. STED-FCS [134], raster imaging correlation spectroscopy (RICS) STED [135] and cross-pair correlation spectroscopy (pCF) STED [136]. STED-FCS made it possible to follow diffusion in three dimensions at different sub-diffraction scales. In this way, the diffusion of green fluorescent proteins was measured at spatial scales tunable from the diffraction size down to \(\sim \) 80 nm in the cytoplasm of living cells [134]. This opens new challenging perspectives for the study of molecular diffusion phenomena in biological systems. However, today, one of the most challenging perspectives is the ultimate goal of biological super-resolution fluorescence microscopy: providing 3D resolution at the size scale of the fluorescent reporter being used. The STED doughnut-shaped beam has been adapted to show that, by localizing individually switchable fluorophores with a probing doughnut-shaped excitation beam, optical nanoscopy can provide resolutions in the range of 1 to 3 nm for structures in fixed and living cells. This progress has been facilitated by the design and realization of the new interrogation concept named MINFLUX, in three dimensions [137].
MINFLUX derives from studies around the development of STED aimed at minimizing the photoperturbation of the specimen and of the fluorescent molecules [138]. It relies on minimizing the photon emission fluxes while localizing single molecules, achieving unprecedented resolutions with 2–3 orders of magnitude fewer photons than standard camera-based localization. However, its fundamental difference with STED nanoscopy is that the doughnut pattern does not simultaneously perform both the localization and the on–off transition. STED generates on–off state disparities between two neighbouring points, and thus requires a large intensity difference, or contrast, to spatially distinguish between the two states. On the contrary, MINFLUX uses the doughnut to perform the localization procedure by reading the intensity values at three different points in the space around the “on” molecule. Ideally, if the molecule is at the very centre of the excitation doughnut, no photons will be emitted. Thus, as the zero approaches such a position, the presence of the molecule is established using fewer and fewer photons. The three points are arranged in an equilateral triangle inscribed in a circle. Its diameter, L, defines the attainable localization precision, which no longer depends on the number of photons collected. Finally, it is worthwhile reflecting on the fact that MINFLUX nanoscopy has attained the resolution scale where fluorescent molecules at distances < 6 nm start to interact with each other. This is the ultimate limit attainable with fluorophores, apart from FRET approaches. While fluorescence on–off switching remains the fundamental requirement for circumventing the diffraction barrier, in MINFLUX this is obtained by the fact that, for small distances between a molecule and the intensity zero, the emitter localization does not depend on any wavelength. A great challenge is the extension of optical nanoscopy to low numerical aperture lenses [139].
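
The idea of reading the signal at three positions of the doughnut zero can be illustrated with a toy, noise-free calculation. The sketch assumes a purely quadratic intensity around the zero, no background, and a simple first-order estimator (minus the count-weighted mean of the probe positions), valid only when the molecule is much closer to the centre than L/2. It is not the estimator used in [137], and all function names are hypothetical.

```python
import math


def probe_positions(L):
    """Three doughnut-zero positions on a circle of diameter L (equilateral triangle)."""
    radius = L / 2.0
    angles = (math.pi / 2, math.pi / 2 + 2 * math.pi / 3, math.pi / 2 + 4 * math.pi / 3)
    return [(radius * math.cos(a), radius * math.sin(a)) for a in angles]


def expected_counts(molecule, probes, brightness=1.0):
    """Assumed quadratic profile near each zero: counts proportional to |r_m - r_i|^2."""
    mx, my = molecule
    return [brightness * ((mx - px) ** 2 + (my - py) ** 2) for px, py in probes]


def estimate_position(counts, probes):
    """First-order estimate r_m ~ -sum_i p_i r_i, with p_i = n_i / sum(n)."""
    total = sum(counts)
    ex = -sum(n / total * px for n, (px, _) in zip(counts, probes))
    ey = -sum(n / total * py for n, (_, py) in zip(counts, probes))
    return ex, ey


if __name__ == "__main__":
    probes = probe_positions(50.0)   # L = 50 nm, hypothetical value
    molecule = (2.0, 1.0)            # true position in nm, hypothetical
    counts = expected_counts(molecule, probes)
    print(estimate_position(counts, probes))
```

The probe farthest from the molecule collects the most signal, so the weighted mean of the probe positions is pushed away from the molecule; this is the reason for the minus sign in the estimate.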

5.3 Structured Illumination Microscopy

Fig. 43

Structured Illumination optical architecture. It is a camera-based set-up implementing Moiré pattern rotation to extend the spatial frequencies detection beyond the classical limit

Fig. 44

Generation of harmonics by nonlinear fluorescence. a The optical transfer function (OTF) of a conventional microscope, where Kc is the cut-off frequency. b Shifted frequency components of the object collected by the imaging system, grey circle and for a fringe pattern projected in two different orientations, blue and red circles. c The region mapped by the circles, shown in different colours respectively, extends the frequency spectrum of the object after processing/assembling the different orientations of the projected fringe patterns. d The nonlinear dependence of the fluorescent emission (blue) and depletion (red) rate on the illumination intensity in the saturation regime, increasing illumination power. e Simulation of structured illumination sinusoidal pattern \(\times \) profile for STED (in red) and SIM (in blue) respectively. Different lines of the same colour show the effective saturation pattern at increasing illumination peak pulse energy densities of (from the bottom to top curve) 0.25, 1, 4, 16, and 64 times the saturation threshold [141]

Structured illumination microscopy (SIM) is based on the computational reconstruction of an image from a series of images acquired with a known, shifted illumination pattern, Fig. 43. The augmented data set produces a super-resolved image with a spatial resolution beyond the one imposed by the diffraction limit. Such a super-resolved SIM solution was conceived by Mats Gustafsson [86], implementing the information theory concepts analysed and discussed by Lukosz [16], Toraldo di Francia [12] and Sheppard [20]. Here, optical patterns are mixed to generate moiré fringes, a kind of fine-weave cloth pattern. The approach is comparatively simple, since it consists in measuring the spatial frequencies produced by the interaction of controlled patterns with the high spatial frequencies of the real sample, which are inaccessible to a conventional microscope because they are higher than its cut-off frequency, i.e. its resolution limit (Fig. 44a). This method solves for the underlying spatial frequencies present in the specimen [140]. It is an effective approach that can be applied in the whole three-dimensional space occupied by the very same living samples prepared for conventional fluorescence microscopy. Optical sectioning and live-cell imaging are obtained, achieving a spatial resolution beyond Abbe’s limit by extending data collection in Fourier space. The shifted excitation patterns allow some of the latent spatial frequencies of the illuminated specimen to be recovered, Fig. 44b, c. The main limitation comes from the fact that the excitation pattern is delivered using a glass lens. This means that the sharpness of the delivered excitation is still diffraction-limited, and the fluorescence process is linear. However, image reconstruction from a rotated and phase-shifted luminous pattern can produce an isotropic spatial resolution along the x–y–z axes, Fig. 45.
In fact, the resolution enhancement originates from the extraction of high-resolution details encoded in the form of low-resolution moiré fringes. Different optical strategies can be implemented, including different data processing and image reconstruction algorithms [140]. Early approaches within the concept of structured illumination were introduced by Benedetti and colleagues [142], inspired by the pioneering work of Petran [143]. For example, the illumination pattern generated by projecting a movable pinhole mask into the sample can be used, for each pattern position, in a wide-field fluorescence microscope set-up to form a super-resolved image [144]. However, within the structured illumination layout, for fluorescent molecules that respond nonlinearly to the illumination intensity (Fig. 44d), the use of sinusoidal illumination can lead to an emission with contributions from higher harmonics of the illumination pattern frequency, unlike linear SIM, which only delivers contributions from the first harmonic of the incident pattern. Starting with linear SIM, one is still limited by the use of lenses in the expansion of the collectable frequencies from the sample. The fringes occur at the difference between the spatial frequencies of the sample structure and each frequency \(k_{p}\) of the illumination pattern. The strength of this effect is proportional to the spatial frequency of the pattern. Unfortunately, diffraction limits the allowed \(k_{p}\) frequencies as in a classic optical microscope. Therefore, if \(k_{0}\) is the highest observable spatial frequency, then structured illumination makes it possible to observe frequencies up to

$$\begin{aligned} k_{0}~+~k_{p}~\approx ~2k_{0} \end{aligned}$$
(47)

producing a limited gain of about two times the conventional resolution.
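
The frequency mixing behind Eq. (47) can be checked with a one-dimensional toy model: a sample frequency above the cut-off is multiplied by an illumination pattern below the cut-off, and the product contains a moiré component at the difference frequency, which lies inside the passband. The numbers below (240 samples, cut-off at 32 cycles, sample at 50 cycles, pattern at 30 cycles) are hypothetical and chosen only for illustration; a linear fluorescence response is assumed.

```python
import cmath
import math

N = 240          # number of samples along x
K_CUTOFF = 32    # hypothetical detection cut-off k_0, in cycles per field
K_SAMPLE = 50    # sample frequency, above the cut-off
K_PATTERN = 30   # illumination frequency k_p, below the cut-off


def amplitude(signal, k):
    """Magnitude of the k-th DFT coefficient, normalized by the signal length."""
    n = len(signal)
    coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(signal))
    return abs(coeff) / n


def emitted(structured):
    """Fluorescence = sample density x illumination (linear response assumed)."""
    out = []
    for i in range(N):
        x = 2 * math.pi * i / N
        sample = 1 + math.cos(K_SAMPLE * x)
        illumination = 1 + math.cos(K_PATTERN * x) if structured else 1.0
        out.append(sample * illumination)
    return out


moire = K_SAMPLE - K_PATTERN  # difference frequency, below the cut-off

if __name__ == "__main__":
    print("uniform illumination   :", amplitude(emitted(False), moire))
    print("structured illumination:", amplitude(emitted(True), moire))
```

Under uniform illumination the passband carries no trace of the 50-cycle structure, while under structured illumination it appears at 20 cycles, from which the reconstruction can shift it back to its true position.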

Fig. 45

Comparison between conventional wide-field microscopy and 3D-SIM. Imaging of BPAE cell with Mitotracker Red CMXRos (red), Alexa Fluor 488 phalloidin (green) and DAPI nucleus (blue) with wide-field microscopy (a) and 3D-SIM (b). The 3D-SIM image has been collected using the 3D-SIM mode, which allows for optical sectioning. Nucleus structure has always been collected in conventional wide-field mode. Scale bar 5 \(\upmu \)m. The images were taken on Nikon-NSIM microscope at Nikon Imaging Center@IIT, I

This twofold improvement is common to confocal and structured illumination microscopy. Remarkably, it is possible to go even beyond it: the introduction of nonlinearities endows the system with an almost unlimited spatial resolution potential [141]. As happens for STED microscopy, the use of a saturation mechanism introduces strong nonlinearities that allow the transmittable spatial frequencies to be increased [145]. Although we apply a diffraction-limited spatial frequency to the pattern of illumination intensity I(r), what is relevant for imaging is the pattern of emission rate E(r) per fluorophore (the effective excitation). Under intense illumination, the effective excitation E(r) can saturate and thus depend nonlinearly on the illumination:

$$\begin{aligned} E=f(I) \end{aligned}$$
(48)

where f is some nonlinear function. Therefore, E(r) will contain harmonics at multiples of the illumination pattern frequency, e.g., \(2k_{p}\), \(3k_{p}\), ..., corresponding to a resolution of \(k_{0}+2k_{p},k_{0}+3k_{p}\), etc., respectively. If f is non-polynomial, an infinite number of harmonics can be generated, so arbitrary resolution is in principle possible.
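
The appearance of higher harmonics in Eq. (48) can be verified numerically. The sketch below samples one period of a sinusoidal illumination pattern, applies a response f, and measures the amplitude of the first few harmonics. The saturable response \(1-e^{-I}\), with I in units of a saturation intensity, is an assumption made only for this example, as are the function names.

```python
import cmath
import math


def harmonic_amplitudes(response, peak_intensity, n_harmonics=4, n_samples=512):
    """Amplitudes of harmonics 1..n_harmonics of E = f(I) for a sinusoidal I."""
    emission = []
    for i in range(n_samples):
        phase = 2 * math.pi * i / n_samples
        intensity = 0.5 * peak_intensity * (1 + math.cos(phase))  # one pattern period
        emission.append(response(intensity))
    amplitudes = []
    for m in range(1, n_harmonics + 1):
        coeff = sum(e * cmath.exp(-2j * math.pi * m * i / n_samples)
                    for i, e in enumerate(emission))
        amplitudes.append(2 * abs(coeff) / n_samples)
    return amplitudes


def linear(intensity):
    """Linear response: emission proportional to illumination."""
    return intensity


def saturating(intensity):
    """Hypothetical saturable response, intensity in units of the saturation intensity."""
    return 1.0 - math.exp(-intensity)


if __name__ == "__main__":
    print("linear    :", harmonic_amplitudes(linear, 4.0))
    for peak in (0.25, 1, 4, 16, 64):
        print(f"peak {peak:5}:", harmonic_amplitudes(saturating, peak))
```

For the linear response only the first harmonic survives, whereas the saturating response populates the higher harmonics more and more strongly as the peak intensity grows, at least over the range tried here.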

In practice, the saturated excitation generates high fluorescence intensity that surrounds narrow dark regions at the zero nodes of the pattern (Fig. 44e). While in STED microscopy it is the “on” state of a fluorescent probe that is confined at the zero nodes, in saturated SIM it is the “off” state, resulting in a negative signature of the features being imaged. Therefore, while for STED the image acquisition does not require any reconstruction algorithm or deconvolution, with SIM the grid lines need to be rotated several times to retrieve such frequencies, and the collected data are processed mathematically after acquisition to generate the super-resolved image. The structured-illumination approach has spread over different optical microscopy modalities [146]. Among the variations on the theme of SIM set-ups, the use of sensitive cameras [147] and of multiphoton excitation [148] can improve the performance at specimen depths \(>100\,\upmu \)m. To conclude, the main distinctive features of SIM as a solution to improve spatial resolution are that it is compatible with the common organic dyes and fluorescent proteins routinely employed in wide-field optical microscopy, allows live-cell three-dimensional imaging and produces low photodamage [149].

5.4 Expansion microscopy

Expansion microscopy (ExM) is a novel method that, upon a specific sample treatment, allows super-resolved fluorescence imaging with conventional microscopes [150]. It has been demonstrated that, by synthesizing a swellable polymer network within a specimen, the specimen can be physically expanded, resulting in a physical magnification that can be imaged by means of an optical microscope. Figure 46 shows a cartoon of the sample preparation process for expansion microscopy. Sample preparation consists of soaking the biological cells in a gelation solution and inducing polymerization to form a dense meshwork throughout the cell that cross-links the fluorophores. After digestion of the cellular proteins and rehydration of the sample, the image formation process can take place. The swelling of the polymer gel leads to an N-fold isotropic stretching of the sample. The separation between objects that otherwise could not be appreciated becomes appreciable. Recently, interesting variations have coupled expansion microscopy with STED nanoscopy [151]. Despite the loss of the advantage of using a conventional microscope, there is the benefit of reducing the mechanical stretching of the specimen. Figure 46 also shows an example comparing confocal, expansion and expansion-STED microscopy. Moreover, such a coupling has made it possible to identify a nanoscale ruler to address the quantitative question of the strength of the expansion process from millimetres down to the nanoscale [152]. Recently, expanded specimens have been used to improve the resolution and imaging contrast of label-free CIDS [82] scanning microscopy. Such a multimodal approach, applied to the chromatin-DNA organization in intact cells, has been named ExCIDS [153].
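
As a back-of-the-envelope illustration of the N-fold stretching, the resolution referred to the original (pre-expansion) sample can be estimated by dividing the resolution of the microscope by the expansion factor. This assumes a perfectly isotropic, distortion-free expansion; the function name and the numbers are hypothetical.

```python
def effective_resolution_nm(microscope_resolution_nm, expansion_factor):
    """Resolution referred to pre-expansion sample coordinates.

    Assumes an ideal, isotropic N-fold expansion without distortions.
    """
    if expansion_factor < 1:
        raise ValueError("expansion factor should be >= 1")
    return microscope_resolution_nm / expansion_factor


if __name__ == "__main__":
    # Hypothetical example: a 250 nm diffraction-limited microscope, 4-fold expansion.
    print(effective_resolution_nm(250.0, 4.0), "nm")
```

In practice the achievable value also depends on the size of the labels and on the homogeneity of the gel, which this estimate ignores.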

Fig. 46

The sketch depicts the key steps involved in the expansion process. The sample, e.g. a cell, needs to be stained. After staining, a gelation solution containing acrylamide monomers is added, and the gelation process can start. Once gelation is completed and the gel is cross-linked with the proteins and labels in the sample, the proteins should be digested. Finally, the addition of distilled water allows the full isotropic expansion up to 5 times the original size. The image shows the resulting resolution improvement from conventional confocal imaging of the pre-expanded sample to expanded STED (ExSTED) imaging of nuclear pores in a cell. The nuclear pores are immunostained with Alexa488 against Nup153. The final resolution is about 20 nm. The field of view is \(7\,\upmu \)m

6 Correlative nanoscopy

Correlative nanoscopy is a recent term [154] coined for the combination of two or more imaging modalities able to provide information at the nanoscale, including the possibility of exploring living samples. The aim is to provide a powerful combination of high-resolution approaches that merge specificity and detailed structural information [155]. The combination of the atomic force microscope (AFM) [52] with optical nanoscopy offers optical specificity while the sample is scanned by the tip of a sensitive cantilever probe.

Unlike electron microscopy, AFM can deliver images of specimens without requiring a sample preparation that prevents parallel optical imaging. AFM coupled with a super-resolved fluorescence approach gains the specificity that AFM intrinsically lacks, since it relies on generic force interactions, apart from the use of functionalized probes able to interact only with specific molecules [156]. In this scenario, stimulated emission depletion microscopy and stochastic optical reconstruction microscopy are two additions that have proven very useful for identifying species in an AFM image. These combinations open up new functional modalities, like nanomanipulation and fast targeting [157].

Fig. 47

AFM: cantilever and tip, representing the core of the AFM, are shown at left. The precise detection of small cantilever deflection is obtained by using a method based on the optical lever principle. A laser is reflected on the backside of the cantilever itself, hitting a quadrant photodiode (QPD). Deflections induce a change of trajectory of the reflected beam that is clearly revealed by the photodetector. The AFM tip is experiencing different force regimes at a different distance from the sample. The different AFM modes are working at different regimes of forces

6.1 Atomic force microscopy

The atomic force microscope, introduced in 1986, is a scanning probe microscopy approach developed as a consequence of the invention and development of the scanning tunnelling microscope (STM) in 1981 at IBM in Zürich by G. Binnig and H. Rohrer [158], awarded the Nobel Prize in Physics in 1986. The main motivation was related to the attempts to image DNA and other biological molecules that immediately flourished with the advent of the STM [159]. The atomic force microscope, like any scanning probe microscope, does not provide a direct image of an object. It is based on the mechanical interaction with the atomic structure of a surface, studied using a stylus made of a tip attached to a lever that scans the surface and reacts as a function of the distance from it, Fig. 47. This makes it independent of specific properties of the specimen being imaged, giving the chance to study non-conductive materials as well, in an STM-like manner. Its sensitivity to force interactions makes it powerful; on the other hand, the lack of specificity, except when the tip is functionalized to recognize specific molecules, poses problems in the interpretation of the images formed. Vertical adjustment of the stylus is controlled by means of a feedback system sensitive to the attraction or repulsion of the tip when a distance-dependent surface interaction occurs. The spatial resolution is determined by the physical dimensions of the tip, ideally ending with a single atom, the one interacting with the surface atoms of the specimen in air, liquid or vacuum. The use of an extremely sharp stylus with a single-atom apex enables the AFM to follow even the smallest details of the scanned surface by recording the vertical movements of the stylus. This provides a view of the structure of the surface atom by atom, which can be degraded by environmental conditions. Since then, the development of the electron microscope (EM) has been very extensive.
The EM resolving power could be considered theoretically unlimited since the electron is a pointlike particle. Even if one has to take into account a certain uncertainty in the determination of electron positions, which sets a theoretical limit to resolution of the order of 0.5–1 Å for the acceleration potentials normally used, in practice a spatial resolution below 1 Å has been achieved. For this development, Ernst Ruska joined Rohrer and Binnig in the trio of the 1986 Nobel recognition [160]. Likewise, the cryo version of the EM, which makes it possible to freeze information related to biological samples while they still preserve their physiological conditions, recently earned the Nobel award. In fact, the Royal Swedish Academy of Sciences decided to award Jacques Dubochet, Joachim Frank and Richard Henderson the Nobel Prize in Chemistry 2017 for “developing cryo-electron microscopy for the high-resolution structure determination of biomolecules in solution” [161]. However, the AFM is a kind of microscope allowing investigations on living samples, even if atomic or molecular resolution is limited to the surface level. So, since the AFM is directly linked to the STM, we can say that the 1986 Nobel Prize in Physics rewarded two radical leaps in microscope technology that finally allowed us to witness life at the atomic level. This kind of technological jump is perfectly in tune with the observations, predictions and visionary view of Richard Feynman in 1959 and the consequent advance of what we call today nanotechnology [162]. Nanotechnology is strictly related to our ability to observe and to control or manipulate matter at the nanoscale. The concept that matter is made up of tiny atoms has been proposed for millennia [163], but today we can approach the way they are organized using electrons, photons and forces that can be integrated into the very same instrument [164].

Fig. 48

Images of the same area are acquired in different modes. The confocal and STED images are acquired in reflection, while the AFM probe reaches the sample from above, providing a three-dimensional topographical view at high resolution

Fig. 49

Quantification of STED AFM overlay. The image shows confocal AFM overlap plotted with STED AFM image. The AFM image is used to deconvolve the structure of the fluorescence image, and the resultant convolution integral is plotted in the graphs in the bottom panels

6.2 Coupling forces and photons

Now, AFM has the ability to work in liquid and in controlled conditions without limitations on the nature of the sample. It was the first microscope that provided details at the nanometre scale of biological material in a wet/physiological environment. It was possible to image DNA, proteins, molecular assemblies, and cells through different types of imaging modes [165] with a lateral resolution of \(\approx \) 1.1 nm. Exploiting the ability to discriminate forces of the order of tens of piconewtons, the unfolding process of proteins and the dynamic force spectrum of molecular bonds can be characterized. At the cellular level, the mechanical properties of cells, organelles, and membranes have also been widely investigated using AFM [166]. Despite this wide range of results at the nanoscale on biological samples, the AFM is inherently limited by the lack of chemical specificity. It is not able to recognize the different components of a heterogeneous sample. One can endow the AFM with a recognition capability by using functionalized probes, specifically designed to recognize molecular species of interest [167]. The interaction between the tip and the sample can be mediated by a ligand-receptor binding construct [168]. A simple and immediate way to get specificity is the coupling between AFM and fluorescence optical microscopy. Unfortunately, the instrumental integration is constrained by the low optical resolution, so the two instruments work at different spatial resolutions. The coupling between AFM and super-resolved fluorescence microscopy is the key to solving the resolution mismatch, Fig. 48. Both imaging and spectroscopy AFM working modalities have been demonstrated in combination with super-resolution techniques [157]. In spectroscopy mode, the AFM can be employed to test the mechanical properties of a sample by measuring the interaction between tip and sample at single points, acquiring force-distance curves.
The analysis of force-distance curves provides a detailed description of the nature of the interaction between probe and sample: electrostatic interactions, capillary forces, and van der Waals forces can be accurately measured. Quantitative information on sample properties can thus be obtained in AFM force spectroscopy mode. In particular, the acquisition of a dense matrix of force-distance (FD) curves allows both the topographical and the mechanical characterization of the sample with a lateral resolution of the order of tens of nm [154]. Among the super-resolution techniques, both STED and STORM have been applied in combination with AFM for high-resolution identification of the sample structures, which is then correlated with the AFM topography. STED is a scanning technique, whereas STORM uses a widefield scheme and depends on the stochastic switching off of the molecules. STED imaging can gain contrast through several different implementations, such as gated STED or modulated STED. The quality of STORM images, on the other hand, improves with the molecular density of labels, with better single-molecule recognition algorithms, and with a larger number of photons collected per molecule. This makes an optical architecture that allows switching between the STED and STORM approaches highly desirable [157]. AFM-STED was the first combination to demonstrate correlative nanoscopy. STED microscopes provide a fast super-resolution technique because of their inherent, direct resolution enhancement, which allows the AFM to work in a targeted mode and with high precision. The STED effect is limited to 2D, and an AFM tip at a different height cannot interfere with the STED imaging. The results are compared with confocal imaging for 20 nm and 40 nm fluorescent beads in terms of species separation. 
AFM and STED show similar details in the structure; to give a graphical idea of the precision achieved, a side-by-side comparison is shown in which the confocal or the STED image is overlaid on the AFM image. Convolving the fluorescence image with the AFM image and plotting the resulting distribution of the overlap, one finds a point spread function five times narrower than that of the confocal correlative image, Fig. 49. STORM-AFM is another correlative nanoscopy solution, better suited to fixed samples that require molecular-level specificity. It has been applied to the study of cytoskeletal structures, DNA, nuclear membranes, and more. STORM-AFM has been applied to the study of many macromolecules because of its ability to visualize the molecule and its structure. A further strength of such a correlative technique is that each method aids and improves the data obtained with the other. Fluorescence validates and confirms the AFM topography and provides specificity to AFM imaging, while AFM topography reveals possible unlabelled features in the structure and indicates how to avoid labelling errors, reducing the probability of image reconstruction artefacts. A recent article [169] demonstrated that most of the \(\lambda \)-DNA imaged with AFM correlates with the super-resolved image and ruled out negligible effects of incomplete labelling, rapid photobleaching, and poor signal-to-noise ratio leading to exclusion during image analysis. This happens because the presence of fluorophores can influence the dynamics of molecular processes when nanoscale imaging is performed on biological samples. Despite this, an affordable technique to control the fluorophore distribution within the sample, as well as the onset of unpredictable anomalous processes induced by the fluorophore itself, was missing until STED nanoscopy was coupled with AFM to investigate the formation of amyloid aggregates. 
In particular, the in vitro aggregation of insulin and of two alloforms of \(\beta \)-amyloid peptides has been studied, Fig. 50. Standard methods to induce the aggregation and to label the molecules at different dye-to-protein ratios have been used to better understand a process that has a dramatic impact on the development of neurodegenerative pathologies such as Alzheimer’s, Parkinson’s and Huntington’s diseases. The result was that only a fraction of the fibrillar aggregates could be displayed in STED images, indicating that the labelled molecules did not participate indiscriminately in the aggregation process. This means that labelled molecules can follow selected and specific pathways of aggregation among the many that are present in the aggregation reaction. This could lead to a misinterpretation of the optical nanoscopy data [170].
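
The way a dense matrix of force-distance curves yields both a topographical and a mechanical map, as described above, can be illustrated with a minimal sketch. It is not the analysis used in [154]: the function names, the force threshold, and the synthetic data are hypothetical, and each pixel is treated as an ideal linear spring, whereas real analyses usually fit a contact-mechanics model.

```python
import numpy as np


def analyse_fd_curve(z_nm, force_nN, threshold_nN=0.05):
    """Analyse one approach curve.

    z_nm     : tip height samples (nm), in any order
    force_nN : force at each height (nN), close to zero before contact
    Returns (contact_height_nm, stiffness_nN_per_nm), or NaNs if no contact is found.
    """
    z_nm = np.asarray(z_nm, dtype=float)
    force_nN = np.asarray(force_nN, dtype=float)
    in_contact = force_nN > threshold_nN
    if np.count_nonzero(in_contact) < 2:
        return np.nan, np.nan
    # Highest tip position at which the force exceeds the threshold: local topography.
    contact_height = z_nm[in_contact].max()
    # Indentation depth measured from the estimated contact point.
    indentation = contact_height - z_nm[in_contact]
    # Slope of force versus indentation: linear contact stiffness (simplifying assumption).
    stiffness, _ = np.polyfit(indentation, force_nN[in_contact], 1)
    return contact_height, stiffness


def analyse_fd_map(z_nm, force_map_nN, threshold_nN=0.05):
    """Apply analyse_fd_curve to a (rows, cols, samples) matrix of FD curves."""
    rows, cols, _ = force_map_nN.shape
    height = np.full((rows, cols), np.nan)
    stiffness = np.full((rows, cols), np.nan)
    for r in range(rows):
        for c in range(cols):
            height[r, c], stiffness[r, c] = analyse_fd_curve(
                z_nm, force_map_nN[r, c], threshold_nN
            )
    return height, stiffness


# Synthetic 2 x 2 force map: each pixel is an ideal linear spring (hypothetical data).
z = np.linspace(100.0, 0.0, 1001)                      # tip height, nm
true_height = np.array([[20.0, 25.0], [30.0, 35.0]])   # nm
true_stiffness = np.array([[0.5, 0.5], [1.0, 2.0]])    # nN/nm
force_map = true_stiffness[..., None] * np.clip(true_height[..., None] - z, 0.0, None)

height_map, stiffness_map = analyse_fd_map(z, force_map)
```

The two output arrays play the role of the topographical and the mechanical image. In this sketch the contact height is slightly underestimated, by an amount set by the force threshold and the sampling step along z.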

Fig. 50

Amyloid fibrils from bovine insulin labelled with the fluorescent dye ATTO 488 NHS ester at a 1:19 dye-to-protein ratio. The STED and confocal images are overlaid with the AFM topography. The resolution of the STED image is significantly enhanced with respect to the confocal image. At the same time, the AFM provides the topographical image of the same sample area. Some fibrillar aggregates are not displayed in fluorescence microscopy (e.g., white arrow). Scale bar, \(1\,\upmu \)m

Fig. 51

Single-go measurement: multiple movements in one go. The figure shows STED images of 40 nm fluorescent beads. The red and green colour channels encode the time axis, so that any movement appears as a red particle that turned green. Panels from top to bottom show different types of movement assigned to the AFM (white arrow), with the STED image used as a map for the targeting coordinates. These panels show motions such as separation, joining, dissecting, and erasing/kicking

Fig. 52

Coupling force pulling with super-resolved imaging allows the different stages of mechanical interactions controlled through the tip to be mapped

6.3 Nanomanipulation

A further application of coupling forces and photons is nanomanipulation. The AFM has also been employed as a nanomanipulation tool, since the tip can indent, scratch, move, or push on a sample, inducing modification or stimulation even in extremely localized areas. Fig. 51 illustrates a few of the targeted movements possible with the STED-AFM combination. The high precision of STED images can be exploited to perform multiple movements in one go, i.e. AFM imaging is not necessary. Nevertheless, one of the most challenging goals for future developments is the capability not only to drive the nanomanipulation but also to follow in real time the effect induced by the AFM tip on the sample [171]. One can also move towards precise manipulation in a wet environment and on a biological sample. Figure 52 shows a sub-micrometric displacement of a single microtubule fibre induced by the AFM tip. The movement is recorded by STED at video rate while the AFM is used to dissect a single microtubule filament [172]. The demonstrated effectiveness of coupling AFM and optical nanoscopy in the study of biological samples opens an unexpected window onto a new generation of integrated setups able to provide valuable correlative analysis of biosamples. The coupling between AFM and optical nanoscopy offers new capabilities that can be readily applied to stimulus-response experiments in the field of mechano-transduction, including those related to pathological alterations of mechanoreceptor function. Particularly exciting is the idea of acting on single cells and cell compartments with a nanoscalpel, driving modifications at the single-molecule level in living materials.

Fig. 53

Optical nanoscopy allows us to decipher and visualize what our eyes and wide-field microscopes did not allow us to see. The challenging perspective lies in the chance of being quantitative, which goes beyond the mere super-resolved image

7 Conclusions

Optical microscopy techniques based on fluorescence as the mechanism of contrast have today turned into optical nanoscopy methods, which have demonstrated that the spatial resolution frontier lies far beyond Abbe’s famous limit [4]. Figure 53 shows how the different modalities used to follow the route to super-resolution predicted by Giuliano Toraldo Di Francia work towards the very same goal. It is worth noting that optical nanoscopy has also opened a new window on quantitative fluorescence microscopy and poses a new class of questions about the relationship between sample preparation and the formed image, which is the one used to reach decisions and conclusions based on optical microscopy data [173]. The need for a multimodal approach becomes evident, since the light-matter interactions governing image formation produce a number of “invisible” messages that can be used by a new class of multi-messenger optical microscopes [164]. An outstanding and challenging example is the multimodal coupling of fluorescence with polarization-based scattering. Control of the polarization of light has demonstrated the capability to image samples non-invasively, without using any fluorescent dye [174]. More specifically, it is well established that, by analysing the interaction of the left and right circular polarization states with an optically active sample, the emerging circular dichroism (CD) signal carries structural information at the single-molecule level [175, 176]. CD is a diattenuation process, i.e. a differential transmission of light as a function of its properties, that involves circularly polarized light: it is the differential absorption of left- and right-handed circularly polarized light. When the differential signal is observed outside the absorption bands of the molecules [177, 178], we refer to it as Circular Intensity Differential Scattering (CIDS), which typically originates from long-range chiral structures. 
CIDS is a pure diffraction phenomenon that carries information equivalent to that given by CD but is not related to differential absorption. It originates from long-range chiral structures at a scale of 1/20th of the wavelength of the incident light [179], clearly in the optical nanoscopy domain. The CIDS signal is angularly dependent and is determined by the characteristics of the chiral structures, such as their radius and pitch and the compaction of chiral groups [180]. Sharp control of the polarization states at the excitation stage, together with modulation-based detection, allows the weak scattering signal arising from the interaction with the sample to be extracted at a high rate [181]. This has recently been turned into a multimodal approach combining fluorescence and scattering to map DNA compaction in biological cells [82]. Since the quantitative goal is to understand the real chance of deciphering organizational motifs at the nanoscale in a crowded environment, a further step is the modulation of key parameters such as the radius and pitch of DNA clusters, taking advantage of developments in expansion microscopy. The combination of expansion microscopy techniques with optical nanoscopy methods, including STED [152], STORM [182] and SIM [183], has demonstrated access to finer details of several biological structures. Since the density of the chiral structures is reduced when the sample is expanded, the coupling between expansion and CIDS microscopy produces a global improvement in spatial resolution and imaging contrast [153].
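
As a minimal sketch of how a CIDS image can be formed from two recordings, the example below uses the normalised difference (I_L - I_R) / (I_L + I_R) between the intensities scattered under left and right circularly polarized illumination. This normalised form is the one commonly used in the CIDS literature; the text above does not state a formula, so its correspondence to the quantity measured in [181] is an assumption. The function name, the zero-intensity guard, and the data are hypothetical.

```python
import numpy as np


def cids_map(i_left, i_right, min_total=1e-12):
    """Normalised differential scattering, (I_L - I_R) / (I_L + I_R), per pixel.

    Pixels whose total intensity is not above min_total are set to 0,
    to avoid dividing by (near) zero.
    """
    i_left = np.asarray(i_left, dtype=float)
    i_right = np.asarray(i_right, dtype=float)
    total = i_left + i_right
    valid = total > min_total
    out = np.zeros_like(total)
    np.divide(i_left - i_right, total, out=out, where=valid)
    return out


# Hypothetical 2 x 2 intensity images recorded at one scattering angle
# with left and right circularly polarized illumination.
i_l = np.array([[3.0, 1.0], [2.0, 0.0]])
i_r = np.array([[1.0, 1.0], [6.0, 0.0]])
cids = cids_map(i_l, i_r)
```

The result lies between -1 and 1, and its sign indicates which circular polarization is scattered more strongly at that pixel. Because the signal is angularly dependent, such a map refers to a single detection angle.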

This is only the tip of the iceberg when one thinks of merging the advanced optical methods available today in a kind of melting pot that has been named liquitopy, liquid tunable microscopy [164], Fig. 54.

Since “power is nothing without control”, optical access to large, multimodal quantities of data can revolutionize the microscopy world again if we can harness the tools and technology to manage, store and analyse them. The best combination of tools and technology available today is the one coming from the big data world. Data creation, storage, retrieval and analysis are challenges in the development of artificial intelligence strategies that give structure to the immense quantity of data available today [184].

Fig. 54

The liquitopy scenario as a new trend in optical microscopy [173]. FCS, fluorescence correlation spectroscopy; FRET, fluorescence resonance energy transfer; FRAP, fluorescence recovery after photobleaching; FLIM, fluorescence lifetime imaging microscopy; STORM, stochastic optical reconstruction microscopy; GSDIM, ground state depletion imaging microscopy; PALM, photoactivation localization microscopy; SPIM, selective plane illumination microscopy; IML-SPIM, individual molecule localization SPIM; STED, stimulated emission depletion; SPLIT-STED, separation of photons by lifetime tuning STED; 2PE, two-photon excitation; SHG, second harmonic generation; MPE, multiphoton excitation; SW 2PE-STED, single wavelength 2PE-STED; ISM, image scanning microscopy; EM, electron microscopy; SPM, scanning probe microscopy

Fig. 55

Working engine with input data modules expanded into a cripple correction layer that compensates for missing information in each layer of the engine. This “machine” is trained to make a decision within a deep-learning scenario. It is a constitutive element of the multi-messenger microscopy perspective [185]. NLP, natural language processing. Totipotent multimessenger microscope (TOTEM) refers to the microscope containing the different mechanisms of contrast outlined in Fig. 54

The spatial resolution limit of the optical microscope now lies in the available signal-to-noise ratio [186]. In other words, the lack of some information, i.e. of some spatial frequencies in the information transmission channel, does not pose a fundamental limit to the attainable resolution [20]. Optical nanoscopy demonstrates that the missing spatial frequencies, although not detected when forming an optical image, can be extrapolated in different ways. These include the use of cross-modalities that take advantage of the large amount of photon-driven data available when priming light-matter interactions [20]. Today there is a comparatively new and powerful tool that is flexible enough to be exploited in different scientific areas and for a variety of technological applications. It stems from artificial intelligence and is known as deep learning. Deep learning, also referred to as deep structured learning, belongs to a broader family of machine learning methods based on artificial neural networks with representation learning. The adjective “deep” comes from the use of multiple layers in the network. A few of the significant advances of deep learning are already visible in the image recognition and speech recognition programs used in daily life [187]. Deep learning protocols have gained a foothold in most software foundations through open-source software licensing and the availability of deep learning engines and pre-trained models. Deep learning has shown the potential to untangle high-dimensional data with ease, unlike conventional methods. The realisation of a machine learning approach able to process data stemming from imaging, literature, verbal inputs and multiparameter information, and moving towards enforced supervised deep learning layers, is therefore the key to reaching a logical and structured output and to exploiting the potential of using label(s) + label-free data in an artificial intelligence context, i.e. one based on machine learning and deep learning. 
The working engine, reported in Fig. 55, has four input modules: (i) a unifying engine (U) that brings together different fluorescence-based methods; (ii) a natural language processing engine (NLP) that can read and understand the user/client/doctor and process all verbal instructions and comments on the case, including the evolution or history of the subject; (iii) a learning engine (L) that learns from the history of the case and from other complementary microscopy imaging schemes (micro/nano sizes); (iv) a totem engine (T) that works in a multi-scale fashion and can integrate different medical schemes and comments into a unit [188]. The deep learning scenario shown in Fig. 55 is an indispensable component in the design and realization of the global architecture of a modern optical microscope, a multi-messenger microscope.
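
To make the idea of several input modules feeding a multi-layer network more concrete, the following toy sketch averages the feature vectors of whichever modules are available and passes the result through two dense layers. It is not the engine of Fig. 55: the embedding size, the random untrained weights, the three output classes, and the use of a simple masked average in place of the correction layer for missing information are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
MODULES = ("U", "NLP", "L", "T")   # the four input modules named in the text
DIM = 8                            # hypothetical embedding size per module


def fuse(embeddings):
    """Average the embeddings of the modules that are present (not None)."""
    present = [np.asarray(embeddings[m], dtype=float)
               for m in MODULES if embeddings.get(m) is not None]
    if not present:
        raise ValueError("at least one module must provide data")
    return np.mean(present, axis=0)


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    e = np.exp(x - x.max())        # subtract the maximum for numerical stability
    return e / e.sum()


# Two dense layers with random, untrained weights (placeholders for a trained network).
W1, b1 = rng.normal(size=(16, DIM)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)


def decide(embeddings):
    """Return class probabilities for one case from the available modules."""
    hidden = relu(W1 @ fuse(embeddings) + b1)
    return softmax(W2 @ hidden + b2)


case = {m: rng.normal(size=DIM) for m in MODULES}
case["NLP"] = None                 # e.g. no verbal input is available for this case
probabilities = decide(case)
```

With the NLP entry missing, the fused vector is simply the mean of the three remaining modules, so the network still returns a valid probability vector. A trained system would learn both the per-module encoders and the way missing inputs are compensated.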