
Super-resolution optical fluctuation imaging—fundamental estimation theory perspective

Stanisław Kurdziałek and Rafał Demkowicz-Dobrzański

Published 9 June 2021 © 2021 The Author(s). Published by IOP Publishing Ltd

Citation: Stanisław Kurdziałek and Rafał Demkowicz-Dobrzański 2021 J. Opt. 23 075701, DOI 10.1088/2040-8986/ac059c

Abstract

We provide a quantitative analysis of super-resolution imaging techniques which exploit temporal fluctuations of luminosity of the sources in order to beat the Rayleigh limit. We define an operationally justified resolution gain figure of merit, that allows us to connect the estimation theory concepts with the ones typically used in the imaging community, and derive fundamental resolution limits that scale at most as the fourth-root of the mean luminosity of the sources. We fine-tune and benchmark the performance of state-of-the-art methods, focusing on the cumulant-based image processing techniques (known under the common acronym stochastic optical fluctuation imaging), taking into account the impact of limited photon number and sampling time.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The wave nature of light imposes a limit on the resolution achievable by optical microscopes, known as the Rayleigh limit [1]. Nevertheless, over the past 40 years, many techniques, under the common name 'super-resolution imaging' [2–21], have been developed to bypass this limit. Almost all far-field super-resolution methods can be divided into three groups depending on the way in which the assumptions lying behind the derivation of the traditional resolution limits are broken: (i) sample (light emitters) modification [2–9, 22], (ii) outgoing light measurement modification [10–14] or (iii) illuminating light modification with a particular focus on the use of non-classical states of light [15–20].

Methods (ii) and (iii) were largely developed by theorists, and their fundamental potential and limitations are well understood in terms of quantitative concepts from (quantum) information and estimation theories. In particular, by studying basic two (or few) point-source imaging scenarios, optimal resolving protocols have been designed, and rigorous upper bounds on achievable resolution gains derived. Still, due to technical challenges, the practical impact of these methods is debated, and the majority of experimental implementations are proof-of-principle demonstrations rather than versatile imaging systems.

In contrast, methods (i) have been largely developed by experimentalists, are commonly used in modern fluorescent microscopy, and are practical for imaging of 2D, or even 3D, samples with an arbitrarily complex distribution of emitters. A significant portion of these methods make use of temporal correlations of the intensity of each emitter. In methods such as stochastic optical fluctuation imaging (SOFI) [6], stochastic optical reconstruction microscopy (STORM) [4], and photo-activated localization microscopy (PALM) [3], positive temporal correlations, explainable by a classical model of emitters with fluctuating brightness, are utilized. Negative, inherently quantum correlations (anti-bunching) can be used to obtain super-resolution as well [8, 22]. Despite their practical relevance, methods (i) have not been given as much estimation-theoretical attention as methods (ii) and (iii), see [23] for a notable exception. The goal of this paper is to fill this gap by providing a comprehensive study of the so-called SOFI [6]. The introduced framework can, however, be generalized to other super-resolution techniques, e.g. the anti-bunching-based method [8, 22], as we demonstrate in section 5.5.

One of the main challenges in approaching imaging problems from the estimation theory perspective is the complexity of the imaging task when viewed as a multiple-parameter estimation problem [24, 25]. As a result, an estimation-based approach is usually restricted to rudimentary scenarios. The fundamental feature on which these estimation-based studies focus is the drop in the precision of estimating the distance between two identical point sources as the distance becomes comparable to or goes below the Rayleigh limit. Furthermore, the variance of any unbiased estimator of the sources' separation tends to infinity as the separation goes to zero. It was already argued [23, 26] that this statistical phenomenon, called Rayleigh's curse [10], is directly connected to the concept of resolution. Recently, a method which in principle shows no sign of Rayleigh's curse has been proposed [10] and implemented [14]. Unfortunately, Rayleigh's curse eventually always reappears for small enough separations in all realistic scenarios, when the presence of noise is assumed [27]. In practice, super-resolution techniques (including the one studied in this paper) do not provide non-vanishing estimation precision for zero separation, but they do enhance the precision for sub-Rayleigh distances. A similar effect can be achieved by modifying the imaging system so that its point spread function (PSF) gets narrowed. An important contribution of this paper is the proposal of an operationally meaningful quantity that relates the estimation precision enhancement achieved with a given super-resolution technique to the narrowing of the PSF by an equivalent factor. This quantity, which will be defined further on in the text, allows us to connect two different views on resolution—the one based on estimation theory, and the one related to the effective PSF size. This will also allow us to interpret the resolution limits obtained when studying the two-point separation problem as valid (but possibly not tight) limits for imaging more realistic multiple point sources.

Thanks to this connection, it is possible to properly account for the effects of noise, among which the most fundamental is the shot noise resulting from the finite detection statistics. The impact of shot noise is often far from obvious for more sophisticated image reconstruction algorithms, and, as will be discussed below, cannot be ignored even when dealing with bright sources. Furthermore, when finite detection statistics are combined with the finite correlation times of fluctuating emitters, a non-trivial trade-off in the choice of the optimal sampling time arises—the longer the single-frame time, the better the photon statistics, but at the same time the weaker the inter-frame intensity fluctuations.

2. Estimation theory for optical imaging

In this section we demonstrate how estimation theory tools can be used to provide a meaningful resolution gain figure of merit which encompasses the effects of noise and the effective PSF size. We start with a brief review of the ideas lying behind the super-resolving power of the SOFI technique to illustrate the traditional approach to quantifying super-resolution.

2.1. Basics of SOFI

The SOFI method is based on calculating temporal cumulants of the measured intensity distribution in a number of time frames. It is often claimed that the resolution can be increased by a factor $\sqrt{k}$ if the kth cumulant is computed, which is justified as follows. If the imaged sample consists of L independently fluctuating point emitters, then the light intensity observed in the image plane is:

$I(\vec r, t) = \sum_{i = 1}^{L} P_i(t)\, U(\vec r - \vec{r_i})$ (1)

where the stochastic process $P_i(t)$ represents the fluctuating brightness of the ith emitter, $\vec{r_i}$ its position in the image plane, and $U(\vec r)$ is the PSF of the system. Since the emitters are independent, the kth temporal cumulant of the signal (at a given $\vec r$) reads:

$\kappa_k\left[I(\vec r, t)\right] = \sum_{i = 1}^{L} \kappa_k\left[P_i(t)\right]\, U^k(\vec r - \vec{r_i})$ (2)

where $\kappa_k[P_i(t)]$ is the kth cumulant of the stochastic process $P_i(t)$. The PSF is thus replaced by its kth power. If the standard Gaussian approximation of the PSF is used, $U^k$ is narrower than U by a factor $\sqrt k$.
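
To make the narrowing concrete, here is a minimal numerical sketch (ours, not from the original text) of the k = 2 case: for noiseless intensity readouts of a single two-level emitter, the temporal variance map reproduces $U^2$, whose width is smaller by $\sqrt 2$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                                  # PSF width
x = np.linspace(-4, 4, 401)                  # detector coordinate
U = np.exp(-x**2 / (2 * sigma**2))           # Gaussian PSF (peak-normalized)

# single emitter at x = 0 blinking between q_off = 0 and q_on = 1
M_fr = 20000
B = rng.integers(0, 2, size=M_fr).astype(float)   # brightness in each frame
I = B[:, None] * U[None, :]                       # noiseless intensity frames

mean_img = I.mean(axis=0)   # ~ <B> U(x): width of the original PSF
var_img = I.var(axis=0)     # ~ Var(B) U(x)^2: Gaussian narrowed by sqrt(2)

def fwhm(profile):
    """Full width at half maximum of a sampled profile."""
    above = x[profile >= profile.max() / 2]
    return above[-1] - above[0]

print(fwhm(mean_img) / fwhm(var_img))   # ~ sqrt(2) ~ 1.414
```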

The described scheme can be improved in various ways. The most important modification is based on utilizing spatial correlations (cross-cumulants) in the image reconstruction algorithm [28]. Cross-cumulants of kth order are computed for all k-element subsets of all pixels, and the cumulant computed for a given set gives rise to the signal located at its centroid. Cross-cumulants corresponding to the same centroid can be summed directly, or with proper weights, in order to maximize the signal-to-noise ratio (SNR) [29]. This approach not only makes use of information hidden in the spatial correlations, but also increases the number of pixels in the final image, which is significant from the practical point of view. The latter effect can also be achieved if the recorded images are Fourier-interpolated before further processing [30]. A sketch of the centroid-accumulation step for k = 2 is shown below.
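
The following sketch (ours; it assumes a 1D pixel row and plain unweighted summation) illustrates the second-order version of this step: all pixel-pair covariances are computed and accumulated on a twice-denser centroid grid:

```python
import numpy as np

def xc2_image(frames):
    """Second-order cross-cumulant SOFI image on a half-pixel centroid grid.

    frames: (M_fr, M_pix) array of photon counts per frame and pixel.
    Returns an array of length 2*M_pix - 1 whose element i + j holds the
    summed covariances of all pixel pairs (i, j) with centroid (i + j)/2.
    """
    M_fr, M_pix = frames.shape
    C = np.cov(frames.T, bias=True)      # M_pix x M_pix covariance estimators
    out = np.zeros(2 * M_pix - 1)        # centroid grid with spacing dx / 2
    for i in range(M_pix):
        for j in range(M_pix):
            out[i + j] += C[i, j]        # same-centroid covariances are summed
    return out
```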

Unfortunately, even after applying the described modifications, higher cumulants are more noisy, and it is not possible to achieve an unlimited resolution gain in practice—this limitation reappears in all known experiments. Therefore, noise has to be taken into account in order to assess the maximal resolution gain achievable in SOFI. Some analyses of the impact of noise on the computed cumulant estimators have been made [31–33], but these studies have not employed estimation theory concepts such as the Fisher information (FI), and did not attempt to benchmark the performance of the methods against the fundamental limitations imposed by estimation theory. Our goal is to provide such a rigorous study.

2.2. Resolution gain limit

Let us consider the simplest, yet representative, case of imaging a binary object, which consists of two identical point emitters with fluctuating brightness. The two emitters are assumed to lie on a known axis, perpendicular to the optical axis of the imaging system, so the problem becomes 1D, and only transverse resolution is studied. Moreover, we assume that the centroid of the object is also known, and only the distance between the emitters (θ) needs to be estimated, see figure 1 (note that in the case of emitters of different brightness all the reasoning remains basically unaltered, provided one replaces the geometric distance between the sources by a quantity based on the second moment of the intensity distribution, as this is the quantity that is subject to Rayleigh's curse [10, 24, 34]). Given a random vector $\boldsymbol{N}$ that represents the data, distributed according to a probability distribution $p_\theta(\boldsymbol{N})$ which is a function of the estimated parameter, the variance $ \textrm{Var} [ \tilde \theta]$ of any locally unbiased estimator of θ is lower bounded by $(\mathcal F_\textrm{meas})^{-1}$, where

$\mathcal F_\textrm{meas}(\theta) = \left\langle \left( \frac{\partial \ln p_\theta(\boldsymbol{N})}{\partial \theta} \right)^{2} \right\rangle$ (3)

is the FI associated with the whole measurement [35]. For the purpose of comparing different strategies we will use the FI per photon $\mathcal{F}(\theta) = \mathcal{F}_\textrm{meas}(\theta)/\bar{N}$, where $\bar{N}$ is the mean number of photons involved in the experiment.


Figure 1. Imaging model and an overview of the main results. (a) Two point sources stochastically switching between two luminosity levels. (b) The imaging task reduced to the estimation of the separation of two point sources in the regime of overlapping point-spread-functions. By exploiting the full information $\boldsymbol{N}_{\textrm{full}}$ of the numbers of photons $n_{i,m}$ registered in a given pixel and in a given time-frame of duration τ, it is possible to provide an effective enhancement in resolution compared with standard imaging, where the numbers of photons measured in different time-frames are summed. (c) Conceptual representation of different reconstruction methods (utilizing incomplete data $\boldsymbol{N}$) and the corresponding resolution gain limits ζ: M (mean intensity) = SI (standard imaging), M + AC2 (mean + second temporal auto-cumulant analysis), M + XC2 (mean + cross-cumulant analysis).


From now on, we assume that the PSF is Gaussian with standard deviation σ. The FI per one photon for standard imaging (SI) of Poissonian sources with constant brightness [10, 23], $\mathcal F^\textrm{(SI)}$, is sketched as a function of θ in figure 2. A significant drop in estimation precision below the Rayleigh limit is visible. In super-resolution microscopy we are mostly interested in the sub-Rayleigh regime, i.e. we assume that $\theta \ll \sigma$. If no noise apart from shot noise is present, and the effect of the finite spatial resolution of the detector is neglected, $\mathcal F^\textrm{(SI)}$ for small θ can be approximated as (see appendix A.1 for details)

$\mathcal F^\textrm{(SI)}(\theta) = \mathcal F^\textrm{(M)}(\theta) \approx \frac{\theta^2}{8 \sigma^4}$ (4)

where we have also indicated that SI is equivalent to an analysis based only on the mean of the total number of photons (M) collected over the whole duration of the experiment. Now, if the PSF is narrowed by a factor s, the FI in the limit $\theta \rightarrow 0$ increases by a factor $s^4$—this observation allows us to connect the PSF-size approach with the estimation theory approach. If a given super-resolution imaging scheme leads to an increase of the FI from $\mathcal F^\textrm{(SI)}(\theta)$ to $\mathcal F(\theta)$, then for $\theta \ll \sigma$ this change is equivalent to a narrowing of the PSF by a factor

$\zeta = \lim_{\theta \to 0} \left( \frac{\mathcal F(\theta)}{\mathcal F^\textrm{(SI)}(\theta)} \right)^{1/4}$ (5)

The factor ζ will be called the resolution gain limit (RGL) and will serve as our figure of merit to assess the performance of different super-resolution methods.
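
As a sanity check of this definition, the sketch below (ours; it evaluates the FI integral (A.3) numerically for noiseless continuous detection) confirms both the small-θ formula (4) and the fact that narrowing the PSF by s = 1.5 increases the small-θ FI by $s^4$, i.e. gives ζ = s:

```python
import numpy as np

def fisher_si(theta, sigma):
    """FI per photon for direct imaging of two equal sources (numerical (A.3))."""
    x = np.linspace(-12 * sigma, 12 * sigma, 24001)
    U = lambda u: np.exp(-u**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    dU = lambda u: -u / sigma**2 * U(u)           # derivative of the PSF
    p = 0.5 * (U(x - theta / 2) + U(x + theta / 2))
    dp = 0.25 * (dU(x + theta / 2) - dU(x - theta / 2))
    return np.trapz(dp**2 / p, x)

sigma, s, theta = 1.0, 1.5, 0.01
print(fisher_si(theta, sigma) * 8 * sigma**4 / theta**2)              # ~ 1, cf. (4)
print((fisher_si(theta, sigma / s) / fisher_si(theta, sigma))**0.25)  # ~ s
```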


Figure 2. Comparison of the FI associated with the mean intensity analysis (M), the FI for the estimation scheme involving the mean and the 2nd temporal cumulant (M + AC2) and the FI based on the 2nd auto-cumulant only (AC2). For small distance θ, very bright emitters, and strong brightness fluctuations, improvements are equivalent to narrowing of the PSF by $\zeta = \sqrt[4]{2}$. Pixel size Δx = 0.5 σ is assumed.


Admittedly, ζ does not contain all the information on the performance of a given super-resolving technique, as it ignores the behaviour for larger θ (see figure 2), and the behaviour of the technique for more complicated objects. Nevertheless, this quantity captures in a simple way the essence of the super-resolving potential in the basic two-point-source model, and allows us to compare different methods in a well-defined way. Moreover, it also provides a meaningful upper bound on the performance of a method in more complex imaging scenarios, as discrimination of binary objects is a prerequisite for resolving multiple sources. If a given super-resolving method yields a multiple-source image that can be equivalently regarded as one obtained with an SI system with an appropriately narrowed PSF, then clearly, when focusing on any two neighbouring points, we can bound the performance of this method by our limit based on the simplified two-point imaging model. This limit may not be tight, but interestingly, even such an optimistic limit turns out to be lower than the resolution gain predicted by the naive PSF-size analysis in some cases. It is worth pointing out that in all cases examined in this work, the ratio $\left(\mathcal{F}(\theta) /\mathcal{F}^\textrm{(SI)}(\theta) \right)^{1/4}$ decreases with increasing θ. This implies that ζ remains a valid bound on the resolution gain irrespective of the imaged points' separation, and hence can be interpreted as a factor that reveals the maximal potential enhancement of the considered super-resolving method. Finally, maximization of ζ in a given protocol can be regarded as a rule-of-thumb prescription for the choice of parameters that is likely to lead to the optimal performance of the protocol in real-life scenarios.

3. The RGL for 2nd order SOFI

In this section, we introduce a realistic model of the emitters used in typical SOFI experiments. Due to its complexity, the results based on this model will be limited to image reconstruction methods based solely on 1st- and 2nd-order intensity correlations. A simplified model, which allows us to extend the reasoning to higher-order intensity correlations, will be introduced in the next section.

For the binary source considered, we fix the positions of the emitters to be −θ/2 and θ/2. Both emitters are statistically identical and independent. The fluctuations of a single emitter's brightness are described by a stationary Markov process with two possible relative brightness levels $q_\textrm{on}$ and $q_\textrm{off}$ satisfying $q_\textrm{on}+q_\textrm{off} = 1$ and $0 \leqslant q_\textrm{off} \leqslant q_\textrm{on}$. Such a description leads to exponential distributions of the occupation times of the two states, which is observed for many typical dyes [36], and can be used to approximate the power-law blinking of QDs [37]. The two states have lifetimes $\tau_\textrm{on}$ and $\tau_\textrm{off}$, respectively—in the examples studied we will set $\tau_{\textrm{on}} = \tau_{\textrm{off}} = \tau_0$ (some results for $\tau_\textrm{on} \ne \tau_\textrm{off}$ are shown in section 5.3) and τ0 will play the role of an effective unit of time. The number of photons emitted from a single source over a short time δt, for which the relative brightness $q_i$ may be assumed to be fixed, is described by a Poisson distribution with mean $\bar P q_i \delta t$, where $\bar P$ parameterizes (in units of $\tau_0^{-1}$) the average emitter brightness. Light is detected using a camera with pixel size Δx and a total number of pixels $M_{\textrm{pix}}$. No noise apart from shot noise is considered (additional background noise is analysed in section 5.2). In the analysed method it is crucial to track the time dependence of the light intensity, so the whole detection time is divided into $M_\textrm{fr}$ intervals of duration τ, hereinafter called frames. In the end, one obtains the number of photons in each pixel and in each time frame, $n_{i,m}$, where $i \in \{1,\ldots,M_\textrm{pix}\}$ and $m \in \{1,\ldots,M_\textrm{fr}\}$ stand for the pixel and the frame label, respectively. In principle, θ may now be estimated from the raw data $\boldsymbol{N}_{\textrm{full}}$ containing all $n_{i,m}$. At this point, however, we would like to consider scenarios in which particular data analysis algorithms are used. We therefore construct a random vector $\boldsymbol{N}$ which contains the combinations of the variables $n_{i,m}$ that are used in a given θ estimation procedure. Given the probability distribution family $p_\theta(\boldsymbol{N})$, $\mathcal F$ can be computed using (3).
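
The data-generation model just described can be condensed into a short simulation sketch (ours; the parameter names and the sub-frame discretization step n_sub are our own choices, not the paper's):

```python
import numpy as np
from math import erf

rng = np.random.default_rng(1)

def simulate_frames(theta, sigma=1.0, tau_on=1.0, tau_off=1.0, P_bar=300.0,
                    tau=0.5, M_fr=1000, dx=0.5, M_pix=16,
                    q_on=1.0, q_off=0.0, n_sub=20):
    """Photon counts n[m, i] for two blinking emitters at -theta/2 and theta/2.

    Each emitter follows the two-state Markov process (state lifetimes tau_on,
    tau_off); emission during a sub-interval dt is Poissonian with mean
    q * P_bar * dt, and photons are binned into pixels by integrating the
    Gaussian PSF over each pixel (an extra bin collects photons off the chip).
    """
    edges = (np.arange(M_pix + 1) - M_pix / 2) * dx          # pixel boundaries
    cdf = lambda u: np.array([0.5 * (1 + erf(v / (np.sqrt(2) * sigma)))
                              for v in u])
    probs = []
    for x0 in (-theta / 2, theta / 2):
        p = np.diff(cdf(edges - x0))
        probs.append(np.append(p, 1 - p.sum()))              # overflow bin
    dt = tau / n_sub
    p_flip = {True: 1 - np.exp(-dt / tau_on),                # leaving "on"
              False: 1 - np.exp(-dt / tau_off)}              # leaving "off"
    state = [bool(rng.random() < tau_on / (tau_on + tau_off)) for _ in range(2)]
    n = np.zeros((M_fr, M_pix), dtype=int)
    for m in range(M_fr):
        for _ in range(n_sub):
            for s in range(2):
                q = q_on if state[s] else q_off
                k = rng.poisson(q * P_bar * dt)              # photons emitted
                if k:
                    n[m] += rng.multinomial(k, probs[s])[:M_pix]
                if rng.random() < p_flip[state[s]]:
                    state[s] = not state[s]
    return n
```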

Let us restrict our considerations to vectors $\boldsymbol{N}$ of the form:

$\boldsymbol{N} = \frac{1}{M_\textrm{fr}} \sum_{m = 1}^{M_\textrm{fr}} \boldsymbol{v}_m$ (6)

where $\boldsymbol{v}_m = \left[v_{1,m}, \ldots, v_{n,m} \right]^T$ depends on the variables $n_{1,m},n_{2,m},\ldots,n_{M_\textrm{pix},m}$ only, in the same way for each frame. The simplest possible choice, $\boldsymbol{v}_m = \left[n_{1,m}, \ldots, n_{M_\textrm{pix},m} \right]^T$, corresponds to the SI approach, in which only the mean (M) value of the signal is taken into account. To take advantage of fluctuations it is necessary to extend $\boldsymbol{N}$. If we choose $\boldsymbol{v}_m$ consisting of the elements $n_{i,m}^k$ for $i \in \left\{1,\ldots,M_\textrm{pix} \right\}$, $k \in \left\{1,2,\ldots,K \right\}$, it is possible to construct estimators based on the first K auto-cumulants of the signal in each pixel (M + AC2 + ... + ACK), as well as to compute the associated FI. This formalism also allows us to compute $\mathcal F$ when we restrict ourselves to the use of the 2nd auto-cumulant only (AC2), as in the basic SOFI scheme (see appendix A.3 for details).

It is known that the quality of the image in SOFI can be improved if correlations between different pixels are utilized. In order to study the efficiency of this class of strategies, consider a vector $\boldsymbol{v}_m$ comprising the elements $ \left\{\{n_{i,m} \}, \{n_{i,m} n_{j,m} \} \right\}$ for $i,j \in \{1,2,\ldots,M_\textrm{pix} \} $, which allows one to compute a covariance estimator for each pixel pair (M + XC2). Note, however, that in the commonly used cross-cumulant-based approach to image reconstruction in SOFI, one does not use each covariance independently. Instead, the covariances of pairs with the same centroid are summed, and such a sum is treated as a signal located at the given centroid [28] (M + XC2s). In order to investigate how much information is lost in such a summation, we will also compute $\mathcal{F}$ corresponding to $\boldsymbol{v}_m = \left[ S_{1,m}, S_{3/2,m}, \ldots, S_{M_\textrm{pix},m},n_{1,m}, \ldots, n_{M_\textrm{pix},m} \right]^T$, where $S_{l,m} = \sum_{(i+j)/2 = l} n_{i,m} n_{j,m}$.

Note that the elements of $\boldsymbol{N}$ are in general correlated in a very non-trivial way. However, things simplify in the limit $M_\textrm{fr} \rightarrow \infty$, as we can use an extended version of the central limit theorem [38] (valid in our case, when temporal correlations decay exponentially in time) to conclude that $\boldsymbol{N}$ is normally distributed. Consequently, the FI per photon can be computed using the formula involving only the mean value $\boldsymbol{\mu}$ of the distribution and its covariance matrix $\boldsymbol{\Sigma}$ [35]

$\mathcal F_\textrm{meas}(\theta) = \frac{\partial \boldsymbol{\mu}^T}{\partial \theta}\, \boldsymbol{\Sigma}^{-1}\, \frac{\partial \boldsymbol{\mu}}{\partial \theta}$ (7)

which is valid for $M_\textrm{fr} \rightarrow \infty$. $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are computed with the help of the following formulas:

$\boldsymbol{\mu} = \left\langle \boldsymbol{v}_1 \right\rangle$ (8)

$\boldsymbol{\Sigma} = \frac{1}{M_\textrm{fr}} \left[ \textrm{cov}\left(\boldsymbol{v}_1, \boldsymbol{v}_1\right) + \sum_{m = 1}^{\infty} \left( \textrm{cov}\left(\boldsymbol{v}_1, \boldsymbol{v}_{1+m}\right) + \textrm{cov}\left(\boldsymbol{v}_{1+m}, \boldsymbol{v}_1\right) \right) \right]$ (9)

where the homogeneity of the Markov processes was used. Notice that correlations between frames affect $\boldsymbol{\Sigma}$, and therefore have an impact on ζ, even though the statistics associated with these correlations are not directly used in the estimation scheme. See appendix A.3 for more details of the computation of $\boldsymbol{\Sigma}$ and $\boldsymbol{\mu}$.
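
In code, formula (7) amounts to a few lines. The Monte Carlo sketch below (ours) estimates $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ from simulated per-frame statistics and differentiates the mean by finite differences; in the paper both objects are computed exactly, and for correlated frames the inter-frame covariance terms of (9) would have to be added:

```python
import numpy as np

def fisher_gaussian(sample_v, theta, dtheta=1e-2):
    """FI of the whole measurement via (7): F = dmu^T Sigma^{-1} dmu.

    sample_v(theta) must return an (M_fr, dim) array of per-frame statistics
    v_m (e.g. pixel counts and their squares for M + AC2); any frame
    simulator, such as the sketch in this section, can supply it.
    """
    dmu = (sample_v(theta + dtheta).mean(axis=0)
           - sample_v(theta - dtheta).mean(axis=0)) / (2 * dtheta)
    v0 = sample_v(theta)
    Sigma = np.cov(v0.T) / v0.shape[0]        # covariance of the frame average
    return dmu @ np.linalg.solve(Sigma, dmu)
```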

We are now ready to compute $\mathcal F (\theta)$ for different estimation schemes, and to check how the RGL defined in (5) depends on the parameters of the setup. The way in which the RGL depends on the single-frame time τ is particularly interesting. If τ is very long ($\tau \gg \tau_\textrm{on}, \tau_\textrm{off}$), then the fluctuations become averaged out inside each frame, and can hardly be observed. On the other hand, when τ is too short, the information contained in correlations between subsequent frames is lost. In the extreme case in which at most one photon is detected in a single frame, higher cumulants do not provide any extra information compared with the mean value of the signal. Detailed calculations confirm that $\zeta \rightarrow 1$ in the limits $\tau \rightarrow 0$ and $\tau \rightarrow \infty$, both in the case of auto-cumulant and cross-cumulant based estimation. In order to reach the optimal ζ one needs to avoid both extremes and identify the optimal value of τ, which in general depends on the estimation method and the emitters' brightness, see figure 3. Note that the cross-cumulant methods tend to benefit from longer frames, which allow more photons to be collected and effectively reduce the shot noise of the data, while the auto-cumulant method, with its reduced data complexity, favours shorter frames and, as a result, stronger effective brightness fluctuations.


Figure 3. (a) Dependence of RGLs (ζ) on the frame time τ (for $\bar P = 300~\tau_0^{-1}$). (b) Dependence of the optimal time frame $\tau_\textrm{opt}$, for which ζ is maximal, on emitters brightness $\bar P$. Parameters used: $\tau_\textrm{on} = \tau_\textrm{off} = \tau_0$, $q_\textrm{off} = 0$, $q_\textrm{on} = 1$, Δx = 0.5 σ.


With the frame time fixed at its optimal value $\tau = \tau_{\textrm{opt}}$, the dependence of the RGLs on the emitters' brightness $\bar P$ and on the fluctuation strength, defined as $\alpha = 1-q_\textrm{off}/q_\textrm{on}$, is shown in figure 4. The cross-correlation-based approach significantly outperforms the auto-cumulant-based estimation. Moreover, a relevant part of the information is lost if the summation of covariances for pairs with the same centroid is carried out, as in [28]. The use of a properly weighted sum, as proposed in [29], increases the RGL only slightly, as we demonstrate in section 5.4. This indicates room for improvement in the application of cross-cumulant-based methods.


Figure 4.  ζ as a function of α (for $\bar P = 500\,\tau_0^{-1}$) (a), and as a function of $\bar P$ (for α = 1) (b), for different estimation schemes. Black lines correspond to results obtained using realistic blinking model, with $\tau_\textrm{on} = \tau_\textrm{off} = \tau_0$ and $\tau = \tau_\textrm{opt}$. Results obtained with the help of the simplified model (p = 0.5, τ = τ0) are denoted by grey lines. Pixel size: Δx = 0.5 σ.


4. Simplified fluctuations model and maximal RGL

Until now, we have focused on estimation schemes based on 2nd-order correlations. Going beyond this approach, we want to establish the fundamental upper bound on the RGL, $\zeta_\textrm{max}$, which does not depend on the estimation scheme. To do so, we should compute the FI for a model involving all the data $n_{i,m}$, and, moreover, allow both the temporal and spatial resolution of the detector to be unlimited. This task is computationally much more challenging than the previous one, so in what follows we consider a simplified model of fluctuating sources.

Previously, the intervals between subsequent state switches were irregular. Therefore, brightness changes were observed within individual frames, and frames were correlated. From now on, we are going to neglect both of these effects, and assume that the brightnesses of both emitters are drawn independently in each frame, and remain constant within each frame, taking the values $q_\textrm{off}$, $q_\textrm{on}$ with probabilities p, 1 − p, respectively. The number of photons emitted during a single frame from a source with relative brightness $q_i$ is drawn from a Poisson distribution with mean $q_i \bar P \tau$.

Let us now check how the described simplification affects our previous results in the particular case in which the on- and off-states are equally probable. This corresponds to p = 0.5 in the simplified model, and to $\tau_\textrm{on} = \tau_\textrm{off}$ in the realistic one. We choose our parameters such that the average blinking frequency is the same in both models, i.e. the frame time in the simplified model is equal to the emitters' lifetimes in the Markov-process-based model ($\tau = \tau_\textrm{on} = \tau_\textrm{off}$). The frame time in the realistic model is assumed to be optimal, $\tau = \tau_\textrm{opt}$. As can be seen in figure 4, the simplified model tends to overestimate ζ, but qualitatively the dependence of ζ on the different parameters, as well as the ordering of the different methods in terms of their performance, is unaffected.

Unlimited spatial resolution of the detector means that our complete data from each frame comprise the list of all the detected photon positions $x_1,\ldots,x_n$ (detection times do not provide any extra information in this model). Let us assume for a moment that the relative brightnesses of the two emitters (q1, q2) are fixed, and that the total number of photons in a frame, n, is known a priori. Then the conditional probability of measuring a given sequence of photon positions $x_1, \ldots,x_n$ is

$p_\theta(x_1, \ldots, x_n|q_1, q_2, n) = \prod_{j = 1}^{n} p_\theta(x_j|q_1,q_2)$ (10)

where

$p_\theta(x|q_1,q_2) = \frac{q_1 U\left(x + \frac{\theta}{2}\right) + q_2 U\left(x - \frac{\theta}{2}\right)}{q_1 + q_2}$ (11)

The above formulas reflect the fact that subsequent photon positions are uncorrelated if the brightnesses are fixed, and that the probability that a given detected photon was emitted from a given source is proportional to its brightness. In reality, one does not have direct access to the relative brightness values $q_1, q_2$. The observed probability density function (PDF) is averaged over the unknown brightnesses:

$p_\theta(x_1,\ldots,x_n|n) = \sum_{q_1, q_2 \in \{q_\textrm{off}, q_\textrm{on}\}} P(q_1, q_2|n)\, p_\theta(x_1,\ldots,x_n|q_1,q_2,n)$ (12)

where the conditional probability $P(q_1, q_2|n)$ is calculated using Bayes' formula:

$P(q_1, q_2|n) = \frac{P(n|q_1,q_2)\, P(q_1) P(q_2)}{\sum_{q_1', q_2'} P(n|q_1',q_2')\, P(q_1') P(q_2')}$ (13)

Notice that photon positions drawn from the PDF (12) are correlated within a single frame—the correlations arise because the information about the brightness is hidden. The information about the total number of photons per frame, n, is of course available, so it is possible to calculate the FI per frame for each fixed n separately ($\mathcal F_{(n)}$), and then compute the FI per one photon using the formula

$\mathcal F = \frac{\left\langle \mathcal F_{(n)} \right\rangle _n}{\left\langle n \right\rangle _n}$ (14)

where $\left\langle X(n) \right\rangle _n \equiv \sum_n X(n) P(n)$ denotes averaging over n. The above procedure is used to obtain the expression for the complete-data-based FI (see appendix A.2 for details). The corresponding RGL reads

Equation (15)

where $\alpha = 1-q_\textrm{off}/q_\textrm{on}$ is the fluctuation strength, and $G(p,\alpha,\bar P \tau)$ is explicitly defined in (A.29). For our analysis, it is crucial that the function G is increasing with respect to $\bar P \tau$ and is upper-bounded by an expression which does not depend on $\bar P \tau$

Equation (16)

Numerical computations show that the above limit approximates G with an accuracy better than 1% for $\bar P \tau \gtrsim 2500$ in the case of weak fluctuations (α = 0.2), and for $\bar P \tau \gtrsim 50$ in the case of strong fluctuations (α = 1) (see figure A1), in which case the RGL can be approximated as:

Equation (17)

From (15) and (16), we see that $\zeta_\textrm{max} \sim (\bar P \tau) ^{1/4}$ for a large number of photons per frame $\bar P \tau$, for different fluctuation parameters. A similar scaling is observed numerically for the ζ associated with cross-cumulant-based estimation, see figure 5. If, however, only the mean and the 2nd auto-cumulant of the signal are involved in the estimation scheme, $\zeta(\bar P \tau = \infty)$ is finite. In particular, if we restrict our considerations to the symmetric case p = 0.5 (other p values are discussed in section 5.3), no RGL higher than $\sqrt[4]{2}$ can be achieved. This demonstrates that the resolution gain $\sqrt{2}$ predicted by the PSF-narrowing analysis cannot be achieved even for strong fluctuations, very bright sources and a large number of frames. An SI scheme with a PSF narrowed by a factor larger than $\sqrt[4]{2}$ will always outperform 2nd-auto-cumulant-based SOFI if the basic task of resolving two point sources is considered. Interestingly, a similar discrepancy (the RGL is $\sqrt[4]{2}$, not $\sqrt{2}$) can be observed for a simple case of anti-bunching-based imaging, as we will demonstrate in the next section.


Figure 5. (a) ζ in the limit $\bar P \rightarrow \infty$ for the 2nd-auto-cumulant-based methods as a function of α for p = 1/2 (calculated for the simplified model). The RGL is never larger than $\sqrt[4]{2}$. Even for large $\bar P$, the replacement of the mean with the 2nd cumulant is not advantageous unless the fluctuation strength α is large enough ($\zeta \lt 1$ for $\alpha \lt 0.83$). (b) The fundamental upper bound $\zeta_\textrm{max}$ as well as the RGLs associated with utilizing cross-cumulants, rescaled by $(\bar P \tau_0)^{-1/4}$ to indicate the asymptotic behaviour when $\bar P \rightarrow \infty$.


5. Practical aspects and extensions of the model

5.1. The role of the pixel size

In the definition of ζ (5), the same pixel size is used to calculate $\mathcal F$ and $\mathcal F^\textrm{(SI)}$. In order to examine the impact of the pixel size on different methods, we will use a slightly modified figure of merit, defined as:

$\zeta^\textrm{(pix)} = \lim_{\theta \to 0} \left( \frac{\mathcal F(\theta)}{\mathcal F^\textrm{(SI)}_{\Delta x \rightarrow 0}(\theta)} \right)^{1/4}$ (18)

This modification fixes the denominator to the $\mathcal F^\textrm{(SI)}$ associated with infinite spatial resolution of the detector. It allows us to observe, for example, how the SI resolution decreases when Δx is too large. The role of the pixel size becomes less trivial when auto-cumulants are used in the estimation (see figure 6). Pixels of course cannot be too large, but very small pixels are no longer the optimal choice, because the information contained in the correlations between pixels is lost. In particular, higher auto-cumulants do not provide any extra information if at most one photon per frame per pixel is detected. The described problem disappears when cross-cumulants are used, and very small pixels again become advantageous. Notice that a similar trade-off is observed in the time domain when one changes the frame time. Very short time frames would only be optimal if correlations between frames were used, but such schemes are not analyzed in this work.


Figure 6. Figure (a) shows the dependence of $\zeta^\textrm{(pix)}$ on the pixel size Δx for auto-cumulant-based estimation schemes. If we go beyond the standard approach (M), infinitely small pixels are not optimal. Figure (b) shows how $\zeta^\textrm{(pix)}$ is affected when both Δx and τ change, for 2nd-auto-cumulant-based estimation (M + AC2). In (a) the simplified model was used (p = 0.5, $\bar P \tau = 1000$), whereas for (b) the Markov-process-based model with $\tau_\textrm{on} = \tau_\textrm{off} = \tau_0$ and $\bar P = 1000~\tau_0^{-1}$ was applied. In both cases α = 0.9.


5.2. Extra background noise

Even though the shot noise resulting from the finite detection statistics is the most fundamental one, other types of noise (e.g. camera noise) often play a significant role in SOFI. Let us study a simple model in which the background noise is Poissonian, uncorrelated, and its mean value µB is the same in each pixel and in each frame. In order to take such noise into account, it is enough to repeat the whole reasoning described in section 3 with only one modification—the term associated with the background noise should be added to each random variable $n_{i,m}$ (see appendix A.3 for more details).

A non-zero value of µB leads to a decrease of ζ in all examined cases (where SI without background noise is treated as the reference when ζ is computed). It turns out that cumulant-based methods are more robust against this type of noise than SI. As we show in figure 7, the relative decrease of ζ due to the background noise is the highest when only the mean signal is used in the estimation. This result was to be expected, as the authors of the SOFI method claim that background noise reduction is one of its important features [6]. Nevertheless, the background noise affects the performance of each of the analyzed methods.


Figure 7. The relative decrease of the RGL, $\delta \zeta = \frac{\zeta(\mu_B) - \zeta(\mu_B = 0)}{\zeta(\mu_B = 0)}$, is plotted as a function of µB, the mean background noise per pixel per frame. The simplified (independent frames) model is used, with parameters Δx = 0.5 σ, $\bar{n} = 1000$, p = 0.5, α = 1.


5.3. Non-equal states probabilities

So far, we have presented results for $\tau_\textrm{on} = \tau_\textrm{off}$ (realistic model) or p = 1/2 (simplified model). However, it is known that real emitters sometimes break this assumption and favour one of the states. One can observe that if the off-state is more probable, the RGL becomes higher, because more photons are emitted within frames in which the two sources have different brightnesses. One should also take this asymmetry into account when optimizing the frame time for different estimation schemes—see figure 8.


Figure 8. The RGL in the limit of infinitely bright sources and strong fluctuations (α = 1) as a function of $p = P(q_\textrm{off})$ is sketched in (a) (the simplified blinking model is used). One can observe that the estimation schemes based on (AC2) and (M + AC2) are equivalent only for p = 0.5. For $p\gt 0.5$ the RGL can be larger than $\sqrt[4]{2}$, but the (AC2) scheme never allows the $\zeta = \sqrt{2}$ limit to be beaten. The RGLs computed for the realistic model are sketched in (b). Grey lines denote the $\tau_\textrm{off} = 0.4~\tau_0$, $\tau_\textrm{on} = 1.6~\tau_0$ case, whereas black lines denote $\tau_\textrm{off} = 1.6~\tau_0$, $\tau_\textrm{on} = 0.4~\tau_0$; α = 1 in both cases. A significant increase of ζ is possible when the off-state is favoured.


5.4. Weighted summation of cross-cumulants

Let us examine the performance of an improved version of the image reconstruction scheme based on summing the covariances of pixel pairs with the same centroid (M + XC2s). The authors of [29] propose to add weights to the summation procedure in order to maximize the SNR of the reconstructed image. Let $\kappa_1, \kappa_2,\ldots,\kappa_k$ be the covariances contributing to the same centroid. The signal located at this centroid is equal to $S = \kappa_1 + \kappa_2 + \cdots +\kappa_k$ when the basic scheme is considered. More generally, one can use the formula $S = w_1 \kappa_1 + w_2 \kappa_2 +\cdots+w_k \kappa_k$, where $w_i$ are arbitrary weights. As shown in [29], in order to maximize the $\textrm{SNR} = \frac{\left\langle S \right\rangle }{\sqrt{\textrm{Var}(S)}}$, one should choose the $w_i$ satisfying the following conditions:

Equation (19)

where m ∈ {2, 3, ..., k}, and $A_{ij} = \textrm{cov}(\kappa_i, \kappa_j)$. In reality, one obviously has no access to the exact values of $\kappa_i$ and $A_{ij}$, so their estimators must be inserted into (19) in order to obtain the optimal weights. However, in the limit of a large number of collected frames, the difference between the actual values and the estimators does not affect the weights $w_i$ significantly. We will therefore assume that the exact values of $\kappa_i$ and $A_{ij}$ are used in the weight-computing procedure.

We can assume that $w_1 = 1$ without loss of generality, as multiplying each weight by the same factor does not change the information content of the computed linear combination (in particular, the SNR remains the same). It is then straightforward to compute the remaining weights by solving the set of equations (19). Then, the procedure for calculating $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$, and afterwards $\mathcal F$ and ζ, is the same as the one described in appendix A.3—one only needs to replace the sum of the covariances with the proper linear combination.
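
For concreteness, a small sketch of this step (ours): maximizing $\textrm{SNR} = \left\langle S \right\rangle/\sqrt{\textrm{Var}(S)}$ over the weights is a standard matched-filter problem whose solution is $w \propto A^{-1} \kappa$, normalized here to $w_1 = 1$:

```python
import numpy as np

def optimal_weights(kappa, A):
    """SNR-optimal weights for S = sum_i w_i kappa_i with Var(S) = w^T A w.

    kappa: vector of mean covariances contributing to one centroid;
    A: their covariance matrix, A_ij = cov(kappa_i, kappa_j).
    Maximizing <S>/sqrt(Var S) gives w proportional to A^{-1} kappa.
    """
    w = np.linalg.solve(A, kappa)
    return w / w[0]                            # fix the w_1 = 1 normalization

# toy example with two correlated covariance estimators
kappa = np.array([1.0, 0.8])
A = np.array([[0.05, 0.01],
              [0.01, 0.20]])
w = optimal_weights(kappa, A)
print(w, (w @ kappa) / np.sqrt(w @ A @ w))     # weights and the resulting SNR
```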

As we show in figure 9, the improved procedure (M + XC2w) allows only a slight increase of ζ compared to the previously examined summation scheme (M + XC2s). The limit dictated by the approach in which covariances are treated as independent observables (M + XC2) is still far from being achieved. We therefore claim that the problem of finding the optimal way to reconstruct the image using covariances still remains open. This is partially due to the fact that the SNR itself does not provide a full description of the noise when the noise is correlated, as in the analyzed schemes.


Figure 9.  ζ as a function of $\bar{n} = \bar P \tau$ (for α = 1) (a), and as a function of α (for $\bar{n} = 1000$) (b), for different estimation schemes. The improved scheme, based on weighted sums of covariances (M + XC2w), outperforms the basic scheme (M + XC2s) only slightly. The simplified model (independent frames) was used, with parameters p = 0.5, Δx = 0.5 σ.


5.5. Non-classical sources and 3D imaging

The photon number fluctuations utilized in SOFI are described by a super-Poissonian distribution, and can therefore be explained with the help of semi-classical models. However, sub-Poissonian photon number distributions can be used to obtain super-resolution as well. It is believed (and justified by the effective PSF analysis) [39] that the resolution can be increased by a factor $\sqrt 2$ if two emitters are imaged, two-photon frames are observed, and, due to the anti-bunching phenomenon, one can be sure that at most one photon is emitted from a single source within a single frame. Let us challenge this statement, compute ζ for the described case, and check how the fundamental shot noise affects the possibility of obtaining super-resolution. The PDF of measuring a frame consisting of photons at positions $x_1, x_2$ is

$p_\theta(x_1, x_2) = \frac{1}{2}\left[ U\!\left(x_1 - \tfrac{\theta}{2}\right) U\!\left(x_2 + \tfrac{\theta}{2}\right) + U\!\left(x_1 + \tfrac{\theta}{2}\right) U\!\left(x_2 - \tfrac{\theta}{2}\right) \right]$ (20)

After substituting the Gaussian form of U, expanding $p_\theta(x_1,x_2)$ into a series around θ = 0, and using (3), we obtain the formula for the two-photon-frame FI:

$\mathcal F_{(2)} \approx \frac{\theta^2}{2 \sigma^4}$ (21)

The FI per one photon in this case is equal to $\mathcal F = \frac{1}{2} \mathcal F_{(2)}$, which, after using (5), leads to $\zeta = \sqrt[4]{2}$. Despite the optimistic assumptions (no noise apart from shot noise, only two-photon frames), our RGL again turns out to be lower than the resolution gain predicted by the PSF analysis.
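
This value is easy to confirm numerically; the sketch below (ours) evaluates the FI of the two-photon PDF (20) on a grid and recovers $\zeta = \sqrt[4]{2} \approx 1.19$:

```python
import numpy as np

sigma = 1.0
x = np.linspace(-6, 6, 401)
X1, X2 = np.meshgrid(x, x)
U = lambda u: np.exp(-u**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def p2(theta):
    # Eq (20): each emitter contributes exactly one of the two photons
    return 0.5 * (U(X1 - theta / 2) * U(X2 + theta / 2)
                  + U(X1 + theta / 2) * U(X2 - theta / 2))

theta, dth = 0.05, 1e-3
dp = (p2(theta + dth) - p2(theta - dth)) / (2 * dth)
F2 = np.trapz(np.trapz(dp**2 / p2(theta), x), x)  # FI of a two-photon frame
F_per_photon = F2 / 2
F_SI = theta**2 / (8 * sigma**4)                  # Eq (4)
print((F_per_photon / F_SI)**0.25)                # -> 2**0.25 ~ 1.189
```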

The RGL introduced in this paper only applies to the transverse resolution. It is, however, known that SOFI is a 3D method, and as such allows axial resolution to be improved as well. Nevertheless, the most important aspects of SOFI which we want to study (i.e. how noise affects the obtainable resolution) are all well illustrated with our model, in which both sources lie precisely in the image plane. To extend the whole reasoning to the 3D case, one should proceed analogously as in [40]. The information about the axial separation between the sources is encoded in the sizes of the resulting PSFs, which are affected by the deviation from the image plane. When the deviation is very small, the FI associated with its estimation vanishes, because the size of the PSF changes slowly with the axial displacement in this region. As with the transverse resolution, the fluctuations will not result in a non-vanishing FI at zero deviation, but will rather result in larger values of the FI for small, yet non-zero, deviations.

6. Summary

To summarize, we have provided a quantitative approach, based on estimation theory, to computing performance limits of super-resolution imaging methods that utilize the brightness fluctuations of the sources. By focusing on the rudimentary problem of resolving two point sources, we were able to provide a single meaningful quantity that allows one to compare the resolution gains of different methods and to identify the optimal detection frame time. The study has, on the one hand, identified new fundamental limitations of some of the methods (e.g. the 2nd auto-cumulant method), and, on the other, indicated space for improvement of other methods (e.g. cross-cumulant-based methods). Even though the study was based on a two-point-source imaging problem, the obtained resolution limits can be regarded as valid, but possibly not tight, limits also in the more complex multiple-source imaging case. In order to obtain tighter bounds, a more detailed quantitative study would be required, invoking the concepts of multi-parameter estimation theory. There are a number of ways to phrase a complex imaging model as a multi-parameter estimation problem; apart from the most obvious pixel-by-pixel image parametrization, these include focusing on the moments of the spatial intensity distribution [24, 34] or its Fourier components [41]. Generalization of such studies to the case of sources with fluctuating brightness, while non-trivial, seems possible and may provide further insight into the potential as well as the limitations of SOFI and related methods.

Acknowledgment

We thank Konrad Banaszek for fruitful discussions. We acknowledge support from the National Science Center (Poland) Grant No. 2016/22/E/ST2/00559.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Appendix A: Details of Fisher information computations

A detailed derivation of the FI associated with the estimation of the distance between two point sources in different scenarios will be provided in this section. We will start with the simplest, well-known case of non-fluctuating Poissonian sources to justify (4) from the main text. Afterwards, intensity fluctuations will be added to our scheme. The simplified fluctuation model, in which subsequent frames are independent, will be examined. We are going to derive a very general formula for the FI, suitable for different types of intensity fluctuations, not only for the two-level emitters presented in the main text. Then, adequate simplifications will be made to obtain the formula for $\zeta_\textrm{max}$ ((15)–(17)). In the last part of this section, the calculations of the FI associated with different cumulant-based algorithms, for both the simplified and the realistic Markov-process-based models, will be described.

Consider two point emitters placed at −θ/2 and θ/2. The PSF of the imaging system is assumed to be Gaussian with a standard deviation σ:

$U(x) = \frac{1}{\sqrt{2 \pi \sigma^2}}\, e^{-\frac{x^2}{2\sigma^2}}$ (A.1)

Our goal is to compute the FI per one photon, $\mathcal F (\theta)$. To do so, one needs to compute the FI for the whole measurement, $\mathcal F_\textrm{meas}(\theta)$, and then divide it by the average total number of photons. Our task becomes slightly easier if the whole measurement output can be divided into independent, identically distributed parts (e.g. intensities measured in different, independent frames). It is then enough to compute the FI associated with only one of such independent parts, because the FI is additive for independent random variables.

A.1. Non-fluctuating emitters

This case is particularly easy, because subsequent photons are not correlated, and the FI per one photon can be calculated directly. The sources are equally bright, and the spatial resolution of the detector is infinite. Each photon position x is independently drawn from the PDF:

$p_\theta(x) = \frac{1}{2}\left[ U\!\left(x - \tfrac{\theta}{2}\right) + U\!\left(x + \tfrac{\theta}{2}\right) \right]$ (A.2)

Now the FI can be computed with the help of (3), in which the vector $\boldsymbol{N}$ consists of just one element—the detected photon position x. $\mathcal F$ can therefore be expressed as the integral

$\mathcal F(\theta) = \int_{-\infty}^{+\infty} \frac{1}{p_\theta(x)} \left( \frac{\partial p_\theta(x)}{\partial \theta} \right)^{2} dx$ (A.3)

which after substituting the Gaussian form of U(x) simplifies to

$\mathcal F(\theta) = \frac{1}{8 \sigma^4} \int_{-\infty}^{+\infty} \frac{\left[ \left(x - \frac{\theta}{2}\right) U\!\left(x - \frac{\theta}{2}\right) - \left(x + \frac{\theta}{2}\right) U\!\left(x + \frac{\theta}{2}\right) \right]^2}{U\!\left(x - \frac{\theta}{2}\right) + U\!\left(x + \frac{\theta}{2}\right)}\, dx$ (A.4)

To obtain the analytical form of the above integral for $\theta \ll \sigma$, one can expand the integrand in a series around θ = 0, and perform the integration term by term to conclude that

$\mathcal F(\theta) = \frac{\theta^2}{8 \sigma^4} + O\!\left(\theta^4\right)$ (A.5)

which is consistent with (4). In order to obtain the values of $\mathcal F(\theta)$ for larger θ, the introduced integral must be calculated numerically. To compute $\mathcal F$ in the case of a non-zero pixel size Δx, one needs to construct a vector $\boldsymbol{N}$ which consists of the mean values of the signal in the different pixels only, and then proceed as in appendix A.3.

A.2. Fluctuating emitters, independent frames

For the rest of this section, the assumption σ = 1 will be made. Let us consider a simplified model of fluctuations which is slightly more general than the one described in the main text. In each independent frame, the relative brightnesses of the emitters placed at −θ/2 and θ/2, denoted by q1 and q2 respectively, are independently drawn from the same arbitrary probability distribution $P(q_i)$. The frame time and the mean emitter power are denoted by τ and $\bar P$ respectively—for the sake of simplicity we are going to use the quantity $\bar{n} = \bar P \tau$, which is the only relevant quantity as long as the frames are independent, and is proportional to the mean number of photons detected per frame.

As mentioned in the main text, in order to compute the FI per one photon $\mathcal F$, we compute the FI for each fixed number of photons in a frame separately ($\mathcal F_{(n)}$), and then use the formula

$\mathcal F = \frac{\left\langle \mathcal F_{(n)} \right\rangle _n}{\left\langle n \right\rangle _n}$ (A.6)

$\mathcal F_{(n)}$ is the FI associated with the conditional PDF

$p_\theta(x_1,\ldots,x_n|n) = \int dq_1\, dq_2\, P(q_1, q_2|n)\, p_\theta(x_1,\ldots,x_n|q_1,q_2,n)$ (A.7)

Notice that the above equation is a generalised form of (12). Equations (10), (11) and (13) are still valid in the general case, and can be used to compute the conditional probabilities $p_\theta(x_1,\ldots,x_n|q_1,q_2,n)$ and $p_\theta(x_1,\ldots,x_n|n)$. We are now going to find an explicit expression for $p_\theta(x_1,\ldots,x_n|n)$ in order to compute $\mathcal F_{(n)}$ directly from the definition of the FI. To do so, let us begin by inserting (11) and (A.1) into (10). Before performing the product in (10), we expand each factor in a series around θ = 0. Keeping only the leading terms, we obtain

$p_\theta(x_1,\ldots,x_n|q_1,q_2,n) \approx \left( \prod_{j = 1}^{n} U(x_j) \right) \left( 1 + A_1 \theta + A_2 \theta^2 + A_3 \theta^3 + A_4 \theta^4 \right)$ (A.8)

where

Equation (A.9)

Equation (A.10)

and the quantity $Q_k$ is defined as

$Q_k = \left( \frac{q_1 - q_2}{q_1 + q_2} \right)^{k}$ (A.11)

We do not specify the forms of A1 and A3, which are not relevant, as will be argued below. Now we can use (A.7) and (A.8) to obtain $ p_\theta(x_1,\ldots,x_n|n)$. Let us denote the expected value of a function $X(q_1, q_2)$ with respect to $P(q_1, q_2 |n)$ by $\left\langle X \right\rangle _{q|n}$:

$\left\langle X \right\rangle_{q|n} = \int dq_1\, dq_2\, P(q_1, q_2|n)\, X(q_1, q_2)$ (A.12)

Notice that if we replace all the $Q_k$ terms in (A.9) and (A.10) by their mean values $\left\langle Q_k \right\rangle _{q|n}$, we obtain the mean values of the coefficients, $\left\langle A_2 \right\rangle _{q|n}$ and $\left\langle A_4 \right\rangle _{q|n}$. Furthermore:

$p_\theta(x_1,\ldots,x_n|n) \approx \left( \prod_{j = 1}^{n} U(x_j) \right) \left( 1 + \left\langle A_2 \right\rangle_{q|n} \theta^2 + \left\langle A_4 \right\rangle_{q|n} \theta^4 \right)$ (A.13)

We have just used the fact that the odd coefficients $A_1,A_3,A_5,\ldots$ contain only terms proportional to $Q_l$ with odd l. Moreover, from the statistical identity of the two sources it follows that $\left\langle Q_l \right\rangle _{q|n} = 0$ for odd l—that is the reason why the odd coefficients could have been neglected from the beginning, and why only even powers of θ are present in (A.13). We are now going to calculate $\mathcal F_{(n)}$ using (3), which takes the form

$\mathcal F_{(n)} = \int dx_1 \cdots dx_n\, \frac{1}{p_\theta(x_1,\ldots,x_n|n)} \left( \frac{\partial p_\theta(x_1,\ldots,x_n|n)}{\partial \theta} \right)^{2}$ (A.14)

which simplifies to

Equation (A.15)

After inserting the formulas for $\left\langle A_2~\right\rangle _{q|n}$ and $\left\langle A_4~\right\rangle _{q|n}$, and performing the integration, we obtain

Equation (A.16)

Notice that now the condition $\theta \ll 1$ is not sufficient to ensure that the $\theta^4$ term is negligible compared to the $\theta^2$ term, because the powers of n are different in the two terms. The approximation

Equation (A.17)

is nevertheless justified provided $\theta^4 n^3 \ll \theta^2 n^2$, which is equivalent to the condition $\theta \ll n^{-1/2}$. The above inequality holds for all n that give a relevant contribution to the final result if $\theta \ll \bar{n}^{-1/2}$. Equation (A.6) allows us to compute the one-photon FI:

Equation (A.18)

Note that if the emitters do not fluctuate, q1 is always equal to q2, so $\left\langle Q_2 \right\rangle _{q|n} = 0$, and we recover (A.5) up to 2nd order in θ—using (A.16) one can additionally check that the coefficient at $\theta^4$ for the non-fluctuating case is also correctly retrieved. The quantity $\left\langle Q_2 \right\rangle _{q|n}$ can be regarded as a measure of the fluctuation intensity—it becomes larger if the normalized difference between q1 and q2 takes large values with high probability. As intuitively expected, the FI per one photon increases with $\left\langle Q_2 \right\rangle _{q|n}$, as does the RGL defined in (5), which in our case has the form

Equation (A.19)

In order to obtain a more specific expression for $\zeta_\textrm{max}$, let us consider the two-level model of emitters mentioned in the main text, i.e.

$P(q_i = q_\textrm{off}) = p$ (A.20)

$P(q_i = q_\textrm{on}) = 1 - p$ (A.21)

for i ∈ {1, 2}. Recall that $q_\textrm{off}+q_\textrm{on} = 1$, and that the fluctuation strength is defined as $\alpha = 1-q_\textrm{off}/q_\textrm{on}$. It is easy to show that the mean number of photons detected per frame is

$\left\langle n \right\rangle_n = 2 \bar{n} \left[ p\, q_\textrm{off} + (1 - p)\, q_\textrm{on} \right]$ (A.22)

The variable Q2 takes a non-zero value only in the two equally probable cases in which $q_1 \ne q_2$, so

$\left\langle Q_2 \right\rangle_{q|n} = 2 \left( q_\textrm{on} - q_\textrm{off} \right)^2 P(q_1 = q_\textrm{on}, q_2 = q_\textrm{off}|n)$ (A.23)

The probability $P(q_1 = q_\textrm{on},q_2 = q_\textrm{off}|n)$ is calculated using (13), and the fact that

$P(n|q_1, q_2) = e^{-(q_1+q_2) \bar{n}}\, \frac{\left[ (q_1+q_2) \bar{n} \right]^{n}}{n!}$ (A.24)
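
A short sketch of this Bayes step (ours; scipy's Poisson pmf stands in for (A.24)) computes the posterior over the four brightness configurations for a given photon count n:

```python
import numpy as np
from scipy.stats import poisson

def posterior_configs(n, nbar, p, alpha):
    """P(q1, q2 | n) over the four two-level configurations, via (13) + (A.24).

    nbar = P_bar * tau and p = P(q_off); q_on and q_off follow from
    q_on + q_off = 1 and alpha = 1 - q_off / q_on.
    """
    q_on = 1.0 / (2.0 - alpha)
    q_off = (1.0 - alpha) / (2.0 - alpha)
    configs = [(q_off, q_off), (q_off, q_on), (q_on, q_off), (q_on, q_on)]
    priors = np.array([p * p, p * (1 - p), (1 - p) * p, (1 - p) * (1 - p)])
    like = np.array([poisson.pmf(n, (q1 + q2) * nbar) for q1, q2 in configs])
    post = priors * like
    return post / post.sum()

# a frame with ~1.9 nbar photons is strong evidence for the on-on configuration
print(posterior_configs(n=190, nbar=100, p=0.5, alpha=1.0))
```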

The expression present in (A.19) can be written as an infinite sum

Equation (A.25)

which, after inserting (A.23) and making some simplifications (e.g. changing the variables $q_\textrm{on}, q_\textrm{off}$ to α), takes the form

Equation (A.26)

where

Equation (A.27)

and the following definitions are used: $A = \frac{\alpha}{2-\alpha}$, $B = p^2 (1-A)^2$, C = 2p(1 − p), $D = (1-p)^2(1+A)^2$. Joining together (A.19), (A.22), and (A.26), we obtain the following expression:

Equation (A.28)

where

Equation (A.29)

To obtain the value of $\zeta_\textrm{max}$ for arbitrary parameters, one needs to approximate the infinite sum S numerically. However, it is possible to prove (see appendix A.4) that

Equation (A.30)

which allows us to provide an analytical expression for the $\zeta_\textrm{max}$ scaling in the infinitely-bright-sources regime:

Equation (A.31)

Numerical analysis shows that the replacement of S by $C^{-1}$ in (A.28), which leads to the equation

Equation (A.32)

becomes a good approximation for large enough $\bar{n}$. The comparison between $\zeta_\textrm{max}$ computed numerically for finite $\bar{n}$ using (A.28) and its analytical approximation valid for large $\bar{n}$ (A.32) is shown in figure A1.

A.3. Cumulant based algorithms

From now on, the spatial resolution of the camera is not assumed to be infinite, and the whole detection area is divided into $M_\textrm{pix}$ pixels of size Δx. The positions of the centroids of the subsequent pixels are denoted by $x_1,\ldots,x_{M_\textrm{pix}}$. The detection time is divided into $M_\textrm{fr}$ frames, the mth frame covering the time interval $\left[ (m-1) \tau, m \tau \right]$, where $m \in \{1,2,\ldots,M_\textrm{fr} \}$. We are going to consider only the two-level blinking model, and stay with its simplified version with independent frames for a while. Our goal is to compute the FI associated with different choices of the vector $\boldsymbol{N}$. Let us consider the case studied in the main text, in which $\boldsymbol{N}$ is described by (6). Subsequent frames are independent, so the central limit theorem can be used directly to prove that $\boldsymbol{N}$ is normally distributed.

One needs to use slightly more subtle arguments to extend the above reasoning to the estimation based on the 2nd auto-cumulant only (AC2). In this case $\boldsymbol{N}$ consists of the 2nd auto-cumulant (variance) estimators for each pixel

$\kappa_2[n_i] = \frac{1}{M_\textrm{fr}} \sum_{m = 1}^{M_\textrm{fr}} \left( n_{i,m} - \left\langle n_i \right\rangle \right)^2$ (A.33)

where $ \left\langle n_i \right\rangle = \frac{1}{M_\textrm{fr}}\sum_{m = 1}^{M_\textrm{fr}} n_{i,m}$ denotes the mean value estimator. Unfortunately, this estimator depends on the detected photon numbers from different frames, so the vectors $\boldsymbol{v}_m$ are no longer mutually independent. However, in the limit $M_\textrm{fr} \rightarrow \infty$ the mean value estimator becomes very accurate compared to the variability of the number of photons in a given pixel in a single frame, because the variance of $n_{i,m}$ does not depend on $M_\textrm{fr}$, while the variance of $\left\langle n_i \right\rangle$ scales as $1/M_\textrm{fr}$. This means that replacing the mean value estimator with its exact value in $\boldsymbol{v}_m$ does not affect the distribution of $\boldsymbol{N}$ in the limit $M_\textrm{fr} \rightarrow \infty$. The vectors $\boldsymbol{v}_m$ become independent after making the described replacement, which allows us to conclude that $\boldsymbol{N}$ is normally distributed.

In order to compute the FI associated with the normally distributed vector $\boldsymbol{N}$ in the most general case, in which both the mean vector $\boldsymbol{\mu}$ and the covariance matrix $\boldsymbol{\Sigma}$ depend on the estimated parameter θ, one can use the formula [35]

$\mathcal F_\textrm{meas}(\theta) = \frac{\partial \boldsymbol{\mu}^T}{\partial \theta}\, \boldsymbol{\Sigma}^{-1}\, \frac{\partial \boldsymbol{\mu}}{\partial \theta} + \frac{1}{2} \textrm{Tr}\!\left[ \boldsymbol{\Sigma}^{-1} \frac{\partial \boldsymbol{\Sigma}}{\partial \theta}\, \boldsymbol{\Sigma}^{-1} \frac{\partial \boldsymbol{\Sigma}}{\partial \theta} \right]$ (A.34)

If the vectors $\boldsymbol{v}_m$ are independent and identically distributed (i.i.d.), and the mean vector and the covariance matrix of each $\boldsymbol{v}_m$ are denoted by $\boldsymbol{\mu}_{(1)}$ and $\boldsymbol{\Sigma}_{(1)}$, then $\boldsymbol{\mu} = \boldsymbol{\mu}_{(1)}$ and $ \boldsymbol{\Sigma} = \frac{1}{M_\textrm{fr}} \boldsymbol{\Sigma}_{(1)}$. We therefore see that the 2nd term in (A.34) is negligible compared to the 1st term in the limit $M_\textrm{fr} \rightarrow \infty$ (provided the first term is non-zero)—the 1st term grows linearly with $M_\textrm{fr}$ through $\boldsymbol{\Sigma}^{-1}$, while $\boldsymbol{\Sigma}^{-1} \partial_\theta \boldsymbol{\Sigma}$, and hence the 2nd term, does not depend on $M_\textrm{fr}$—so in this limit we may write

$\mathcal F_\textrm{meas}(\theta) \approx \frac{\partial \boldsymbol{\mu}^T}{\partial \theta}\, \boldsymbol{\Sigma}^{-1}\, \frac{\partial \boldsymbol{\mu}}{\partial \theta}$ (A.35)

In order to compute the FI per one photon $\mathcal F$, one needs to compute the elements of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, use (A.35), and then divide $\mathcal F_\textrm{meas}$ by the average total photon number. Let $v_{1,m}, v_{2,m},\ldots,v_{n,m}$ be the elements of the vector $\boldsymbol{v}_m$. In some cases we are going to use the short-hand notation $v_{i,1} \equiv v_i$, $n_{i,1} \equiv n_i$, because the 2nd index can be omitted in many situations when single-frame statistics are considered. Then:

$\mu_i = \left\langle v_i \right\rangle$ (A.36)

$\Sigma_{ij} = \frac{1}{M_\textrm{fr}} \left( \left\langle v_i v_j \right\rangle - \left\langle v_i \right\rangle \left\langle v_j \right\rangle \right)$ (A.37)

In every considered case, each $v_i$ (i ∈ {1, 2, ..., n}) can be written as a linear combination of elements of the form $ n_j^{k_1} n_l^{k_2}$, where $j,l \in \{1,2,\ldots,M_\textrm{pix} \}$, and $k_1,k_2$ are natural exponents (possibly zero). Therefore, it is enough to be able to compute the expected values of the products $ \left\langle n_{j_1}^{k_1} n_{j_2}^{k_2} \ldots n_{j_r}^{k_r} \right\rangle $ for $r \leqslant 4$ to reconstruct all the terms of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. The procedure used to compute these expected values is as follows. The PSFs of both sources are numerically integrated over the different pixels—we construct the variables

$u_j^{(1)} = \int_{x_j - \Delta x/2}^{x_j + \Delta x/2} U\!\left(x + \tfrac{\theta}{2}\right) dx, \qquad u_j^{(2)} = \int_{x_j - \Delta x/2}^{x_j + \Delta x/2} U\!\left(x - \tfrac{\theta}{2}\right) dx$ (A.38)

where $j\in \{1,2,\ldots,M_\textrm{pix}\}$ denotes the pixel label. Now we use the fact that the number of photons detected in each pixel, when the source brightnesses are fixed, is described by a Poisson distribution with the mean value

$\nu_j = \bar P \tau \left( q_1 u_j^{(1)} + q_2 u_j^{(2)} \right)$ (A.39)

where $q_1, q_2 \in \{q_\textrm{off},q_\textrm{on} \}$ denote the relative brightnesses of the 1st and the 2nd emitter, respectively. When external Poissonian noise (studied in section 5.2) is present, the above equation should be modified to

Equation (A.40)

where $\mu_B$ is the mean value of this noise in each pixel in each frame. All further steps remain valid, as the conditional random variables $n_j|q_1,q_2$ are Poissonian and mutually independent both with and without the external noise. The expected value of a product of their powers is

$\left\langle n_{j_1}^{k_1} n_{j_2}^{k_2} \ldots n_{j_r}^{k_r} \,\middle|\, q_1, q_2 \right\rangle = \prod_{i = 1}^{r} M_{k_i}\!\left(\mu_{j_i}\right) \qquad$ (A.41)

where $M_k(\nu)$ denotes the $k$th raw moment of a Poisson distribution with mean value ν, e.g. $M_0(\nu) = 1$, $M_1(\nu) = \nu$, $M_2(\nu) = \nu^2 +\nu$, $M_3(\nu) = \nu^3+3\nu^2+\nu$, $M_4(\nu) = \nu^4+6\nu^3 + 7\nu^2 + \nu$. Equation (A.41) is only valid if the indices $j_1,\ldots,j_r$ are mutually different—if any index repeats, one should replace each expression of the form $n_j^{k_1} n_j^{k_2}$ with $n_j^{k_1+k_2}$, repeat this procedure as long as any repetitions are left, and only then apply (A.41). The steps described so far allow us to compute the conditional expected values. In order to compute the desired expected values $ \left\langle n_{j_1}^{k_1} n_{j_2}^{k_2} \ldots n_{j_r}^{k_r} \right\rangle $, one only needs to average the conditional ones over the four possible configurations of the emitters using the formula

$\left\langle n_{j_1}^{k_1} \ldots n_{j_r}^{k_r} \right\rangle = \sum_{q_1, q_2 \in \{q_\textrm{off}, q_\textrm{on}\}} p(q_1)\, p(q_2) \left\langle n_{j_1}^{k_1} \ldots n_{j_r}^{k_r} \,\middle|\, q_1, q_2 \right\rangle \qquad$ (A.42)

where $p(q_\textrm{on})$ and $p(q_\textrm{off}) = 1 - p(q_\textrm{on})$ denote the single-emitter probabilities of the on- and off-states.
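The closed-form raw moments quoted above are easy to verify; a short sketch (with an arbitrarily chosen mean value) compares a direct summation over the Poisson probability mass function with the formulas for $M_0,\ldots,M_4$:

```python
import numpy as np
from scipy.stats import poisson

nu = 2.7
n = np.arange(0, 200)                    # truncation is safe for this mean
pmf = poisson.pmf(n, nu)
numeric = [float(np.sum(n**k * pmf)) for k in range(5)]
closed  = [1, nu, nu**2 + nu,
           nu**3 + 3*nu**2 + nu,
           nu**4 + 6*nu**3 + 7*nu**2 + nu]
for k, (a, b) in enumerate(zip(numeric, closed)):
    print(f"M_{k}({nu}) = {a:.6f}   (closed form: {b:.6f})")
```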

By performing the described steps numerically, we obtain $\mathcal F(\theta)$ (and consequently ζ) associated with different image reconstruction algorithms in the simplified blinking model.
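For concreteness, the sketch below assembles the main ingredients of this procedure for a toy 1D configuration with a Gaussian PSF: the pixel-integrated PSF weights of (A.38), the conditional Poisson means of (A.39), and the averaging over the four emitter configurations of (A.42) used to build the first and second moments of the photon counts. The PSF shape, pixel grid, brightness values and overall photon-number scale are all assumptions made for illustration, not the parameters used in the paper:

```python
import numpy as np
from scipy.stats import norm

# --- hypothetical setup: Gaussian PSF, 1D pixel grid, two sources ---
sigma_psf = 1.0
pix_edges = np.linspace(-5, 5, 21)       # 20 pixels
x1, x2 = -0.4, 0.4                       # source positions
q_off, q_on, p_on = 0.1, 1.0, 0.3
phot_scale = 50.0                        # photons per frame at q = 1 (assumed)

def U(x0):
    """Pixel-integrated PSF of a source at x0, cf. (A.38)."""
    cdf = norm.cdf(pix_edges, loc=x0, scale=sigma_psf)
    return np.diff(cdf)

U1, U2 = U(x1), U(x2)

def mu_given(q1, q2):
    """Conditional Poisson mean in each pixel, cf. (A.39)."""
    return phot_scale * (q1 * U1 + q2 * U2)

# average over the four on/off configurations of the two emitters, cf. (A.42)
probs = {q_off: 1 - p_on, q_on: p_on}
mean_n  = np.zeros_like(U1)
corr_nn = np.zeros((U1.size, U1.size))
for q1, w1 in probs.items():
    for q2, w2 in probs.items():
        mu = mu_given(q1, q2)
        mean_n  += w1 * w2 * mu
        # <n_j n_l | q> = mu_j mu_l + delta_jl mu_j (independent Poisson pixels)
        corr_nn += w1 * w2 * (np.outer(mu, mu) + np.diag(mu))

print("mean photons per frame:", mean_n.sum())
```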

We will now show how to extend this scheme to the realistic, Markov-process-based model. Let us first recall that the relative brightness of each emitter is described by a Markov process with two states of different relative brightness, $q_\textrm{on}$ and $q_\textrm{off}$. The brightness of each emitter is thus a function of time, $P_i(t)$ ($i \in \{1, 2\}$), which takes two possible values: $q_\textrm{off} \bar P$ and $q_\textrm{on} \bar P$. During a short time interval $\left[t, t+\delta t\right]$ the $i$th emitter emits $P_i(t)\delta t$ photons on average. The lifetimes of the off- and on-states are equal to $\tau_\textrm{off}$ and $\tau_\textrm{on}$, respectively. The probability that a given emitter remains in a fixed state with lifetime τi for a time period t is proportional to exp(−t/τi ). Let us now introduce a more formal description of the Markov process which leads to this exponential behaviour. At any time t, the state of an emitter is described by a vector $\begin{bmatrix} p_\textrm{off} \\ p_\textrm{on} \end{bmatrix}$, where $p_\textrm{off}$ and $p_\textrm{on}$ denote the probabilities of finding the emitter in the off- and on-state, respectively. The time evolution of the emitter state is given by

$\begin{bmatrix} p_\textrm{off}(t + \Delta t) \\ p_\textrm{on}(t + \Delta t) \end{bmatrix} = \boldsymbol{T}(\Delta t) \begin{bmatrix} p_\textrm{off}(t) \\ p_\textrm{on}(t) \end{bmatrix} \qquad$ (A.43)

where $ \boldsymbol{T} (\Delta t)$ is a transition matrix defined as

$\boldsymbol{T}(\Delta t) = \frac{1}{\tau_\textrm{on} + \tau_\textrm{off}} \begin{bmatrix} \tau_\textrm{off} + \tau_\textrm{on} e^{-\lambda \Delta t} & \tau_\textrm{off} \left(1 - e^{-\lambda \Delta t}\right) \\ \tau_\textrm{on} \left(1 - e^{-\lambda \Delta t}\right) & \tau_\textrm{on} + \tau_\textrm{off} e^{-\lambda \Delta t} \end{bmatrix}, \qquad \lambda = \frac{1}{\tau_\textrm{on}} + \frac{1}{\tau_\textrm{off}} \qquad$ (A.44)

It is easy to check that, after a sufficiently long evolution, the state always converges to

$\begin{bmatrix} \tilde p_\textrm{off} \\ \tilde p_\textrm{on} \end{bmatrix} = \frac{1}{\tau_\textrm{on} + \tau_\textrm{off}} \begin{bmatrix} \tau_\textrm{off} \\ \tau_\textrm{on} \end{bmatrix} \qquad$ (A.45)
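This convergence can be verified with a few lines of Python. The sketch below builds the two-state transition matrix from the standard master-equation solution consistent with (A.44) (the ordering $[\textrm{off}, \textrm{on}]$ and the lifetime values are assumptions made for illustration) and propagates an emitter that starts in the off-state:

```python
import numpy as np

tau_off, tau_on = 2.0, 1.0
lam = 1 / tau_off + 1 / tau_on

def T(dt):
    """Two-state transition matrix with exponential dwell times
    (master-equation solution; state ordering [off, on])."""
    pstat = np.array([tau_off, tau_on]) / (tau_off + tau_on)
    P = np.outer(pstat, np.ones(2))      # projector onto the stationary state
    return P + np.exp(-lam * dt) * (np.eye(2) - P)

p0 = np.array([1.0, 0.0])                # start in the off-state
for dt in (0.1, 1.0, 10.0):
    print(dt, T(dt) @ p0)                # converges to [2/3, 1/3]
```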

$\begin{bmatrix} \tilde p_\textrm{off} \\ \tilde p_\textrm{on} \end{bmatrix}$ is the stationary state of the process, and will be used as the initial state in our considerations—if no information about the previous run of the process is available, the probability of finding an emitter in a given state is proportional to that state's lifetime. Using the elements of the transition matrix $\boldsymbol{T}(\Delta t)$, let us define the matrix $\boldsymbol{S}(\Delta t)$:

Equation (A.46)

which allows us to write down formulas for temporal brightness correlations in a compact way:

Equation (A.47)

In the above formula $t_1 \leqslant t_2 \leqslant \cdots \leqslant t_r$, and $\left\langle \bullet \right\rangle $ denotes averaging over the Markov processes $P_1(t)$, $P_2(t)$. The two emitters are independent, so in order to compute a product in which $P_1$ and $P_2$ terms are mixed, one can use the formula

$\left\langle \mathcal G_1\!\left[P_1\right] \mathcal G_2\!\left[P_2\right] \right\rangle = \left\langle \mathcal G_1\!\left[P_1\right] \right\rangle \left\langle \mathcal G_2\!\left[P_2\right] \right\rangle \qquad$ (A.48)

which holds for all functionals $\mathcal G_1$, $\mathcal G_2$. Further on, the following integrals of the correlations over the detection time frames will be useful:

Equation (A.49)

Equation (A.50)

Equation (A.51)

Equation (A.52)
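The exponential decay of the brightness correlations, on which these integrals build, can be illustrated by directly simulating the two-state process. The minimal sketch below (arbitrary lifetimes, brightness values, time step and trace length, chosen only for illustration) compares the empirical normalized autocovariance of a simulated trace with the expected decay $e^{-\lambda t}$, $\lambda = 1/\tau_\textrm{on} + 1/\tau_\textrm{off}$:

```python
import numpy as np

rng = np.random.default_rng(1)
tau_off, tau_on, q_off, q_on = 2.0, 1.0, 0.1, 1.0
lam = 1 / tau_off + 1 / tau_on
dt, steps = 0.01, 200_000

# simulate the two-state telegraph process on a fine time grid
state = rng.random() < tau_on / (tau_on + tau_off)   # stationary start
q = np.empty(steps)
for i in range(steps):
    q[i] = q_on if state else q_off
    tau = tau_on if state else tau_off
    if rng.random() < dt / tau:          # first-order switching probability
        state = not state

# empirical autocovariance vs the exponential prediction exp(-lam * t)
var = np.var(q)
for k in (0, 50, 100, 200):
    emp = np.mean(q[:steps - k] * q[k:]) - np.mean(q) ** 2
    print(f"t={k*dt:4.1f}  empirical={emp/var:+.3f}  theory={np.exp(-lam*k*dt):+.3f}")
```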

At this point, we are prepared to attack the problem of computing $\mathcal F$. Although the frames are now correlated, we still consider vectors $\boldsymbol{N}$ that can be written as in (6), and we can use the central limit theorem in its extended version [38], because the correlations between frames decay exponentially with time. Therefore, it is again enough to calculate $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ associated with $\boldsymbol{N}$, and then use (A.34). The scaling of both terms in this equation remains the same, so the second term again vanishes in the limit $M_\textrm{fr} \rightarrow \infty$. Equation (A.36) is still valid and can be used to calculate $\boldsymbol{\mu}$, but in order to compute the elements of $ \boldsymbol{\Sigma}$ one needs to take into account the correlations between frames:

$\Sigma_{ij} = \frac{1}{M_\textrm{fr}^2} \sum_{m, m^{\prime} = 1}^{M_\textrm{fr}} \left( \left\langle v_{i,m}\, v_{j,m^{\prime}} \right\rangle - \left\langle v_{i,m} \right\rangle \left\langle v_{j,m^{\prime}} \right\rangle \right) \qquad$ (A.53)

In the limit $M_\textrm{fr} \rightarrow \infty$, using the homogeneity of the Markov processes, we can simplify our formula:

$\Sigma_{ij} = \frac{1}{M_\textrm{fr}} \left[ \textrm{Cov}\left(v_{i,1}, v_{j,1}\right) + \sum_{d = 1}^{\infty} \left( \textrm{Cov}\left(v_{i,1}, v_{j,1+d}\right) + \textrm{Cov}\left(v_{i,1+d}, v_{j,1}\right) \right) \right] \qquad$ (A.54)
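The structure of (A.53) and (A.54) can be illustrated by simulating frame-integrated counts driven by a common blinking trajectory. The sketch below uses a deliberately coarse model in which the brightness is constant within each frame (so the within-frame integrals behind the χ variables are ignored); it only demonstrates that the photon counts of nearby frames are correlated, with the correlation decaying with the frame separation. All parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
tau_off, tau_on, T_fr = 2.0, 1.0, 0.5    # lifetimes and frame duration (assumed)
lam = 1 / tau_off + 1 / tau_on
M_fr, nbar, q_off, q_on = 100_000, 5.0, 0.1, 1.0

p_on = tau_on / (tau_on + tau_off)       # stationary on-probability
stay = np.exp(-lam * T_fr)               # probability of keeping the current state
q = np.empty(M_fr)
state = rng.random() < p_on              # stationary initial state
for m in range(M_fr):
    q[m] = q_on if state else q_off
    # with probability 1 - stay, resample from the stationary distribution;
    # this kernel reproduces the two-state chain of (A.44) exactly
    if rng.random() > stay:
        state = rng.random() < p_on
n = rng.poisson(nbar * q)                # conditionally Poissonian frame counts

for d in (0, 1, 2, 5, 20):
    cov = np.mean(n[:M_fr - d] * n[d:]) - np.mean(n) ** 2
    print(f"frame lag {d:2d}:  Cov(n_m, n_(m+d)) ≈ {cov:+.4f}")
```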

Analogously to the previous case, the photon numbers in different pixels and time frames are uncorrelated and described by a Poisson distribution once the functions $P_1(t), P_2(t)$ are fixed, and we have:

Equation (A.55)

Conditional products of the variables $n_{j,m}$ are computed with the help of the formula

$\left\langle n_{j_1,m_1}^{k_1} n_{j_2,m_2}^{k_2} \ldots n_{j_r,m_r}^{k_r} \,\middle|\, P_1(t), P_2(t) \right\rangle = \prod_{i = 1}^{r} M_{k_i}\!\left(\mu_{j_i, m_i}\right) \qquad$ (A.56)

valid if the pairs $(j_i, m_i)$ are mutually distinct, i.e. differ in at least one index. All terms of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are linear combinations of the expectation values $\left\langle n_{j_1, m_1}^{k_1} n_{j_2, m_2}^{k_2} \ldots n_{j_r, m_r}^{k_r} \right\rangle $ (in which the averaging over the Markov processes is included). We restrict ourselves to vectors $ \boldsymbol{N} $ which consist of at most 2nd order correlations, so from now on we assume that $k_1+k_2+ \cdots +k_r \leqslant 4$. Then, after expanding the RHS of (A.56) and using the formulas for the Poisson distribution moments, we see that all conditional expected values are linear combinations of products of $\mu_{j,m}$ with at most four factors. The expectation values required to reconstruct $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ are linear combinations of similar products averaged over the Markov processes. Such products can be written with the help of the variables χ defined in (A.49)–(A.52):

Equation (A.57)

Equation (A.58)

Equation (A.59)

Equation (A.60)

The notation $+(1 \leftrightarrow 2)$ means that the terms with indices 1 and 2 (corresponding to the 1st and the 2nd emitter) swapped should be added.

Let us summarize the procedure used to compute $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ for the Markov-process-based model. First, (A.36) and (A.54) are applied, and all terms are expressed as linear combinations of the expected values $\left\langle n_{j_1, m_1}^{k_1} n_{j_2, m_2}^{k_2} \ldots n_{j_r, m_r}^{k_r} \right\rangle $. Then, the conditional expected values $\left\langle n_{j_1, m_1}^{k_1} n_{j_2, m_2}^{k_2} \ldots n_{j_r, m_r}^{k_r}|P_1(t), P_2(t) \right\rangle $ are computed with the help of (A.56). The averaging over the Markov processes $P_1(t)$, $P_2(t)$ is performed after expanding the RHS of (A.56); formulas (A.57)–(A.60) are then utilized to express $\left\langle n_{j_1, m_1}^{k_1} n_{j_2, m_2}^{k_2} \ldots n_{j_r, m_r}^{k_r} \right\rangle $ in terms of the variables $U_{1,j}$, $U_{2,j}$ and χ (defined in (A.49)–(A.52)). Notice that the sum present in (A.57) is infinite, but in our case all such sums converge. Moreover, in the considered cases it is possible to express all the terms of $\boldsymbol{\mu}$ and $ \boldsymbol{\Sigma}$ with the help of the variables $U_{1,j}, U_{2,j}$ (computed numerically), χ1, χ2,1, χ3,1, χ4,1, and the following infinite sums (all of which converge):

Equation (A.61)

Equation (A.62)

Equation (A.63)

Equation (A.64)

where the χ variables are expressed analytically as functions of the setup parameters with the help of (A.47) and (A.49)–(A.52); the sums $S_1, S_2, S_3, S_4$ are then also computed analytically (they can be expressed as sums of geometric series).

A.4. Proof of (A.30)

In this section we provide a rigorous proof of (A.30). For α = 0 we have A = 0, the value of the sum does not depend on $\bar{n}$ and can be computed directly, so the proof is trivial. For $0\lt \alpha \leqslant 1$ we need to prove that

Equation (A.65)

where $0\lt A\leqslant 1$, $B,D \geqslant 0$, $C\gt 0$. Let us define

Equation (A.66)

We are going to use the following property of the Poisson distribution:

$\forall_{\epsilon \gt 0} \quad \lim_{\bar{n} \rightarrow \infty}\; \sum_{n = (1-\epsilon) \bar{n}}^{(1+\epsilon) \bar{n}} \frac{e^{-\bar{n}}\, \bar{n}^{n}}{n!} = 1 \qquad$ (A.67)

Intuitively, for large enough $\bar{n}$, all probable values of a Poisson-distributed variable are located in the range $\bar{n} (1 \pm \epsilon)$, because the standard deviation of a Poisson distribution with mean $\bar{n}$ is $\sqrt{\bar{n}}$. Let us notice that $X_n(\bar{n} ) \leqslant C^{-1} \frac{\bar{n}^n e^{- \bar{n}}}{n!}$, so the following inequality holds:

Equation (A.68)

which after using (A.67) allows us to conclude that:

Equation (A.69)
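The concentration property (A.67) that drives this bound can be checked directly with scipy; a minimal sketch with a fixed ε:

```python
import numpy as np
from scipy.stats import poisson

eps = 0.05
for nbar in (1e2, 1e3, 1e4, 1e5):
    lo = np.ceil((1 - eps) * nbar)
    hi = np.floor((1 + eps) * nbar)
    # probability mass of a Poisson variable inside nbar * (1 ± eps)
    mass = poisson.cdf(hi, nbar) - poisson.cdf(lo - 1, nbar)
    print(f"nbar={nbar:9.0f}   mass in nbar*(1±eps) = {mass:.6f}")
```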

Let us now fix $\epsilon , \delta_1 \gt 0$ satisfying

Equation (A.70)

Equation (A.71)

For A = 1 the first inequality may be omitted. It is easy to show that such a choice of positive constants ε and $\delta_1$ is always possible. This choice allows us to conclude that for $n \in \left[ \bar{n} (1-\epsilon), \bar{n} (1+\epsilon) \right]$:

Equation (A.72)

Equation (A.73)

Therefore,

Equation (A.74)

and consequently

Equation (A.75)

Since $e^{-\delta_1 \bar{n}} \rightarrow 0$ as $\bar{n} \rightarrow \infty$, and the sum $\sum_{n = (1-\epsilon) \bar{n}}^{(1+\epsilon) \bar{n}} \frac{e^{- \bar{n}} \bar{n} ^n}{C n!}$ converges to $C^{-1}$ (as a consequence of (A.67)), we can use the squeeze (sandwich) theorem to finally conclude that

Equation (A.76)

Figure A1. The comparison between $\zeta (\bar{n})$ computed numerically (black lines) and its analytical approximation (grey lines) for different p and α values is shown in (a). The relative error $\delta G = \frac{G(p,\alpha, \infty) - G(p, \alpha, \bar{n})}{G(p,\alpha, \bar{n})}$ as a function of $\bar{n}$ is plotted in (b).
