Error analysis for denoising smooth modulo signals on a graph

https://doi.org/10.1016/j.acha.2021.11.005Get rights and content

Abstract

In many applications, we are given access to noisy modulo samples of a smooth function with the goal being to robustly unwrap the samples, i.e., to estimate the original samples of the function. In a recent work, Cucuringu and Tyagi [11] proposed denoising the modulo samples by first representing them on the unit complex circle and then solving a smoothness regularized least squares problem – the smoothness measured w.r.t. the Laplacian of a suitable proximity graph G – on the product manifold of unit circles. This problem is a quadratically constrained quadratic program (QCQP) which is nonconvex, hence they proposed solving its sphere-relaxation leading to a trust region subproblem (TRS). In terms of theoretical guarantees, 2 error bounds were derived for (TRS). These bounds are however weak in general and do not really demonstrate the denoising performed by (TRS).

In this work, we analyse the (TRS) as well as an unconstrained relaxation of (QCQP). For both these estimators we provide a refined analysis in the setting of Gaussian noise and derive noise regimes where they provably denoise the modulo observations w.r.t. the 2 norm. The analysis is performed in a general setting where G is any connected graph.

Introduction

Many modern applications involve the acquisition of noisy modulo samples of a function f:RdR, i.e., we obtainyi=(f(xi)+ηi)modζ;i=1,,n for some ζR+, where ηi denotes noise. Here, amodζ lies in the interval [0,ζ) and is such that a=qζ+(amodζ) for an integer q. This situation arises, for instance, in self-reset analog to digital converters (ADCs) which handle voltage surges by simply storing the modulo value of the voltage signal [17], [26], [35]. In other words, if the voltage signal exceeds the range [0,ζ], then its value is simply reset via the modulo operation. Another important application is phase unwrapping where one typically obtains noisy modulo 2π samples. There, one usually seeks to infer the structure of an object by transmitting waveforms, and measuring the difference in phase (in radians) between the transmitted and scattered waveforms. This is common in synthetic radar aperture interferometry (InSAR) where one aims to learn the elevation map of a terrain (see for e.g. [13], [36]), and also arises in many other domains such as MRI [15], [20] and diffraction tomography [25], to name a few.

Let us assume ζ=1 from now. Given (1.1), one can ask whether we can recover the original samples of f, i.e., f(xi)? Clearly, this is only possible up to a global integer shift. Furthermore, answering this question requires making additional assumptions about f, for instance, we may assume f to be smooth (for e.g., Lipschitz, continuously differentiable etc.). In this setting, when d=1, it is not difficult to see that we can exactly recover the samples of f when there is no noise (i.e., ηi=0 in (1.1)) provided the sampling is fine enough. This is achieved by sequentially traversing the samples and reconstructing the quotient qiZ corresponding to f(xi) by setting it to qi1+ai1,i where ai1,i equals either (a) 0 (if yi is “close enough” to yi1), or (b) ±1 (if yi is sufficiently smaller/larger than yi1). If f is smooth, then a fine enough sampling density will ensure that nearby samples of f have quotients which differ by either 1,1 or 0. This argument can be extended to the general multivariate setting as well (see [12]).

While exact recovery of f(xi) is of course no longer possible in the presence of noise, one can show [12] that if ηimod1 is not too large (uniformly for all i), and if n is sufficiently large, then the estimates f˜(xi) returned by the above procedure are such that (up to a global integer shift) for each i, |f˜(xi)f(xi)| is bounded by a term proportional to the noise level. This suggests a natural two stage procedure for estimating f(xi) – first obtain denoised estimates of f(xi)mod1, and in the second stage, “unwrap” these denoised modulo samples to obtain the estimates f˜(xi).

This was the motivation behind the recent work of Cucuringu and Tyagi [10], [11] that focused primarily on the first (modulo denoising) stage, which is an interesting question by itself. Before discussing their result, it will be convenient to fix the notation used throughout the paper.

Notation. Vectors and matrices are denoted by lower and upper case symbols respectively, while sets are denoted by calligraphic symbols (e.g., N), with the exception of [n]={1,,n} for nN. The imaginary unit is denoted by ι=1. For a,b0, we write ab when there exists an absolute constant C>0 such that aCb. If ab and ab, then we write ab. For uCn, we write u=Re(u)+ιIm(u) where Re(u),Im(u)Rn denote its real and imaginary parts respectively. The Hermitian conjugate of u is denoted by u, and up denotes the usual p norm in Cn for 1p.

Cucuringu and Tyagi [11], [10] considered the problem of denoising modulo samples by formulating it as an optimization problem on a manifold. Assume f:[0,1]dR, and suppose that xi's form a uniform grid in [0,1]d. The main idea is to represent each sample yi via zi=exp(ι2πyi) and observing that if xi is close to xj, then exp(ι2πf(xi))exp(ι2πf(xj)) holds provided f is smooth. In other words, the samples z=(z1,,zn) lie on the manifoldCn:={uCn:|ui|=1;i=1,,n} which is the product manifold of unit radius circles, i.e., Cn=C1××C1. Hence, by constructing a proximity graph G=([n],E) on the sampling points (xi)i=1n – where there is an edge between i and j if xi is within a specified distance of xj – they proposed solving a quadratically constrained quadratic program (QCQP)mingCngz22+γgLg where L is the Laplacian of G, and γ>0 is a regularization parameter. While the objective function is convex, this is a non-convex problem due to the constraint set Cn. It is not clear whether the global solution of (QCQP) can be obtained in polynomial time. Hence they proposed relaxing Cn to a sphere leading to the trust region subproblem (TRS)mingCn:g22=ngz22+γgLgmingCn:g22=n2Re(gz)+γgLg.

Trust region subproblems are well known to be solvable in polynomial time, with many efficient solvers (e.g., [14], [1]). In [11], [10], (TRS) was shown to be quite robust to noise and scalable to large problem sizes through extensive experiments. On the theoretical front, a 2 stability result was shown for the solution gˆ of (TRS). To describe this result, denote hCn to be the ground truth samples where hi=exp(ι2πf(xi)), and define the entry-wise projection operator CnCn [21](g|g|)i:={gi|gi|;gi0,1; otherwise, i=1,,n. Consider the univariate case with f:[0,1]R, and assume that the (Gaussian) noise ηiN(0,σ2) i.i.d. for i=1,,n. Assuming f is M-Lipschitz, and σ1, denote Δ to be the maximum degree of G with xi's forming a uniform grid on [0,1]. Then provided γΔ1, the bounds in [11, Theorem 14] imply1 that with high probability, the solution gˆ of (TRS) satisfiesgˆ|gˆ|h22σn+γM2Δ3n. On the other hand, one can show that zh22σ2n with high probability if σ1, hence we cannot concludegˆ|gˆ|h2zh2. This motivates the present paper which seeks to identify conditions under which (1.4) holds. In fact, we will consider a more abstract problem setting where G is any connected graph, and hCn is smooth w.r.t. G. This is formally described below, followed by a summary of our main results.

Consider an undirected, connected graph G=([n],E) where E{{i,j}:ij[n]} denotes its set of edges. Denote the maximum degree of G by △, and the (combinatorial) Laplacian matrix associated with G by LRn×n. We will denote the eigenvalues of L byλn=0<λn1=λminλn2λ1 where λmin denotes the Fiedler value of G, and is well known to be a measure of connectivity of G. The corresponding (unit 2 norm) eigenvectors of L will be denoted by qjRn; j=1,,n where we have that qn=1n(1,,1)T.

Let h=[h1,,hn]TCn be an unknown ground truth signal which is assumed to be smooth w.r.t. G in the sense thathLh={i,j}E|hihj|2Bn. Here, Bn will typically be “small” with the subscript n depicting possible dependence on n. We will assume that information about h is available in the form of noisy zCn. Specifically, denoting ηiN(0,σ2); i=1,,n to be i.i.d. Gaussian random variables, we are givenzi=hiexp(ι2πηi);i=1,,n. Given access to the noisy samples zCn, and the graph G, we aim to answer the following two questions.

  • 1.

    Consider the unconstrained quadratic program (UCQP) obtained by relaxing the constraints in (QCQP) to CnmingCngz22+γgLg. This is a convex program, also referred to as Tikhonov regularization in the literature (e.g., [31]), and its solution is given in closed form by gˆ=(I+γL)1z. Under what conditions can we ensure that gˆ satisfies (1.4)?

  • 2.

    Under what conditions can we ensure that the solution gˆ of (TRS) satisfies (1.4)?

As we will see in Section 4, the solution of (TRS) is closely related to that of (UCQP) but requires a more careful analysis.

Remark 1

It is worth mentioning that the quadratic penalty term γgLg in (UCQP) and (TRS) aims to promote solutions gˆ which are smooth w.r.t G. This is natural given that the ground truth signal hCn is assumed to be smooth w.r.t. G, as stated in (1.5). Moreover, the choice of the regularization parameter γ is important since larger values of γ increase the smoothness of the estimate (larger bias), while smaller values increase the variance of the estimate. For e.g., if we set γ=0, which is equivalent to removing the regularization term, then we obtain gˆ=z. The main point of the ensuing analysis is to find a suitable “intermediate” choice of γ which ensures that gˆ satisfies (1.4).

Example: denoising modulo samples of a function. We will also seek to instantiate the above results for the setup in [11] described earlier. More precisely, denoting f:[0,1]R to be a M-Lipschitz function where |f(x)f(y)|M|xy| for each x,y[0,1], let xi=i1n1 for i=1,,n be a uniform grid. For each i, we are givenyi=(f(xi)+ηi)mod1;ηiN(0,σ2), where ηi are i.i.d. Then denoting zi=exp(ι2πyi) and hi=exp(ι2πf(xi)), we will consider G=([n],E) to be the path graph (Pn), whereE={{i,i+1}:i=1,,n}.

Before stating our results, let us define for any λ[λmin,λ1], the setLλ:={j[n1]:λj<λ} consisting of indices corresponding to the “low frequency” part of the spectrum of L. Moreover, while all our results are non-asymptotic, we suppress constants in this section for clarity. The reader is referred to Appendix A for a tabular summary of the main notation used in the paper.

Results for (UCQP). We begin by outlining our main results for the estimator (UCQP). Theorem 1 below identifies conditions on σ under which (UCQP) provably denoises the samples in expectation. The statement is a simplified version of Theorem 4.

Theorem 1

For any given ε(0,1) and λ[λmin,λ1] satisfying 1+|Lλ|εn, suppose that 1ελBnnσ1. Then for the choice γ(σ2nBnλ2)1/4, the solution gˆ of (UCQP) satisfies Egˆ|gˆ|h22εEzh22.

The parameter λ depends on the spectrum of the Laplacian and can be thought of as the “cut-off frequency”. As can be seen from Theorem 1, we would ideally like λ to be “large” and |Lλ| to be “small”; in particular, |Lλ|=o(n). For e.g., when G is the complete graph2 (Kn), then λn=0 and λi=n for i=1,,n1. Hence, the choice λ=n is ideal as it leads to |Lλ|=0. Furthermore, the noise regime in the Theorem involves a lower bound on σ which might seem unnatural. We believe this is likely an artefact of the analysis involving the estimation error. Specifically, the error bound we obtain is of the form (see Lemma 3.1)Egˆ|gˆ|h22σλBnn+σ2(1+|Lλ|). The problematic term is the first term which depends linearly on σ. Indeed, since zh22σ2n w.h.p., we end up with the lower bound requirement when σ1 for ensuring the denoising guarantee. Nevertheless, if 1ελBnn=o(1) as n increases, the requirement on σ is of the form o(1)σ1 which becomes progressively mild as n increases.

We also derive conditions under which denoising occurs with high probability3 (w.h.p.). Theorem 2 below is a simplified version of Theorem 6.

Theorem 2

For any given ε(0,1) and λ[λmin,λ1], suppose thatmax{1ελBnn,lognεn}σ1and1+|Lλ|+(1+|Lλ|)lognεn. If γ(σ2nBnλ2)1/4, then w.h.p., the solution gˆ of (UCQP) satisfies gˆ|gˆ|h22εzh22.

The conditions in Theorem 2 are slightly more stringent compared to those in Theorem 1 due to the appearance of extra log factors. These arise due to the concentration inequalities used in the analysis for bounding the estimation error, see Theorem 5.

In Section 3, we interpret our results for special graphs4 such as Kn,Pn and the star graph5 (Sn); see Corollary 2, Corollary 3. In particular, the statement for Pn therein can be applied to the example from Section 1.2, readily yielding conditions that ensure denoising; see Corollary 4. It states that if lognnσ1 and n1, then for γ(σ2n10/3M2)1/4 the solution gˆ of (UCQP) satisfies (w.h.p.)gˆ|gˆ|h22(σM+σ2)n2/3+logn. Hence for any ε(0,1), in the noise regime max{Mεn1/3,lognεn}σ1, if n(1/ε)3, then (UCQP) denoises z w.h.p.

Results for (TRS). We now describe conditions under which the estimator (TRS) provably denoises zCn w.h.p. Theorem 3 below is a simplified version of Theorem 8.

Theorem 3

For any ε(0,1), given k[n1] s.t. λnk+1<λnk and λ[λmin,λ1] with the choice γ(σ2nBnλ2)1/4, suppose that the following conditions are satisfied.

  • (i)

    Bnmin{nλnk,nλ}, and 1+|Lλ|+(1+|Lλ|)lognεn.

  • (ii)

    σmin{ε,λλnk+12Bnn} andσmax{(lognn)1/2,Bnnλnkε,lognnε,1ελ(Bnn+λnk+12nBn)}.

Then the (unique) solution gˆCn of (TRS) satisfies gˆ|gˆ|h22εzh22 w.h.p.

The above statement is admittedly more convoluted than that for (UCQP) which is primarily due to the more intricate nature of the estimation error analysis. An interesting aspect of the above result is that we can consider the “best” choice of k (satisfying λnk+1<λnk) which leads to the mildest constraints on σ and Bn. Note that since G is connected (by assumption), k=1 always satisfies this condition (0=λn<λn1). This leads to the following useful Corollary which is a simplified version of Corollary 6.

Corollary 1

For any ε(0,1) and λ[λmin,λ1] with the choice γ(σ2nBnλ2)1/4, suppose that the following conditions are satisfied.

  • (i)

    Bnnλmin and 1+|Lλ|+(1+|Lλ|)lognεn.

  • (ii)

    max{(lognn)1/2,Bnnλminε,lognn,1ελBnn}σε.

Then the (unique) solution gˆCn of (TRS) satisfies gˆ|gˆ|h22εzh22 w.h.p.

It is natural to ask whether Corollary 1 is – for all practical purposes – sufficient, compared to Theorem 3. For the complete graph Kn, observe that the only valid choice is k=1, hence we need Corollary 1. However, as we will see in Section 4, it turns out that for the path graph Pn, Corollary 1 leads to unreasonably strict conditions on σ and Bn (see Remark 5). In fact for the path graph, λnk+1<λnk holds for each k[n1], hence a careful choice of k in Theorem 3 is key to obtaining satisfactory conditions, see Corollary 8.

In the setting of the example from Section 1.2, Corollary 8 roughly states that for any ε(0,1), in the noise regime (lognn)1/2σε with n large enough and a suitably chosen γ, (TRS) denoises z w.h.p. This condition on σ is visibly stricter than that for (UCQP); we believe that this is likely an artefact of the analysis. Nevertheless, for this example, we see for ε fixed and n that both (TRS) and (UCQP) provably denoise z w.h.p. in the noise regime o(1)σ1.

Outline of the paper. The rest of the paper is organized as follows. Section 2 introduces some preliminaries involving certain intermediate facts and technical results that will be needed in our analysis. Section 3 contains the analysis for (UCQP) while Section 4 analyzes (TRS). Section 5 contains some simulation results on synthetic examples for (UCQP) and (TRS). We conclude with Section 6 which contains a discussion with related work from the literature, and directions for future work.

Section snippets

Preliminaries

This section summarizes some useful technical tools that will be employed at multiple points in our analysis. We begin by deriving some simple consequences of the smoothness assumption in (1.5).

Smooth phase signals. Similar to (1.8), for λ[λmin,λ1], let us define the setHλ:={j[n1]:λjλ} consisting of indices corresponding to the “high frequency” part of the spectrum of L. Then using (1.5) and the fact h22=n, it is not difficult to establish thatjHλ|h,qj|2Bnλ,jLλ{n}|h,qj|2nBnλ.

Error bounds for (UCQP)

We now analyse the quality of the solution gˆ=(I+γL)1z of (UCQP). To begin with, we have the following Lemma that bounds the error between gˆ|gˆ| and h for any zCn.

Lemma 1

For any zCn and λ[λmin,λ1], it holds thatgˆ|gˆ|h228(E1+E2) whereE1=|qn,e2π2σ2zh|2+1(1+γλmin)2jLλ|qj,e2π2σ2zh|2+e2π2σ2zh22(1+γλ)2,E2=2γ2Bn(1+γλmin)2.

Proof

From the definition in (1.2), observe that gˆ|gˆ|=e2π2σ2gˆ|e2π2σ2gˆ| where we recall gˆ=(I+γL)1z. We can writee2π2σ2gˆh=(I+γL)1(e2π2σ2zh)e1+(I+γL)1hhe2

Error bounds for (TRS)

We now proceed to derive a 2 error bound for the solution gˆ of (TRS). Following the notation in [11], we can represent any xCn via x¯R2n, where x¯=[Re(x)TIm(x)T]T. Moreover, consider the matrixH=(γL00γL)=γ(1001)LR2n×2n where ⊗ denotes the Kronecker product. Then it is easy to check that (TRS) is equivalent tomingR2n:g22=ngTHg2gTz where gˆ is a solution of (TRS) iff gˆ¯=[Re(gˆ)TIm(gˆ)T]T is a solution of (TRSR).

The following Lemma from [11], which in turn is a direct consequence

Simulations

We now provide simulation results on some synthetic examples. For concreteness, we consider the following functions.

  • 1.

    f1(x)=3xcos2(2πx)sin2(2πx)+0.7,

  • 2.

    f2(x)=sin(2πx).

The function f1(x)mod1 is relatively more complicated than f2(x)mod1 as the former has more number of “folds” or “jumps” than the latter. Following the notation and setup in the example described in Section 1.2, we sample the functions on a uniform grid in [0,1] (containing n=500 points) according to the Gaussian noise model in

Discussion

We now discuss in detail some related work and conclude with directions for future work.

Acknowledgements

I would like to thank Stéphane Chrétien and Michaël Fanuel for carefully reading a preliminary version of the draft, and for providing useful feedback; Alain Celisse for the very helpful technical discussions during the early stages of this work.

References (36)

  • D.P. Towers et al.

    Automatic interferogram analysis techniques applied to quasi-heterodyne holography and ESPI

    Opt. Lasers Eng.

    (1991)
  • S. Adachi et al.

    Solving the trust-region subproblem by a generalized eigenvalue problem

    SIAM J. Optim.

    (2017)
  • M. Belkin et al.

    Regularization and semi-supervised learning on large graphs

  • P.C. Bellec

    Concentration of Quadratic Forms Under a Bernstein Moment Assumption

    (2019)
  • A. Bhandari et al.

    On unlimited sampling

  • A. Bhandari et al.

    Unlimited sampling of sparse signals

  • A. Bhandari et al.

    Unlimited sampling of sparse sinusoidal mixtures

  • Stéphane Boucheron et al.

    Concentration Inequalities: A Non Asymptotic Theory of Independence

    (2013)
  • A.E. Brouwer et al.

    Spectra of Graphs

    (2012)
  • S. Chavez et al.

    Understanding phase maps in MRI: a new cutline phase unwrapping method

    IEEE Trans. Med. Imaging

    (2002)
  • M. Cucuringu et al.

    On denoising modulo 1 samples of a function

  • M. Cucuringu et al.

    Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping

    J. Mach. Learn. Res.

    (2020)
  • M. Fanuel et al.

    Denoising modulo samples: k-NN regression and tightness of SDP relaxation

    Inf. Inference: J. IMA

    (2021)
  • L.C. Graham

    Synthetic interferometer radar for topographic mapping

    Proc. IEEE

    (1974)
  • W.W. Hager

    Minimizing a quadratic over a sphere

    SIAM J. Optim.

    (2001)
  • M. Hedley et al.

    A new two-dimensional phase unwrapping algorithm for MRI images

    Magn. Reson. Med.

    (1992)
  • H.Y.H. Huang et al.

    Path-independent phase unwrapping using phase gradient and total-variation (TV) denoising

    Opt. Express

    (2012)
  • W. Kester

    Mt-025 tutorial ADC architectures VI: Folding ADCs. Analog Devices

    (2009)
  • Cited by (3)

    View full text