Abstract
The angular synchronization problem of estimating a set of unknown angles from their known noisy pairwise differences arises in various applications. It can be reformulated as an optimization problem on graphs involving the graph Laplacian matrix. We consider a general, weighted version of this problem, where the impact of the noise differs between different pairs of entries and some of the differences are erased completely; this version arises, for example, in ptychography. We study two common approaches for solving this problem, namely eigenvector relaxation and semidefinite convex relaxation. Although some recovery guarantees are available for both methods, their performance is either unsatisfactory or the guarantees are restricted to unweighted graphs. We close this gap, deriving recovery guarantees for the weighted problem that are completely analogous to those for the unweighted version.
1 Introduction
In this paper we consider the problem of recovering a d-dimensional vector of angles \(\varphi \in [0,2\pi )^{d}\) from noisy pairwise differences of its entries \(\varphi _{\ell } - \varphi _{j} + \eta _{\ell ,j} {\text {mod}}2\pi ,\ \ell ,j \in \{1,\ldots ,d\}\), where \(\eta _{\ell ,j}\) denotes noise. This problem is commonly referred to as angular synchronization or phase synchronization. It frequently arises in various applications such as recovery from phaseless measurements [1, 9, 14, 17, 20, 22,23,24], ordering of data from relative ranks [7], digital communications [15], jigsaw puzzle solving [12] and distributed systems [10]. The problem of angular synchronization is also closely related to the broader problem of pose graph optimization [5], which appears in robotics and computer vision, and to group synchronization problems [19, 21].
Rather than working with the angles \(\varphi _{\ell }\) directly, one typically considers the associated phase factors \({x}_{\ell } := e^{i \varphi _{\ell }},\ \ell \in \{1,\ldots ,d\}\). Hence the vector \({x}=(x_{j})_{j=1}^{d}\) to be recovered belongs to the d-dimensional torus
After this transformation, the pairwise differences \(\varphi _{\ell } - \varphi _{j} {\text {mod}}2\pi \), \(\ell ,j \in \{1,\ldots ,d\}\) take the form of a product
where \(z^{*}\) stands for the complex conjugate of the number z and the complex conjugate transpose in the case of vectors. The angular synchronization problem clearly has no unique solution, as multiplying the vector x by a factor \(e^{i\theta }\) leads to the same product \(x_{\ell } x_{j}^{*}\). Hence we can at best recover x up to a global phase factor, that is, two solutions \(x,x'\in {\mathbb {C}}^{d}\) are to be considered equivalent if \(x=e^{i\theta }x'\) for some \(\theta \in [0,2\pi )\). A natural distance measure between two equivalence classes is given by
A solution to the angular synchronization problem is thus any vector for which this expression vanishes.
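Since the display for (1) is not reproduced above, the following sketch records the distance commonly used in this line of work, \(\min _{\theta } \left\Vert x - e^{i\theta } x' \right\Vert _{2}\), which vanishes exactly when the two vectors agree up to a global phase (the function name is ours, purely illustrative):

```python
import numpy as np

def phase_dist(x, xp):
    """Distance between the equivalence classes of x and xp under a
    global phase: min over theta of ||x - exp(i*theta) * xp||_2.
    The optimal theta aligns xp with x, i.e. exp(i*theta) = sgn(xp^* x)."""
    inner = np.vdot(xp, x)                 # xp^* x (vdot conjugates its first argument)
    theta = np.angle(inner)
    return np.linalg.norm(x - np.exp(1j * theta) * xp)
```

For unimodular x and x' this equals \(\sqrt{2d - 2|x^{*}x'|}\), which is how the alignment phase is found in closed form.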
In many applications such as certain algorithms for ptychography [9, 14, 22,23,24], noisy observations of only a strict subset of the set of differences are available. To mathematically describe this restriction we will work with the quantity
In these ptychography applications, one also encounters a version of the problem that is generalized in yet another way. Namely, the entries of the vector y to be recovered are not all of modulus 1 (but still assumed to be known). The measurements are still of the form \(y_{j} y_k^{*}\) affected by noise. Clearly this generalized problem can be directly reduced to the angular synchronization problem in its original form by just dividing each measurement by the product of the known magnitudes of the associated entries, but one should note that the noise is also affected by this transformation.
We will now present a short overview of the major developments in angular synchronization. The approaches to the problem mainly split into two dominant branches, which essentially differ by the underlying noise model.
In the first branch, it is assumed that the observed pairwise products of the unknown phase factors are affected by independent Gaussian noise. Typically these results work with \(E=\{(\ell ,j) \mid \ell ,j \in \{1,\dots , d\},\ j\ne \ell \}\), i.e., assuming control of the full set of pairwise differences. That is, the matrix of measurements \(\hat{X}\) is given by
where \(\Delta \) is a \(d \times d\) Hermitian matrix with \(\Delta _{\ell ,\ell } = 0\) and \(\Delta _{\ell ,j},\ \ell > j,\) being independent centered complex Gaussian random variables with unit variance and \(\sigma >0\). This noise model allows one to perform maximum likelihood estimation, which leads to the least squares problem (LSP)
with weights \(w_{\ell ,j} = 1/\sigma ^{2}\), \(\ell \ne j\) and \(w_{\ell ,\ell } = 0\). Due to the condition \(z \in \mathbb {T}^{d}\), the LSP (3) is NP-hard [29]. Therefore, Singer [27] proposed two possible relaxations, the eigenvector relaxation (ER) and the semidefinite convex relaxation (SDP). Both will be discussed in Sect. 2.
By a closer inspection of the maximum likelihood approach, Bandeira, Boumal and Singer [2] were able to establish an error bound for the solution of the LSP (3) which holds with high probability. In addition the authors gave sufficient conditions on the standard deviation \(\sigma \) under which the SDP recovers the solution of the LSP (3). As an alternative to the relaxation approaches Boumal [4] proposed an iterative approach called generalized power method (GPM) to solve the LSP directly. He showed that the method converges to the minimizer of (3). Later Liu et al. [18] provided additional details about the convergence rate of the GPM. In subsequent work [30], Zhong and Boumal extended the admissible range of \(\sigma \) providing near-optimal error bounds for solutions of the LSP, ER and the GPM and improved the sufficient conditions for the tightness of the SDP relaxation. Another iterative approach to angular synchronization based on cycle consistency and message passing was proposed in [16] and was connected to the iteratively reweighted least squares algorithms in [26].
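A minimal sketch of the projected power iteration at the heart of the GPM (the initialization and step details of [4, 18] are simplified here; all function names are ours, purely illustrative):

```python
import numpy as np

def entrywise_sgn(v):
    """sgn(v): v / |v| entrywise, with the convention sgn(0) = 1."""
    a = np.abs(v)
    return np.where(a > 0, v / np.where(a > 0, a, 1.0), 1.0 + 0j)

def gpm(X_hat, iters=50):
    """Simplified generalized power method for the LSP: start from the
    leading eigenvector of the measurement matrix (spectral
    initialization) and iterate the projected power step
    z <- sgn(X_hat z)."""
    _, vecs = np.linalg.eigh(X_hat)        # ascending eigenvalues
    z = entrywise_sgn(vecs[:, -1])         # leading eigenvector, projected
    for _ in range(iters):
        z = entrywise_sgn(X_hat @ z)
    return z
```

In the noiseless case with full measurements the iteration has the ground truth (up to a global phase) as a fixed point, since \(\hat{X} z = c\, x\) for \(z = c\, x\).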
For the variant of the angular synchronization problem where the vector y to be recovered does not only have entries of modulus one, this theory does not directly apply, as the added Gaussian noise undergoes an entrywise rescaling and hence no longer has the same variance for all entries. The least squares approach, however, has a natural generalization. In analogy to (3), where all differences are multiplied by the inverse of the variance of the i.i.d. noise variables, one weights each difference with the inverse of the variance of the corresponding rescaled noise term, which yields a linear scaling in \(|y_{j} y_{\ell }|\). While this method is not covered by the theory just discussed, it serves as an important motivation for the approach of this paper.
The second branch of development for the angular synchronization problem works with the model that the angular differences rather than the associated phase factors are affected by noise. This version of the problem has also been studied for more general sets E. Consequently, the matrix of measurements \(\hat{X}\) in this model is given by the entries
where \(\eta _{\ell ,j}\) corresponds to the angular noise, or
when the entries to be recovered are not of unit modulus.
Under this model, random noise is somewhat harder to study due to the multiplicative structure. Consequently, most works employ an adversarial noise model, making no assumptions on the distribution of the noise, so maximum likelihood estimation is no longer applicable. Nevertheless, weighted least squares minimization (3) can still be applied without the statistical justification, and a natural choice for the weights remains \(w_{j,k} = |y_{j} y_k|\). This is in line with the observation that if for two vectors y and \({\tilde{y}}\) the modulus of each entry agrees, then smaller entries play less of a role in determining distance in the sense of (1). Moreover, the expansion
motivates considering recovery guarantees for scaled norms of the form
with \(d \times d\) diagonal scaling matrix S taking the role of the squared magnitudes \(|y_{\ell }|^{2}\). For ptychography applications, the inclusion of these weights has also been shown numerically to be beneficial for the overall reconstruction (see Section 4.4 in [24]).
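As the display of (3) is not reproduced above, we record the weighted least squares objective in the form commonly used in this line of work, \(\sum _{\ell \ne j} w_{\ell ,j} |\hat{X}_{\ell ,j} - z_{\ell } z_{j}^{*}|^{2}\) (the exact normalization is an assumption of this sketch):

```python
import numpy as np

def lsp_objective(W, X_hat, z):
    """Weighted least squares objective of the LSP:
    sum_{l,j} w_{l,j} * |X_hat_{l,j} - z_l * conj(z_j)|^2,
    with W symmetric, nonnegative and zero on the diagonal."""
    return float(np.sum(W * np.abs(X_hat - np.outer(z, z.conj())) ** 2))
```

At z = x with noiseless data the objective vanishes, which is the baseline against which the noisy objective is compared later in the paper.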
For the multiplicative noise model, several error bounds have been presented in the literature. Iwen et al. [14] worked with the unweighted LSP (3) and established recovery guarantees for the ER based on Cheeger’s inequality [3]. Later in [24], Preskitt developed error bounds for the unweighted case of the LSP. He additionally developed alternative bounds for any selection of weights in the problem (3) and provided sufficient conditions for tightness of the SDP relaxation.
In the literature, the SDP relaxation is studied more often, as under certain conditions it recovers a true solution of the optimization problem (3). On the other hand, it is computationally heavy, and above a certain noise level the relaxation is no longer tight, so the SDP fails to return the exact solution of the LSP. Thus beyond this threshold, no recovery guarantees for the SDP are available. ER, in contrast, is much faster, especially for large dimension d, and its recovery guarantees, where available, are not restricted by tightness assumptions. Before this paper, however, such guarantees were only available in unweighted scenarios, even though SDP and ER exhibit similar reconstruction accuracy in numerical experiments.
In this paper, we close this gap, providing recovery guarantees for weighted angular synchronization via eigenvector relaxation from measurements of the form (4), following the setup of [14, 24]. In addition, the obtained results are generalized to include bounds for the reconstruction error in the scaled norms (6). We numerically demonstrate that our guarantees are even tighter than the best known guarantees for the unrelaxed problem LSP. Along the way, we also establish improved bounds for the LSP.
2 Problem Setup and Previous Results
We study the problem of recovering a vector \(x=(x_{j})_{j=1}^{d}\) with unimodular entries \(x_{j}=e^{i\varphi _{j}}\) from partial and possibly noisy information on the pairwise differences \(x_{\ell } x_{j}^*=e^{i(\varphi _{\ell }-\varphi _{j})}\) for all pairs \((\ell ,j)\) in some set \(E \subset [d]\times [d]\). Here we used the notation \([n]=\{1,\dots , n\}\). As we consider angular noise, the noisy observations will take the form \(e^{i(\varphi _{\ell }-\varphi _{j}+\eta _{\ell ,j})}\), where \(\eta _{\ell ,j}\in (-\pi , \pi ]\) is the angular noise.
The phase factors corresponding to the true pairwise differences will be arranged as a matrix \(X\in {\mathbb {C}}^{d\times d}\), the noisy observation as a matrix \({\hat{X}}\in {\mathbb {C}}^{d\times d}\), that is, the entries of these matrices are given by
By \(\mathcal {H}^{d}\) we denote the space of all \(d \times d\) Hermitian matrices. Let \(N \in \mathcal {H}^{d}\) with entries \(N_{\ell ,j}=e^{i\eta _{\ell ,j}}\) denote the matrix rearrangement of the multiplicative noise; one observes that these two matrix representations are related via \(\hat{X}=X\circ N\), where for two matrices \(A,B\in \mathbb {C}^{d\times d}\), \(A\circ B\) denotes their Hadamard product as defined by \((A\circ B)_{n,m}=A_{n,m}\ B_{n,m}\).
As a measure for the noise level, we will use a Frobenius norm or a spectral norm of the difference \(X-{\hat{X}}\) or its modified versions; recall that for \(A \in \mathbb {C}^{d \times d}\) the Frobenius norm and the spectral norm are given by
The quality of reconstruction will be measured in the Euclidean norm on \(\mathbb {C}^{d}\), given by
For the proofs we will also need the supremum norm \( \left\Vert v \right\Vert _{\infty } := \max _{1\le j \le d} |v_{j}|\).
We will write \(A \succeq 0\) if the matrix A is positive semidefinite, that is
The \({\text {sgn}}\) operator is defined for \(\alpha \in {\mathbb {C}}\) as
This operator is extended to any matrix space \({\mathbb {C}}^{d\times d^\prime }\) by entrywise operation, i.e. for any \(A\in {\mathbb {C}}^{d\times d^\prime }\) we have
Similarly to previous works, our analysis is based on a graph theoretic interpretation. Namely, the matrices X and \({\hat{X}}\) can be seen as edge weight matrices of a weighted undirected graph \(G=(V,E,W)\). Consequently, one has \(|V|=d\), and we can identify V with \([d]=\{1,\dots , d\}\). The set of edges E is naturally identified with the index set of the observed noisy angular differences introduced above. It directly follows from the problem setup that the weight function \(W:V\times V\rightarrow [0,\infty )\) is nonnegative and symmetric, i.e., \(W(v,v^\prime )\ge 0\) and \(W(v,v^\prime )=W(v^\prime ,v)\), and that the graph has no loops, so \(W(v,v)=0\).
To analyze this graph, we need some basic concepts from graph theory. The adjacency matrix \(A_{G}\) of G is given by
With this notation, one obtains the compact expression \(X= A_{G} \circ xx^{*}\). In case \(W\equiv 1\) on its support, i.e., \(W=A_{G}\), we speak of G as an unweighted graph.
The degree of the vertex \(\ell \) is defined as
and the corresponding degree matrix is the diagonal matrix
The Laplacian of the graph G is given by
Observe that, as the graph is undirected, the Laplacian is symmetric. Moreover, since \(w_{\ell ,j}\ge 0\) we have
for all \(u\in \mathbb {C}^{d}\). Hence the Laplacian is positive semidefinite and therefore has a spectrum consisting of non-negative real numbers, which we denote by \(\lambda _{j}\) with indices j arranged in ascending order, i.e.,
Here the first equality follows from the observation that the vector \(\mathbb {1}=(1,\dots ,1)^T\) satisfies \(L_{G}\mathbb {1}=0\). The spectral gap of G is defined as \(\tau _{G}=\lambda _2\). A graph G is connected if and only if \(\tau _{G}>0 \), see [6]. In that case, the null space of \(L_{G}\) is spanned by \(\mathbb {1}\).
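As an illustration (names are ours, not the paper's), the Laplacian \(L_{G} = D - W\) and its spectral gap can be computed directly:

```python
import numpy as np

def graph_laplacian(W):
    """L_G = D - W for a symmetric, nonnegative weight matrix W with
    zero diagonal; positive semidefinite with L_G @ 1 = 0."""
    return np.diag(W.sum(axis=1)) - W

# path graph on 4 vertices: connected, hence tau_G = lambda_2 > 0
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
lam = np.linalg.eigvalsh(graph_laplacian(W))   # ascending eigenvalues
tau_G = lam[1]                                 # spectral gap
```

The smallest eigenvalue is 0 with eigenvector \(\mathbb {1}\), and since the path graph is connected, the second smallest eigenvalue is strictly positive, in line with the characterization above.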
Besides the Laplacian \(L_{G}\) the normalized Laplacian \(L_N\) of G is often used. It is defined as
Its spectrum consists of non-negative real numbers as well and we write \(\tau _N\) for its second smallest eigenvalue \(\lambda _2(L_N)\).
The data dependent Laplacians associated to X and \(\hat{X}\) are defined as
Note that under the multiplicative noise model used in this paper, both these Laplacians are positive semidefinite matrices by Gershgorin’s circle theorem.
The data dependent Laplacian \({\hat{L}}\) corresponding to the noisy observations allows for a compact representation of the least squares problem (3) at the core of our recovery method. Indeed, observe that
Due to the quadratic constraint \(z\in \mathbb {T}^{d}\) the minimization problem (8) is non-convex and thus NP-hard in general. One way to obtain a tractable problem is to relax the constraint in (8) to \( \left\Vert z \right\Vert _{2}^{2} = d\) and obtain
This is nothing else but the determination of the smallest eigenvalue of the matrix \(\hat{L}\) and can be solved efficiently. We will refer to (9) as eigenvector relaxation (ER).
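A minimal numerical sketch of the ER, assuming the standard definition \({\hat{L}} = D - W \circ \hat{X}\) of the data-dependent Laplacian (the function name is ours, purely illustrative):

```python
import numpy as np

def er_phases(W, X_hat):
    """Eigenvector relaxation (9): take the eigenvector of the
    data-dependent Laplacian L_hat = D - W o X_hat (Hadamard product)
    for the smallest eigenvalue, then keep only the entrywise phases,
    with the convention sgn(0) = 1."""
    L_hat = np.diag(W.sum(axis=1)) - W * X_hat
    _, vecs = np.linalg.eigh(L_hat)
    z = vecs[:, 0]                       # smallest eigenvalue comes first
    a = np.abs(z)
    return np.where(a > 0, z / np.where(a > 0, a, 1.0), 1.0 + 0j)
```

In the noiseless case on a connected graph, x spans the null space of \({\hat{L}}\), so this returns x up to a global phase.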
An error bound for the ER based reconstruction was given by Iwen et al. in [14] for the case of unweighted graphs. Their proof is based on the Cheeger inequality, which is only available for the normalized Laplacian, which is why the minimization problem in their theorem has a different normalization than (9). In the special case that \(\deg (\ell )\) is constant for all \(\ell \) (as in [14]), the two normalizations agree up to a constant. Using the terminology introduced above, their result reads as follows.
Theorem 1
([14, Theorem 3], [24, Theorem 4]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E)\) is an undirected, connected and unweighted graph with \(\tau _N > 0\). Let \(\tilde{z} \in \mathbb {C}^{d}\) be the minimizer of
and let \(\tilde{x} = {\text {sgn}}(\tilde{z})\). Then,
An alternative approach is based on the idea of lifting the problem to the matrix space. It makes use of the relation
With this the minimization problem (8) transforms into
The class of minimization problems with explicit rank constraints is known to include many NP-hard instances [8, Chapter 2], so a common strategy is to perform a semi-definite relaxation. For (10), the following relaxation has been proposed in [11].
We will refer to this minimization problem as SDP. Note that if Z meets the rank condition in (10) one obtains that \(Z= z z^{*}\), where z is a solution of (8). Without the rank condition, however, the solution to (11) may have higher rank. In this case the method outputs the phase factors corresponding to the entries of the eigenvector associated to the largest eigenvalue as an approximation for the solution of (8) [28].
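The rounding step just described can be sketched as follows (a hypothetical helper of ours; solving (11) itself with an off-the-shelf SDP solver is omitted):

```python
import numpy as np

def round_sdp_solution(Z):
    """Round a Hermitian PSD matrix Z with unit diagonal to a phase
    vector: take the eigenvector of the largest eigenvalue and keep the
    entrywise phases (convention sgn(0) = 1). For rank-1 Z = z z^*,
    this recovers z up to a global phase."""
    _, vecs = np.linalg.eigh(Z)
    v = vecs[:, -1]                      # eigenvector of the largest eigenvalue
    a = np.abs(v)
    return np.where(a > 0, v / np.where(a > 0, a, 1.0), 1.0 + 0j)
```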
As mentioned before, the error bound is commonly derived for the solution of the LSP and then applied to the solution of the SDP when it has rank 1. For unweighted graphs, a first result on recovery guarantees for the SDP has been established by Preskitt [24]. Adjusted to our terminology, his result reads as follows.
Theorem 2
([24, Theorem 9]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E)\) is an undirected and unweighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8). Then,
In general, Theorem 2 exhibits better performance than Theorem 1 and sparks a need for an ER counterpart. For a detailed comparison of these statements, we refer the reader to Section 4.3.2 of [24]. The first results addressing a generalization to the important case of weighted graphs have been derived by Preskitt [24]. The following formulations have again been adjusted to our notation.
Theorem 3
([24, Proposition 12 and Theorem 8]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8). Then,
and
As the square root in (3.A) produces slow convergence as the noise diminishes, i.e., as \({\hat{X}}\) approaches X, in many cases bound (3.B) outperforms (3.A). For unweighted graphs, as we numerically explore in Sect. 5, the bound of Inequality (3.B) is similar to that of Theorem 2 and superior to the bound of Theorem 1 for ER in many cases. Both bounds are, however, only valid for the SDP when the relaxation is tight. The following lemma provides a sufficient condition for the tightness of the SDP relaxation.
Lemma 1
([24, Lemma 16]) Suppose \(\tilde{x}\in {\mathbb {T}}^{d}\) is a minimizer of (8) and let \(\tilde{L}=D-W\circ \tilde{x}\ \tilde{x}^{*}\). If
then \(\tilde{x} \tilde{x}^{*}\) is a minimizer of (11).
As the spectral gap \(\tau _{G}\) is typically rather small compared to the dimension d, tightness is guaranteed only for very small noise levels. In fact, our numerical simulations in Sect. 5 show that the SDP relaxation is indeed not tight in many cases. In contrast, the recovery guarantees for ER provided by Theorem 1 are applicable independently of the tightness of the relaxation, but restricted to unweighted graphs.
3 Improved Error Bounds
The main contribution of this paper concerns recovery guarantees for weighted angular synchronization via eigenvector relaxation, which are often stronger than even the best known bounds for the unrelaxed problem and do not require any a priori bound for the error to ensure tightness of the relaxation. Along the way, we derive similar error bounds for the solution of the least squares problem, which are exactly analogous to those provided by Theorem 2 in the unweighted case. The superior scaling of our error bounds as compared to Theorem 3 is also confirmed by numerical simulations in Sect. 5. We first state our result in general form, before discussing three special cases of interest.
Theorem 4
Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V ,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8) and z be the minimizer of the ER (9). Set \(R\in {\mathbb {C}}^{d\times d}\) as \(R_{\ell ,j}=W_{\ell ,j}^{1/2}\). Then,
and
with tightness penalty \(p_z := \sqrt{2 + 2 \left\Vert z \right\Vert _{\infty }^{2} }\).
The term \( \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}\) is stated as a weighted difference between the true and the measured pairwise differences. However, it also has an alternative interpretation as the value of the empirical least squares objective evaluated at the vector x, namely
which represents the gap between the value of the noise-free objective (which equals 0) and the noisy objective at the global minimum of the former. In addition, we note that \( \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}\) can be estimated from above as
The tightness penalty \(p_z\) in the bound varies between 2 when the relaxation is tight and \(\sqrt{2+2d}\) in the worst case. In numerical experiments (see Fig. 3), however, we do not observe this difference, which suggests that this dimensional factor may be a proof artefact. For the random model (2), this dimension independence has been established in [30], but the proof techniques do not carry over to our setting in a straightforward way, which is why we leave the investigation to future work. For completeness, we note that Inequality (3.B) also carries over to ER, as stated in the following theorem. The proof is directly analogous to the one in [24], which is why we omit the details.
Theorem 5
Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let z be the minimizer of the ER (9). Then,
The first consequence of interest concerns the unweighted case, where the noise norm in Theorem 4 simplifies to \( \left\Vert \hat{X} - {X} \right\Vert _{F}\) as in Theorem 2.
For the LSP, Theorems 4 and 2 yield exactly the same bound; for the ER, the bound in Theorem 4 differs only by the tightness penalty term \(p_{z}\).
Corollary 1
Suppose \(G = (V,E)\) is an unweighted graph with \(\tau _{G}>0\). Let \(z \in \mathbb {C}^{d}\) be the minimizer of the ER (9). Then
Our next examples are related to the ptychography problem. We remind the reader that the goal of ptychography is to estimate a signal \(y \in \mathbb {C}^{d}\) from phaseless measurements through localized masks. A recent method for recovering the signal y from such observations is the BlockPR algorithm by Iwen et al. [14]; see [9, 22,23,24] for follow-up works developing this algorithm further that also rely on weighted angular synchronization. The BlockPR algorithm proceeds by combining neighboring masks to obtain estimates for the products of entries located close to each other. In mathematical terms, this procedure yields an approximation of the squared absolute values of the entries (so these can be assumed to be approximately known) and a noisy version of \(T_\delta ( y y^{*})\), where \(\delta \) is the size of the mask and \(T_\delta \) is the restriction operator mapping a matrix to its entries indexed by the set
corresponding to the \(2\delta -1\) central sub- and superdiagonals, excluding the entries on the main diagonal. Thus the resulting measurements exactly correspond to (5) for \(E=E_\delta \), which is why weighted angular synchronization is the natural method of choice. The weights in this problem are given by the matrix \(yy^{*}\) restricted to the index set E, which yields the setup of the following corollary. We note that in the next two statements the Kronecker delta \(\delta _{\ell ,j}\) is 1 when \(\ell =j\) and 0 otherwise.
Corollary 2
Consider a weighted graph \(G=(V,E,W)\) whose weight matrix W is defined as follows. Let \(y\in {\mathbb {C}}^{d}\) with \({\text {sgn}}(y)=x\). Define matrices \(Y=(I+A_{G})\circ y\ y^*\) and \({X} = {\text {sgn}}({Y})\). Let M and \(\hat{X}\) be the matrices containing the perturbed magnitudes and phases of Y, respectively, so that \(M \approx |{Y}|\), N has unimodular entries and \(\hat{X} = {X} \circ N\) and set \(\hat{Y}= M \circ \hat{X}\). Consider the weight matrix W with entries given by \(w_{\ell ,j}=|\hat{Y}_{\ell ,j}|\ (1-\delta _{\ell ,j})\) and assume that \(\tau _{G}>0\). Let \(\tilde{x}\) be a minimizer of (8) and let z be the minimizer of (9). Then we have
and
In the next statement the set-up is analogous to the previous corollary but instead of having weights defined by \(|\hat{Y}_{\ell ,j}|\) we work with \(|\hat{Y}_{\ell ,j}|^{2}\).
Corollary 3
Consider a weighted graph \(G=(V,E,W)\) whose weight matrix W is defined as follows. Let \(y\in {\mathbb {C}}^{d}\) with \({\text {sgn}}(y)=x\). Define matrices \(Y=(I+A_{G})\circ y\ y^*\) and \({X} = {\text {sgn}}({Y})\). Let M and \(\hat{X}\) be the matrices containing the perturbed magnitudes and phases of Y, respectively, so that \(M \approx |{Y}|\), N has unimodular entries and \(\hat{X} = {X} \circ N\) and set \({\hat{Y}}= M \circ \hat{X}\). Consider the weight matrix W with entries given by \(w_{\ell ,j}=|\hat{Y}_{\ell ,j}|^{2}\ (1-\delta _{\ell ,j})\) and assume that \(\tau _{G}>0\). Let \(\tilde{x}\) be a minimizer of (8) and let z be the minimizer of (9). Then we have
and
The main benefit of Corollary 3 is the absence of the difference \(\hat{X} - {X}\) in the bound. It is especially handy in the ptychographic setup, where estimation of the phase difference error is a complicated task while an upper bound for \( \left\Vert \hat{Y} - {Y} \right\Vert _{F}\) is available.
We note that if the entry \(y_{j}\) is small, it will cause \(w_{\ell ,j}\) and \(w_{j,\ell }\) to be small, as well as \(\deg (j)\). This will result in the node j of the graph G being poorly connected to the rest of the graph, and hence the spectral gap \(\tau _{G}\) will be close to zero, rendering our recovery guarantees useless. A possible cure for such scenarios is an additional preprocessing step in which nodes with small degrees are removed from the graph before the angular synchronization problem is solved. This stabilizes the spectral gap of the pruned graph and allows one to recover the phases of the “strongly” connected nodes. For the discarded nodes, the phase can either be assigned randomly or set to some fixed value, as these nodes have little impact on the data fidelity. Unfortunately, the phase error corresponding to such assigned entries cannot be bounded better than by the triangle inequality,
On the other hand, if the scaled norm error is considered, the impact of such a crude bound is naturally reduced by incorporating the magnitude information,
Thus, recovery guarantees in the scaled norms are required for the “strongly” connected part of the graph, which we provide in the next statement.
Theorem 6
Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with Laplacian \(L_{G}\) and spectral gap \(\tau _{G}>0\). Let S be a \(d \times d\) diagonal matrix with \(S_{\ell ,\ell } >0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8) and z be the minimizer of the ER (9). Set \(R\in {\mathbb {C}}^{d\times d}\) as \(R_{\ell ,j}=W_{\ell ,j}^{1/2}\). Then,
and
Note that in addition to the penalty \(p_z\), one encounters a normalization factor \(\sqrt{{\text {tr}}S^{2}}/ \left\Vert S z \right\Vert _{2}\), which we suggest may be an artifact of the proof. The condition \(\tau _{G}>0\) is required to control the nullspace of the matrix \(S^{-1} L_{G} S^{-1}\), so we can guarantee that \(\lambda _2( S^{-1} L_{G} S^{-1})>0\).
4 Proofs
Proof of Theorem 4
We will proceed by establishing the following four inequalities.
and
Note that Inequality (21) has been derived in [24]; we will nevertheless include a proof for completeness.
Equation (12) then follows by combining (21) and (22), Equation (13) is obtained as a combination of (19) and (20).
It remains to prove the four inequalities. To that extent, we recall that for \(\alpha ,\beta \in \mathbb {C}\) with \(|\beta | = 1\) we have
With the help of this inequality we obtain that
Moreover, since \( \left\Vert x \right\Vert _{2}^{2}=d\) and \( \left\Vert z \right\Vert _{2}^{2}=d\) we have that
The right hand side is minimal if \({\text {Re}}\left( {x}^{*} z \right) \) is maximal and equal to \(| {x}^{*} z |\). Hence with \(e^{i \vartheta } := {\text {sgn}}({x}^{*} z)^{*}\) we arrive at
and thus
The projection of \(e^{i\vartheta }z\) onto the orthogonal complement of x is given by
where we used that by (26), the inner product is real. Consequently, as \(q\perp x\), one has that by Pythagoras’ theorem
With the Cauchy-Schwarz inequality and (27), this yields that
Recall that \(q \perp x\) and let us show that x spans the nullspace of the matrix \({L}\). Define the unitary matrix \(C={\text {diag}}\{x_1,\dots , x_d\}\), where \(x=(x_{j})_{j=1}^{d}\). Note that
and by unimodularity of the \(x_i\)
where we used commutativity of diagonal matrices. This results in
which shows in particular that the eigenvalues of \({L}\) and \(L_{G}\) coincide.
By assumption \(\tau _{G}>0\) and hence the null space of \(L_{G}\) is spanned by \(\mathbb {1}\). Thus the null space of L is spanned by x.
By definition q is orthogonal to the null space of \({L}\) which implies that
and
Combining this with (24) and the fact that \(\lambda _2({L}) = \tau _{G}\) as well as the definition of L, we obtain both (21) and (19) by
Indeed, (21) follows by comparing the second and the last item in this chain of inequalities, and (19) by comparing the first and the last item.
We now prove Inequalities (20) and (22), again with largely identical proofs.
For simplicity of notation, we introduce the following auxiliary variables
We start by using \((\alpha + \beta )^{2} \le 2\alpha ^{2} + 2 \beta ^{2} \) to get
and we further estimate
For the first sum we observe that
and obtain using (8) and the fact that z minimizes (9) that
For the second sum we use that \(|h_{j}| = |{x}_{j}^{*} z_{j}| = |z_{j}|\) and obtain
The last step is to notice that
Putting everything together we arrive at
This concludes the proof of Inequality (20).
For Inequality (22) we proceed analogously, with \({\tilde{x}}\) taking the role of z; the only difference is that in (29) we use the fact that \({\tilde{x}}\) minimizes (8) rather than the fact that z minimizes (9). The bound for the second sum simplifies as compared to (20), as \( \left\Vert z \right\Vert _{\infty }\) is replaced by \( \left\Vert {\tilde{x}} \right\Vert _{\infty } = 1\).
The combined bound reads as
\(\square \)
Proof of Theorem 6
Theorem 6 has a similar proof as Theorem 4 and we will only highlight the main differences. The next three inequalities replace Inequalities (19) and (20).
and
Combined, they yield Inequality (18). The first one is required to pass from the signs of z back to z using Inequality (23). The scaling factor guarantees that the vector \(\frac{ \sqrt{{\text {tr}}S^{2}} }{ \left\Vert S z \right\Vert _{2} } Sz\) has the same norm as Sx, just as z has the same norm as x in the proof of Theorem 4, which is crucial for the second inequality: it allows us to pass from the reconstruction error to the true objective function. Another difference to the proof of Theorem 4 is the appearance of the matrix \(S^{-1} L_{G} S^{-1}\) instead of \(L_{G}\). This change is important, as it makes the numerator free of the scaling matrix S. The spectral gap of the matrix \(S^{-1} {L} S^{-1}\) is bounded from below by
which implies that the nullspace of \(S^{-1} {L} S^{-1}\) is spanned by Sx. The last inequality introduces a data dependent upper bound for the value of the true objective, which is precisely Inequality (20).
For the LSP bound (17), the norm of \(S {\tilde{x}}\) is already \(\sqrt{{\text {tr}}S^{2}}\) and hence Inequality (30) can be omitted. Moreover, since \( \left\Vert {\tilde{x}} \right\Vert _{\infty } = 1\), the penalty factor \(p_z^{2}\) in Inequality (32) simplifies to the constant 4. This results in
\(\square \)
Proof of Corollary 1
For an unweighted graph G we immediately get
\(\square \)
Proof of Corollary 2
Define an auxiliary weight matrix \(W_0\) by \((W_0)_{\ell ,j}=|Y_{\ell ,j}|\ (1-\delta _{\ell ,j})\). Using inequality (14) we obtain
where in the third line we only increased the number of non-negative summands by adding the diagonal elements \(|\hat{Y}_{\ell ,\ell } - {Y}_{\ell ,\ell }|\), and in the last line we used the inequality \(||\alpha |-|\beta || \le |\alpha - \beta |\). \(\square \)
Proof of Corollary 3
We rewrite the right side of the bound in Theorem 4 as
and apply the inequality (23) to get
\(\square \)
5 Numerical Evaluation
In this section we present a numerical comparison of the error bounds discussed above. Our goal is to illustrate that Theorem 4 indeed provides superior recovery guarantees for an important class of weighted angular synchronization problems, namely those appearing in the context of ptychography, as covered by Corollaries 2 and 3. In particular, we work with the edge set \(E_\delta \) as in (16), for some parameter \(\delta \in [ \lfloor (d+1)/2 \rfloor ] \), which determines the neighborhood of indices for which the pairwise phase differences are known.
In our numerical experiments, we consider measurements affected by angular noise, that is, the measurements are of the form (4), i.e.
with the noise model that the \(\eta _{\ell ,j},\ (\ell ,j) \in E_\delta \), are independent random variables uniformly distributed on \([-\alpha ,\alpha ]\) for some parameter \(\alpha > 0\) representing the noise level. In the figures, the parameter \(\alpha \) is given in degrees rather than radians.
We consider signals \({y}\) drawn at random with coordinates \({y}_{\ell } = a_{\ell } + i b_{\ell }\), where \(a_{\ell }\) and \(b_{\ell }\) are independent identically distributed standard Gaussian random variables. We assume that the \(|y_{\ell }|\) are known, so the phases of the \({y}_{\ell }\) are our unknown ground truth entries \({x}_{\ell } = e^{i\varphi _{\ell }}\).
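The experimental setup above can be sketched in a few lines. This is an illustrative reconstruction, not the code used in the paper: the cyclic form of the edge set \(E_\delta \) and all variable names are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, delta = 64, 16                     # dimension and neighborhood parameter
alpha = np.deg2rad(5.0)               # noise level alpha (specified in degrees)

# Ground truth: complex Gaussian signal; only its phases are unknown.
y = rng.standard_normal(d) + 1j * rng.standard_normal(d)
x = y / np.abs(y)                     # phase factors x_l = exp(i*phi_l)

# Edge set E_delta (assumed cyclic here): index pairs at circular distance < delta.
i, j = np.indices((d, d))
dist = np.minimum(np.abs(i - j), d - np.abs(i - j))
W = ((dist > 0) & (dist < delta)).astype(float)   # unweighted adjacency matrix

# Noisy pairwise phase differences as in (4):
# Y_{l,j} = x_l * conj(x_j) * exp(i * eta_{l,j}) on the edges,
# with eta uniform on [-alpha, alpha] and antisymmetric so that Y is Hermitian.
eta = rng.uniform(-alpha, alpha, size=(d, d))
eta = np.triu(eta, 1) - np.triu(eta, 1).T
Y = W * (x[:, None] * x[None, :].conj()) * np.exp(1j * eta)
```

With \(d=64\) and \(\delta =16\), each index has 30 of its 63 possible neighbors, matching the statement that roughly half of the pairwise phase differences are known.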
In most of the following examples, we fix the dimension to be \(d = 64\) and the parameter \(\delta =16\), so that approximately half of the pairwise phase differences are known. For each point in the figures, we generated 30 test signals and plot the average norm of the error. All experiments were performed on a laptop running Windows 10 Pro with an Intel(R) Core(TM) i7-8550U processor, 16 GB RAM, and Matlab R2018b. Our ER implementation is based on the BlockPR software package [13]. For the SDP, we used the SeDuMi solver available at [25].
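As a minimal stand-in for the ER step (a sketch under our own choices, not the BlockPR implementation: the function name and the dense eigensolver are illustrative), the eigenvector relaxation forms the connection Laplacian of the weighted measurement matrix, takes the eigenvector of its smallest eigenvalue, and projects the entries onto the unit circle:

```python
import numpy as np

def er_synchronize(Y, W):
    """Eigenvector relaxation (sketch): eigenvector of the connection
    Laplacian L = D - W ∘ Y for its smallest eigenvalue, with every
    entry projected back onto the unit circle."""
    D = np.diag(W.sum(axis=1))
    L = D - W * Y                 # entrywise (Hadamard) product W ∘ Y
    _, vecs = np.linalg.eigh(L)   # Hermitian eigendecomposition, ascending order
    v = vecs[:, 0]                # eigenvector of the smallest eigenvalue
    z = np.ones(v.shape, dtype=complex)
    nz = np.abs(v) > 1e-12
    z[nz] = v[nz] / np.abs(v[nz])  # sign projection; zero entries set to 1
    return z
```

In the noiseless case with a connected graph, \(L x = 0\), so the recovered phases agree with the ground truth up to a global phase. A sparse eigensolver would replace the dense `eigh` call to obtain the near-linear scaling reported in Fig. 5a.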
We begin with a comparison of the recovery guarantees for the different weight matrices covered by Corollaries 1, 2, and 3 in terms of the angular noise level \(\alpha \) measured in degrees. To put the bounds into perspective, we include the empirical errors of both SDP and ER.
In view of the fact that the coordinates of \(\tilde{x}\) and \({x}\) have modulus 1, together with Inequality (27), a naïve bound for the phase error is given by
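The reasoning behind such a naïve bound can be spelled out in one line (a sketch, assuming the phase error is measured in the Euclidean norm up to a global phase): the difference of two unimodular numbers has modulus at most 2, so

```latex
\min_{\theta \in [0,2\pi)} \bigl\Vert \tilde{x} - e^{i\theta} x \bigr\Vert_2
  \le \Vert \tilde{x} - x \Vert_2
  = \Bigl( \sum_{\ell=1}^{d} |\tilde{x}_\ell - x_\ell|^2 \Bigr)^{1/2}
  \le \Bigl( \sum_{\ell=1}^{d} \bigl( |\tilde{x}_\ell| + |x_\ell| \bigr)^2 \Bigr)^{1/2}
  = 2\sqrt{d}.
```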
Beyond this threshold, the error bounds provided by these statements are uninformative, which is why we indicate the threshold by a dashed black line in the plots.
Figure 1 shows the empirical performance of the SDP and the error bounds for the LSP. While for unweighted graphs Theorem 2 and Inequality (3.B) exhibit comparable behavior, for weighted graphs the established corollaries improve on the results of Inequality (3.B). The crucial observation is that the SDP is not tight for weighted graphs even when the noise level is as small as \(10^{-3}\) degrees, and it is only tight for unweighted graphs when the noise level is below a few degrees. This implies that the error bounds for the LSP are no longer applicable for the SDP.
Turning to the ER (Fig. 2), we observe a similar behavior. The recovery guarantees for ER provided by Corollary 1 and Theorem 5 are close in the unweighted case, and Corollaries 2 and 3 improve on the bound provided by Theorem 5 for weighted graphs. As before, for noise above a few degrees the supremum norm of the ER solution deviates from 1, which indicates that the relaxation is not tight. In contrast to the SDP, the established recovery guarantees for the ER remain applicable for all noise levels.
Both for unweighted and weighted graphs (Fig. 3), we observe that the empirical error is similar for both SDP and ER; there is no significant difference between the two methods in terms of the phase error. For low and medium noise levels, the phase error rises linearly with the angular noise level. Only for very high noise does it exhibit faster growth.
For all scenarios, the guarantees for ER more or less agree with the bounds for the least squares problem. This is remarkable because ER is faster than SDP (see Fig. 5a below). At the same time, the empirical errors differ from our error bounds by a factor of roughly \(\delta \), which suggests that an additional tightening of the \(\delta \)-dependence may be possible using refined techniques.
In Fig. 4, we explore the recovery guarantees for scaled norms provided by Theorem 6. In these experiments, the scaling matrix S has the magnitudes \(|y_{j}|\) on the main diagonal. Again, the empirical errors and recovery guarantees exhibit behavior similar to the non-scaled errors seen in Fig. 3, except that for high noise levels, the recovery guarantees for ER deviate from their LSP counterparts as a result of the additional normalization factor appearing in Inequality (18).
In terms of runtime (Fig. 5a), ER exhibits almost linear scaling in the dimension of the problem and clearly outperforms SDP, whose runtime exhibits quadratic scaling. This difference is to be expected, as SDP lifts the problem to a \(d \times d\)-dimensional matrix space and thus estimates \(d^{2}\) unknowns instead of d in the case of ER. In fact, the large runtime complexity is a crucial bottleneck for SDP relaxations in ptychography, where the dimensions are commonly high.
The last simulation (Fig. 5b) illustrates how both the empirical error and our error bounds depend on the size of the mask in ptychography (which in turn is related to the connectivity of the graph). Again, up to constants, we observe a similar decay pattern for the error between theory and experiment, with a fast decay for small \(\delta \) (up to \({\mathcal {O}}(\log d)\)) and a slower decay for larger values of \(\delta \).
6 Discussion and Future Work
The main focus of this paper is the eigenvector relaxation of the angular synchronization problem. We derived new flexible error bounds for this method. Along the way, we established new recovery guarantees for the solution of the weighted least squares problem (3). Our numerical evaluation shows that the recovery guarantees we obtain are tighter than other results in the literature. Compared to semidefinite programming, eigenvector relaxation shows similar performance in terms of the empirical error and has a significantly shorter runtime; at the same time, our recovery guarantees are not subject to an additional constraint corresponding to the tightness of the relaxation, as is the case for semidefinite programming.
Our numerical experiments are based on the simple random angular noise model, which likely does not correspond to the noise arising in applications. In Fig. 6, we observe that while for the simplified noise model, unweighted angular synchronization seems most appropriate, for Gaussian noise applied directly to the phaseless measurements [14], weighted angular synchronization is the method of choice.
Another interesting direction of future work is to extend the generalized power method developed by Boumal in [4] to arbitrary sets E. Our current results can be considered as a first step, since the generalized power method uses the solution of the eigenvector relaxation problem as initialization, so good bounds on the error are crucial for estimating the quality of the initialization.
References
Alexeev, B., Bandeira, A.S., Fickus, M., Mixon, D.G.: Phase retrieval with polarization. SIAM J. Imaging Sci. 7(1), 35–66 (2014). https://doi.org/10.1137/12089939X
Bandeira, A.S., Boumal, N., Singer, A.: Tightness of the maximum likelihood semidefinite relaxation for angular synchronization. Math. Program. 163(1–2), 145–167 (2017). https://doi.org/10.1007/s10107-016-1059-6
Bandeira, A.S., Singer, A., Spielman, D.A.: A Cheeger inequality for the graph connection Laplacian. SIAM J. Matrix Anal. Appl. 34(4), 1611–1630 (2013). https://doi.org/10.1137/120875338
Boumal, N.: Nonconvex phase synchronization. SIAM J. Optim. 26(4), 2355–2377 (2016). https://doi.org/10.1137/16M105808X
Carlone, L., Calafiore, G.C., Tommolillo, C., Dellaert, F.: Planar pose graph optimization: duality, optimal solutions, and verification. IEEE Trans. Robot. 32(3), 545–565 (2016). https://doi.org/10.1109/TRO.2016.2544304
Chung, F.R.K.: Spectral Graph Theory. American Mathematical Society, Providence, RI (1997)
Cucuringu, M.: Sync-Rank: robust ranking, constrained ranking and rank aggregation via eigenvector and SDP synchronization. IEEE Trans. Netw. Sci. Eng. 3(1), 58–79 (2016). https://doi.org/10.1109/TNSE.2016.2523761
Fazel, M.: Matrix rank minimization with applications. Ph.D. thesis, Stanford University (March 2002). https://faculty.washington.edu/mfazel/thesis-final.pdf
Forstner, A., Krahmer, F., Melnyk, O., Sissouno, N.: Well-conditioned ptychographic imaging via lost subspace completion. Inverse Prob. 36(10), 105009 (2020). https://doi.org/10.1088/1361-6420/abaf3a
Giridhar, A., Kumar, P.R.: Distributed clock synchronization over wireless networks: algorithms and analysis. In: Proceedings of the 45th IEEE Conference on Decision and Control, pp. 4915–4920. IEEE (2006). https://doi.org/10.1109/CDC.2006.377325
Goemans, M.X., Williamson, D.P.: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42(6), 1115–1145 (1995). https://doi.org/10.1145/227683.227684
Huroyan, V., Lerman, G., Wu, H.T.: Solving jigsaw puzzles by the graph connection Laplacian. SIAM J. Imaging Sci. 13(4), 1717–1753 (2020). https://doi.org/10.1137/19M1290760
Iwen, M., Wang, Y., Viswanathan, A.: BlockPR: Matlab software for phase retrieval using block circulant measurements. https://bitbucket.org/charms/blockpr/src/master/
Iwen, M.A., Preskitt, B., Saab, R., Viswanathan, A.: Phase retrieval from local measurements: Improved robustness via eigenvector-based angular synchronization. Appl. Comput. Harmon. Anal. 48(1), 415–444 (2020). https://doi.org/10.1016/j.acha.2018.06.004
Kisialiou, M., Luo, Z.Q.: Probabilistic analysis of semidefinite relaxation for binary quadratic minimization. SIAM J. Optim. 20(4), 1906–1922 (2010). https://doi.org/10.1137/08072320X
Lerman, G., Shi, Y.: Robust group synchronization via cycle-edge message passing. https://arxiv.org/pdf/1912.11347
Li, L., Cheng, C., Han, D., Sun, Q., Shi, G.: Phase retrieval from multiple-window short-time Fourier measurements. IEEE Signal Process. Lett. 24(4), 372–376 (2017). https://doi.org/10.1109/LSP.2017.2663668
Liu, H., Yue, M.C., Man-Cho So, A.: On the estimation performance and convergence rate of the generalized power method for phase synchronization. SIAM J. Optim. 27(4), 2426–2446 (2017). https://doi.org/10.1137/16M110109X
Liu, H., Yue, M.C., So, A.M.C.: A unified approach to synchronization problems over subgroups of the orthogonal group. https://arxiv.org/pdf/2009.07514
Marchesini, S., Tu, Y.C., Wu, H.T.: Alternating projection, ptychographic imaging and phase synchronization. Appl. Comput. Harmon. Anal. 41(3), 815–851 (2016). https://doi.org/10.1016/j.acha.2015.06.005
Maunu, T., Lerman, G.: Depth descent synchronization in \({\rm SO} (d)\). https://arxiv.org/pdf/2002.05299
Melnyk, O., Filbir, F., Krahmer, F.: Phase retrieval from local correlation measurements with fixed shift length. In: 2019 13th International Conference on Sampling Theory and Applications (SampTA), pp. 1–5. IEEE (2019). https://doi.org/10.1109/SampTA45681.2019.9030967
Perlmutter, M., Merhi, S., Viswanathan, A., Iwen, M.: Inverting spectrogram measurements via aliased Wigner distribution deconvolution and angular synchronization. Information and Inference: A Journal of the IMA (2020). https://doi.org/10.1093/imaiai/iaaa023
Preskitt, B.: Phase retrieval from locally supported measurements. Ph.D. thesis, UC San Diego (2018). https://escholarship.org/uc/item/97v5k8j9
SeDuMi package. http://sedumi.ie.lehigh.edu/?page_id=58
Shi, Y., Lerman, G.: Message passing least squares framework and its application to rotation synchronization. In: ICML 2020. https://arxiv.org/pdf/2007.13638.pdf
Singer, A.: Angular synchronization by eigenvectors and semidefinite programming. Appl. Comput. Harmon. Anal. 30(1), 20–36 (2011). https://doi.org/10.1016/j.acha.2010.02.001
So, A.M.C., Zhang, J., Ye, Y.: On approximating complex quadratic optimization problems via semidefinite programming relaxations. Math. Program. 110(1), 93–110 (2007). https://doi.org/10.1007/s10107-006-0064-6
Zhang, S., Huang, Y.: Complex quadratic optimization and semidefinite programming. SIAM J. Optim. 16(3), 871–890 (2006). https://doi.org/10.1137/04061341X
Zhong, Y., Boumal, N.: Near-optimal bounds for phase synchronization. SIAM J. Optim. 28(2), 989–1016 (2018). https://doi.org/10.1137/17M1122025
Acknowledgements
Open Access funding enabled and organized by Projekt DEAL.
OM and FF were partially supported by the Helmholtz Association within the projects Ptychography 4.0 and EDARTI. FK acknowledges support by the German Science Foundation DFG in the context of an Emmy Noether junior research group (project KR 4512/1-1).
Filbir, F., Krahmer, F. & Melnyk, O. On Recovery Guarantees for Angular Synchronization. J Fourier Anal Appl 27, 31 (2021). https://doi.org/10.1007/s00041-021-09834-1