1 Introduction

In this paper we consider the problem of recovering a d-dimensional vector of angles \(\varphi \in [0,2\pi )^{d}\) from noisy pairwise differences of its entries \(\varphi _{\ell } - \varphi _{j} + \eta _{\ell ,j} {\text {mod}}2\pi ,\ \ell ,j \in \{1,\ldots ,d\}\), where \(\eta _{\ell ,j}\) denotes noise. This problem is commonly referred to as angular synchronization or phase synchronization. It frequently arises in various applications such as recovery from phaseless measurements [1, 9, 14, 17, 20, 22,23,24], ordering of data from relative ranks [7], digital communications [15], jigsaw puzzle solving [12] and distributed systems [10]. The problem of angular synchronization is also closely related to the broader problem of pose graph optimization [5], which appears in robotics and computer vision, and to group synchronization problems [19, 21].

Rather than working with the angles \(\varphi _{\ell }\) directly, one typically considers the associated phase factors \({x}_{\ell } := e^{i \varphi _{\ell }},\ \ell \in \{1,\ldots ,d\}\). Hence the vector \({x}=(x_{j})_{j=1}^{d}\) to be recovered belongs to the d-dimensional torus

$$\begin{aligned} \mathbb {T}^{d} := \{ v \in \mathbb {C}^{d} ~|~ |v_1| = \ldots = |v_d| = 1 \}. \end{aligned}$$

After this transformation, the pairwise differences \(\varphi _{\ell } - \varphi _{j} {\text {mod}}2\pi \), \(\ell ,j \in \{1,\ldots ,d\}\) take the form of a product

$$\begin{aligned} e^{i (\varphi _{\ell } - \varphi _{j})} = e^{i \varphi _{\ell }} \cdot e^{- i \varphi _{j}} = {x}_{\ell } {x}_{j}^{*}, \end{aligned}$$

where \(z^{*}\) stands for the complex conjugate of the number z and the complex conjugate transpose in the case of vectors. The angular synchronization problem clearly has no unique solution, as multiplying the vector x by a factor \(e^{i\theta }\) leads to the same products \(x_{\ell } x_{j}^{*}\). Hence we can at best recover x up to a global phase factor, that is, two solutions \(x,x'\in {\mathbb {C}}^{d}\) are to be considered equivalent if \(x=e^{i\theta }x'\) for some \(\theta \in [0,2\pi )\). A natural distance measure between two equivalence classes is given by

$$\begin{aligned} d(x, x')=\min _{\theta \in [0,2\pi )} \left\Vert x-e^{i\theta } x' \right\Vert _{2}. \end{aligned}$$
(1)

A solution to the angular synchronization problem is thus any vector for which this expression vanishes.
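For later reference, we note that the minimization over \(\theta \) in (1) can be carried out explicitly for \(x, x' \in \mathbb {T}^{d}\); expanding the square and choosing the optimal rotation (the computation is spelled out in (25)–(27) below) gives

$$\begin{aligned} d(x, x')^{2} = \min _{\theta \in [0,2\pi )} \left( \left\Vert x \right\Vert _{2}^{2} + \left\Vert x' \right\Vert _{2}^{2} - 2 {\text {Re}}\big ( e^{-i\theta } (x')^{*} x \big ) \right) = 2d - 2 |(x')^{*} x|. \end{aligned}$$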

In many applications such as certain algorithms for ptychography [9, 14, 22,23,24], noisy observations of only a strict subset of the pairwise differences are available. To describe this restriction mathematically, we will work with the edge set

$$\begin{aligned} E := \{ (\ell ,j) \in \{1,\dots , d\}\times \{1,\dots , d\}: \text {noisy } \varphi _{\ell } - \varphi _{j} \text { is known and } j\ne \ell \}. \end{aligned}$$

In these ptychography applications, one also encounters a version of the problem that is generalized in yet another way. Namely, the entries of the vector y to be recovered are not all of modulus 1 (but their moduli are still assumed to be known). The measurements are still of the form \(y_{\ell } y_{j}^{*}\) affected by noise. Clearly this generalized problem can be directly reduced to the angular synchronization problem in its original form by dividing each measurement by the product of the known magnitudes of the associated entries, but one should note that the noise is also affected by this transformation.
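In formulas, assuming for concreteness that the noise enters additively, the rescaled measurements take the form

$$\begin{aligned} \frac{y_{\ell } y_{j}^{*} + \eta _{\ell ,j}}{|y_{\ell }|\, |y_{j}|} = x_{\ell } x_{j}^{*} + \frac{\eta _{\ell ,j}}{|y_{\ell }|\, |y_{j}|}, \qquad x_{\ell } := y_{\ell } / |y_{\ell }|, \end{aligned}$$

so entries of small magnitude suffer from amplified noise after this reduction.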

We will now present a short overview of the major developments in angular synchronization. The approaches to the problem mainly split into two dominant branches, which essentially differ by the underlying noise model.

In the first branch, it is assumed that the observed pairwise products of the unknown phase factors are affected by independent Gaussian noise. Typically these results work with \(E=\{(\ell ,j) \,|\, j,\ell \in \{1,\dots , d\},\ j\ne \ell \}\), i.e., they assume access to the full set of pairwise differences. That is, the matrix of measurements \(\hat{X}\) is given by

$$\begin{aligned} \hat{X} = {x} {x}^{*} + \sigma \Delta , \end{aligned}$$
(2)

where \(\Delta \) is a \(d \times d\) Hermitian matrix with \(\Delta _{\ell ,\ell } = 0\) and \(\Delta _{\ell ,j},\ \ell > j,\) being independent centered complex Gaussian random variables with unit variance, and \(\sigma >0\) is the noise level. This noise model allows one to perform maximum likelihood estimation, which leads to the least squares problem (LSP)

$$\begin{aligned} \min _{z \in \mathbb {T}^{d}} \frac{1}{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |z_{\ell } - \hat{X}_{\ell ,j} z_{j} |^{2}, \end{aligned}$$
(3)

with weights \(w_{\ell ,j} = 1/\sigma ^{2}\), \(\ell \ne j\) and \(w_{\ell ,\ell } = 0\). Due to the condition \(z \in \mathbb {T}^{d}\), the LSP (3) is NP-hard [29]. Therefore, Singer [27] proposed two possible relaxations, the eigenvector relaxation (ER) and the semidefinite convex relaxation (SDP). Both will be discussed in Sect. 2.

By a closer inspection of the maximum likelihood approach, Bandeira, Boumal and Singer [2] were able to establish an error bound for the solution of the LSP (3) which holds with high probability. In addition, the authors gave sufficient conditions on the standard deviation \(\sigma \) under which the SDP recovers the solution of the LSP (3). As an alternative to the relaxation approaches, Boumal [4] proposed an iterative approach called the generalized power method (GPM) to solve the LSP directly. He showed that the method converges to the minimizer of (3). Later, Liu et al. [18] provided additional details about the convergence rate of the GPM. In subsequent work [30], Zhong and Boumal extended the admissible range of \(\sigma \), providing near-optimal error bounds for solutions of the LSP, ER and the GPM, and improved the sufficient conditions for the tightness of the SDP relaxation. Another iterative approach to angular synchronization based on cycle consistency and message passing was proposed in [16] and was connected to iteratively reweighted least squares algorithms in [26].

For the variant of the angular synchronization problem where the entries of the vector y to be recovered are not all of modulus one, this theory does not directly apply, as the added Gaussian noise undergoes entrywise rescaling and hence no longer has the same variance for all entries. The least squares approach, however, has a natural generalization. In analogy to (3), where all differences are multiplied by the inverse of the variance of the i.i.d. noise variables, one weights each difference with the inverse of the variance of the corresponding rescaled noise term, which yields a linear scaling in \(|y_{j} y_{\ell }|\). While this method is not covered by the theory just discussed, it serves as an important motivation for the approach of this paper.

The second branch of development for the angular synchronization problem works with the model that the angular differences rather than the associated phase factors are affected by noise. This version of the problem has also been studied for more general sets E. Consequently, the matrix of measurements \(\hat{X}\) in this model is given by the entries

$$\begin{aligned} \hat{X}_{\ell ,j}= {\left\{ \begin{array}{ll} e^{i (\varphi _{\ell } - \varphi _{j} + \eta _{\ell ,j}) }, &{} (\ell ,j) \in E,\\ 0, &{} (\ell ,j) \notin E, \end{array}\right. } \end{aligned}$$
(4)

where \(\eta _{\ell ,j}\) corresponds to the angular noise, or

$$\begin{aligned} \hat{Y}_{\ell ,j}= {\left\{ \begin{array}{ll} |y_{\ell } y_{j}| e^{i (\varphi _{\ell } - \varphi _{j} + \eta _{\ell ,j}) }, &{} (\ell ,j) \in E,\\ 0, &{} (\ell ,j) \notin E, \end{array}\right. } \end{aligned}$$
(5)

when the entries to be recovered are not of unit modulus.

Under this model, random noise is somewhat harder to study due to the multiplicative structure. Consequently, most works employ an adversarial noise model, making no assumptions on the distribution of the noise. Maximum likelihood estimation is then no longer applicable. Nevertheless, the weighted least squares minimization (3) can still be applied without the statistical justification, and a natural choice for the weights remains \(w_{\ell ,j} = |y_{\ell } y_{j}|\). This is in line with the observation that if the moduli of the entries of two vectors y and \({\tilde{y}}\) agree entrywise, then smaller entries play less of a role in determining the distance in the sense of (1). Moreover, the expansion

$$\begin{aligned} d(y, {\tilde{y}})^{2}= \min _{\theta \in [0,2\pi )} \left\Vert {\tilde{y}}-e^{i\theta } y \right\Vert _{2}^{2} = \min _{\theta \in [0,2\pi )} \sum _{\ell =1}^{d} |y_{\ell }|^{2} |{\tilde{x}}_{\ell } -e^{i\theta } x_{\ell } |^{2}, \end{aligned}$$

where \(x_{\ell } = y_{\ell }/|y_{\ell }|\) and \({\tilde{x}}_{\ell } = {\tilde{y}}_{\ell }/|{\tilde{y}}_{\ell }|\) denote the corresponding phase factors, motivates considering recovery guarantees for scaled norms of the form

$$\begin{aligned} \sum _{\ell =1}^{d} |S_{\ell ,\ell }|^{2} |{\tilde{x}}_{\ell } -e^{i\theta } x_{\ell } |^{2} = \left\Vert S ({\tilde{x}}-e^{i\theta } x ) \right\Vert _{2}^{2}, \end{aligned}$$
(6)

with a \(d \times d\) diagonal scaling matrix S whose diagonal entries take the role of the magnitudes \(|y_{\ell }|\). For ptychography applications, the inclusion of these weights has also been shown numerically to be beneficial for the overall reconstruction (see Section 4.4 in [24]).

For the multiplicative noise model, several error bounds have been presented in the literature. Iwen et al. [14] worked with the unweighted LSP (3) and established recovery guarantees for the ER based on Cheeger's inequality [3]. Later, Preskitt [24] derived error bounds for the unweighted case of the LSP, developed alternative bounds for arbitrary selections of weights in the problem (3), and provided sufficient conditions for tightness of the SDP relaxation.

In the literature, the SDP relaxation is studied more often, as under certain conditions it recovers a true solution of the optimization problem (3). On the other hand, it is computationally expensive, and above a certain noise level the relaxation is no longer tight, so the SDP fails to return the exact solution of the LSP. Beyond this threshold, no recovery guarantees for the SDP are available. ER, in contrast, is much faster, especially for large dimension d, and its recovery guarantees, where available, are not restricted by tightness assumptions. Before this paper, however, such guarantees were only available for unweighted scenarios, even though SDP and ER exhibit similar reconstruction accuracy in numerical experiments.

In this paper, we close this gap, providing recovery guarantees for weighted angular synchronization via eigenvector relaxation from measurements of the form (4), following the setup of [14, 24]. In addition, the obtained results are generalized to include bounds for the reconstruction error in scaled norms (6). We numerically demonstrate that our guarantees are even tighter than the best known guarantees for the unrelaxed problem LSP. Along the way, we also establish improved bounds for the LSP.

2 Problem Setup and Previous Results

We study the problem of recovering a vector \(x=(x_{j})_{j=1}^{d}\) with unimodular entries \(x_{j}=e^{i\varphi _{j}}\) from partial and possibly noisy information on the pairwise differences \(x_{\ell } x_{j}^*=e^{i(\varphi _{\ell }-\varphi _{j})}\) for all pairs \((\ell ,j)\) in some set \(E \subset [d]\times [d]\). Here we used the notation \([n]=\{1,\dots , n\}\). As we consider angular noise, the noisy observations will take the form \(e^{i(\varphi _{\ell }-\varphi _{j}+\eta _{\ell ,j})}\), where \(\eta _{\ell ,j}\in (-\pi , \pi ]\) is the angular noise.

The phase factors corresponding to the true pairwise differences will be arranged as a matrix \(X\in {\mathbb {C}}^{d\times d}\), the noisy observation as a matrix \({\hat{X}}\in {\mathbb {C}}^{d\times d}\), that is, the entries of these matrices are given by

$$\begin{aligned} X_{\ell ,j}={\left\{ \begin{array}{ll} e^{i(\varphi _{\ell }-\varphi _{j})} &{} (\ell ,j)\in E,\\ 0,&{} (\ell ,j)\notin E,\end{array}\right. },\qquad \hat{X}_{\ell ,j}={\left\{ \begin{array}{ll} e^{i(\varphi _{\ell }-\varphi _{j}+\eta _{\ell ,j})}&{} (\ell ,j)\in E,\\ 0,&{} (\ell ,j)\notin E.\end{array}\right. } \end{aligned}$$
(7)

By \(\mathcal {H}^{d}\) we denote the space of all \(d \times d\) Hermitian matrices. Let \(N \in \mathcal {H}^{d}\) with entries \(N_{\ell ,j}=e^{i\eta _{\ell ,j}}\) denote the matrix rearrangement of the multiplicative noise; then the two matrix representations are related via \(\hat{X}=X\circ N\), where for two matrices \(A,B\in \mathbb {C}^{d\times d}\), \(A\circ B\) denotes their Hadamard product, defined by \((A\circ B)_{n,m}=A_{n,m}\ B_{n,m}\).

As a measure for the noise level, we will use the Frobenius norm or the spectral norm of the difference \(X-{\hat{X}}\) or its modified versions; recall that for \(A \in \mathbb {C}^{d \times d}\) these two norms are given by

$$\begin{aligned} \left\Vert A \right\Vert _{F} := \sqrt{ {\text {tr}}\left( A^{*}A \right) } \text { and } \left\Vert A \right\Vert _{2 \rightarrow 2} := \max _{ \left\Vert v \right\Vert _{2}=1 } \left\Vert Av \right\Vert _{2} \text {, respectively}. \end{aligned}$$

The quality of reconstruction will be measured in the Euclidean norm on \(\mathbb {C}^{d}\), given by

$$\begin{aligned} \left\Vert v \right\Vert _{2} := \left( \sum _{i=1}^{d} |v_i|^{2} \right) ^{1/2}. \end{aligned}$$

For the proofs we will also need the supremum norm \( \left\Vert v \right\Vert _{\infty } := \max _{1\le j \le d} |v_{j}|\).

We will write \(A \succeq 0\) if the matrix A is positive semidefinite, that is

$$\begin{aligned} v^{*} A v \ge 0, \text { for all } v \in \mathbb {C}^{d}. \end{aligned}$$

The \({\text {sgn}}\) operator is defined for \(\alpha \in {\mathbb {C}}\) as

$$\begin{aligned} {\text {sgn}}(\alpha ) := {\left\{ \begin{array}{ll} \alpha /|\alpha |, &{} \alpha \not = 0,\\ 0, &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

This operator is extended to any matrix space \({\mathbb {C}}^{d\times d^\prime }\) by entrywise operation, i.e. for any \(A\in {\mathbb {C}}^{d\times d^\prime }\) we have

$$\begin{aligned} {\text {sgn}}(A)=\big ({\text {sgn}}(A_{n,m})\big )_{n,m=1}^{d,d^\prime }. \end{aligned}$$

Similarly to previous works, our analysis is based on a graph theoretic interpretation. Namely, the matrices X and \({\hat{X}}\) can be seen as edge weight matrices of a weighted undirected graph \(G=(V,E,W)\). Consequently, one has \(|V|=d\), and we can identify V with \([d]=\{1,\dots , d\}\). The set of edges E is naturally identified with the index set of the observed noisy angular differences introduced above. It directly follows from the problem setup that the weight function \(W:V\times V\rightarrow [0,\infty )\) is non-negative and symmetric, \(W(v,v^\prime )\ge 0,\ W(v,v^\prime )=W(v^\prime ,v)\), and that the graph has no loops, so that \(W(v,v)=0\).

To analyze this graph, we need some basic concepts from graph theory. The adjacency matrix \(A_{G}\) of G is given by

$$\begin{aligned} (A_{G})_{\ell ,j}={\left\{ \begin{array}{ll} 1, &{} (\ell ,j)\in E,\\ 0,&{} (\ell ,j)\notin E.\end{array}\right. } \end{aligned}$$

With this notation, one obtains the compact expression \(X= A_{G} \circ xx^{*}\). In case \(W\equiv 1\) on its support, i.e., \(W=A_{G}\), we speak of G as an unweighted graph.

The degree of the vertex \(\ell \) is defined as

$$\begin{aligned} \deg (\ell ) := \sum _{(j,\ell )\in E} w_{\ell ,j}, \end{aligned}$$

and the corresponding degree matrix is the diagonal matrix

$$\begin{aligned} D={\text {diag}}\big (\deg (\ell )\big ). \end{aligned}$$

The Laplacian of the graph G is given by

$$\begin{aligned} L_{G}=D-W. \end{aligned}$$

Observe that, as the graph is undirected, the Laplacian is symmetric. Moreover, since \(w_{\ell ,j}\ge 0\) we have

$$\begin{aligned} u^{*} L_{G} u=\frac{1}{2}\sum _{\ell ,j} w_{\ell ,j}\ |u_{\ell } - u_{j}|^{2}\ge 0 \end{aligned}$$

for all \(u\in \mathbb {C}^{d}\). Hence the Laplacian is positive semidefinite and therefore has a spectrum consisting of non-negative real numbers, which we denote by \(\lambda _{j}\) with indices j arranged in ascending order, i.e.,

$$\begin{aligned} 0=\lambda _1\le \lambda _2\le \cdots \le \lambda _d. \end{aligned}$$

Here the first equality follows from the observation that the vector \(\mathbb {1}=(1,\dots ,1)^T\) satisfies \(L_{G}\mathbb {1}=0\). The spectral gap of G is defined as \(\tau _{G}=\lambda _2\). A graph G is connected if and only if \(\tau _{G}>0 \), see [6]. In that case, the null space of \(L_{G}\) is spanned by \(\mathbb {1}\).

Besides the Laplacian \(L_{G}\) the normalized Laplacian \(L_N\) of G is often used. It is defined as

$$\begin{aligned} L_N=D^{-1/2}\ L_{G}\ D^{-1/2}. \end{aligned}$$

Its spectrum consists of non-negative real numbers as well and we write \(\tau _N\) for its second smallest eigenvalue \(\lambda _2(L_N)\).

The data dependent Laplacians associated to X and \(\hat{X}\) are defined as

$$\begin{aligned} L=D-W\circ X,\quad \text { and }\quad \hat{L}=D-W\circ \hat{X}, \text { respectively.} \end{aligned}$$

Note that under the multiplicative noise model used in this paper, both these Laplacians are positive semidefinite matrices by Gershgorin’s circle theorem.

The data dependent Laplacian \({\hat{L}}\) corresponding to the noisy observations allows for a compact representation of the least squares problem (3) at the core of our recovery method. Indeed, observe that

$$\begin{aligned} \min _{z \in \mathbb {T}^{d}} \frac{1}{2}\sum _{ (\ell , j ) \in E} w_{\ell ,j} |z_{\ell } - \hat{X}_{\ell ,j} z_{j} |^{2} = \min _{z \in \mathbb {T}^{d}} z^{*} (D - \hat{X} \circ W) z = \min _{z \in \mathbb {T}^{d}} z^{*} \hat{L} z. \end{aligned}$$
(8)
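Indeed, expanding the square and using that \(|z_{\ell }| = |\hat{X}_{\ell ,j}| = 1\) for \((\ell ,j) \in E\) together with the Hermitian symmetry of W and \(\hat{X}\), one computes

$$\begin{aligned} \frac{1}{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |z_{\ell } - \hat{X}_{\ell ,j} z_{j} |^{2} = \frac{1}{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} \big ( 2 - 2 {\text {Re}}( z_{\ell }^{*} \hat{X}_{\ell ,j} z_{j} ) \big ) = z^{*} D z - z^{*} (W \circ \hat{X}) z, \end{aligned}$$

where the first term uses \(\sum _{(\ell ,j)\in E} w_{\ell ,j} = \sum _{\ell \in [d]} \deg (\ell ) |z_{\ell }|^{2} = z^{*} D z\).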

Due to the quadratic constraint \(z\in \mathbb {T}^{d}\), the quadratic minimization problem (8) is non-convex and thus NP-hard in general. One way to obtain a tractable problem is to relax the constraint in (8) to \( \left\Vert z \right\Vert _{2}^{2} = d\) and obtain

$$\begin{aligned} \min _{ \left\Vert z \right\Vert _{2}^{2} = d} z^{*} \hat{L} z. \end{aligned}$$
(9)

This is nothing else but the computation of an eigenvector associated with the smallest eigenvalue of the matrix \(\hat{L}\), which can be done efficiently. We will refer to (9) as the eigenvector relaxation (ER).
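The following minimal NumPy sketch illustrates the procedure; the function names are ours, and a dense eigendecomposition is used for simplicity (for large d one would switch to a sparse eigensolver).

```python
import numpy as np

def sgn(a):
    # entrywise sgn as defined above: a / |a| for a != 0, and 0 otherwise
    out = np.zeros_like(a)
    nz = a != 0
    out[nz] = a[nz] / np.abs(a[nz])
    return out

def angular_sync_er(X_hat, W):
    """Eigenvector relaxation (9): smallest eigenvector of L_hat = D - W o X_hat."""
    d = X_hat.shape[0]
    D = np.diag(W.sum(axis=1))               # degree matrix of G = (V, E, W)
    L_hat = D - W * X_hat                    # data-dependent Laplacian (Hadamard product)
    _, vecs = np.linalg.eigh(L_hat)          # eigenpairs of the Hermitian matrix L_hat
    v = vecs[:, 0]                           # eigenvector of the smallest eigenvalue
    z = np.sqrt(d) * v / np.linalg.norm(v)   # rescale so that ||z||_2^2 = d
    return sgn(z), z
```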

An error bound for the ER based reconstruction was given by Iwen et al. in [14] for the case of unweighted graphs. Their proof is based on the Cheeger inequality, which is only available for the normalized Laplacian, which is why the minimization problem in their theorem has a different normalization than (9). In the special case that \(\deg (\ell )\) is constant over all \(\ell \) (as in [14]), the two normalizations agree up to a constant. Using the terminology introduced above, their result reads as follows.

Theorem 1

([14, Theorem 3], [24, Theorem 4]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E)\) is an undirected, connected and unweighted graph with \(\tau _N > 0\). Let \(\tilde{z} \in \mathbb {C}^{d}\) be the minimizer of

$$\begin{aligned} \min _{ \left\Vert z \right\Vert _{2}^{2} = d} z^{*} D^{-1/2} \hat{L} D^{-1/2} z \end{aligned}$$

and let \(\tilde{x} = {\text {sgn}}(\tilde{z})\). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2} \le 19 \frac{ \left\Vert \hat{X} - {X} \right\Vert _{F}}{\tau _N \sqrt{\min \limits _{i \in V} \deg (i) } }. \end{aligned}$$

An alternative approach is based on the idea of lifting the problem to the matrix space. It makes use of the relation

$$\begin{aligned} z^{*} \hat{L} z=\sum _{\ell ,j} z_{\ell }^{*} \ \hat{L}_{\ell ,j}\ z_{j}={\text {tr}}(\hat{L}\ z z^*). \end{aligned}$$

With this the minimization problem (8) transforms into

$$\begin{aligned}&\min _{ Z \in \mathcal {H}^{d} } {\text {tr}}( \hat{L} Z) \nonumber \\&s.t.~ Z_{ii} = 1,\nonumber \\&Z \succeq 0, \nonumber \\&{\text {rank}}(Z) = 1. \end{aligned}$$
(10)

The class of minimization problems with explicit rank constraints is known to include many NP-hard instances [8, Chapter 2], so a common strategy is to perform a semidefinite relaxation. For (10), the following relaxation has been proposed in [11].

$$\begin{aligned}&\min _{ Z \in \mathcal {H}^{d} } {\text {tr}}( \hat{L} Z) \nonumber \\&s.t.~ Z_{ii} = 1,\nonumber \\&Z \succeq 0. \end{aligned}$$
(11)

We will refer to this minimization problem as SDP. Note that if Z meets the rank condition in (10), one obtains that \(Z= z z^{*}\), where z is a solution of (8). Without the rank condition, however, the solution to (11) may have higher rank. In this case the method outputs the phase factors corresponding to the entries of the eigenvector associated with the largest eigenvalue as an approximation for the solution of (8) [28].
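For illustration, the relaxation (11) together with this rank-1 rounding can be prototyped in a few lines of CVXPY; this is only a sketch under default solver settings (the experiments in Sect. 5 use the SeDuMi solver in Matlab), and the function name is ours.

```python
import numpy as np
import cvxpy as cp

def angular_sync_sdp(L_hat):
    """Semidefinite relaxation (11), followed by rank-1 rounding via the
    eigenvector associated with the largest eigenvalue of the solution."""
    d = L_hat.shape[0]
    Z = cp.Variable((d, d), hermitian=True)
    constraints = [cp.diag(Z) == 1, Z >> 0]       # unit diagonal, Z PSD
    cp.Problem(cp.Minimize(cp.real(cp.trace(L_hat @ Z))), constraints).solve()
    _, vecs = np.linalg.eigh(Z.value)
    v = vecs[:, -1]                               # leading eigenvector of the solution
    out = np.zeros_like(v)
    nz = v != 0
    out[nz] = v[nz] / np.abs(v[nz])               # entrywise phase factors
    return out
```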

As mentioned before, error bounds are commonly derived for the solution of the LSP and then applied to the solution of the SDP when it has rank 1. For unweighted graphs, a first result on recovery guarantees for the SDP was established by Preskitt [24]. Adjusted to our terminology, his result reads as follows.

Theorem 2

([24, Theorem 9]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E)\) is an undirected and unweighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2} \le 2 \frac{ \left\Vert \hat{X} - {X} \right\Vert _{F}}{\sqrt{ \tau _{G} }}. \end{aligned}$$

In general, Theorem 2 exhibits better performance than Theorem 1 and sparks a need for an ER counterpart. For a detailed comparison of these statements, we refer the reader to Section 4.3.2 of [24]. The first results addressing a generalization to the important case of weighted graphs were derived by Preskitt [24]. The following formulations have again been adjusted to our notation.

Theorem 3

([24, Proposition 12 and Theorem 8]) Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x }- e^{i\theta } {x} \right\Vert _{2} \le 2 \sqrt{ \frac{d \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{2 \rightarrow 2}}{ \tau _{G} } }, \end{aligned}$$
(3.A)

and

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2} \le 4 \sqrt{d}\ \frac{ \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{2 \rightarrow 2}}{ \tau _{G} }. \end{aligned}$$
(3.B)

As the square root in (3.A) produces slow convergence as the noise diminishes, i.e., as \({\hat{X}}\) approaches X, in many cases bound (3.B) outperforms (3.A). For unweighted graphs, as we numerically explore in Sect. 5, the bound of Inequality (3.B) is similar to that of Theorem 2 and superior to the bound of Theorem 1 for ER in many cases. These bounds are, however, only valid for the SDP when the relaxation is tight. The following lemma provides a sufficient condition for the tightness of the SDP relaxation.

Lemma 1

([24, Lemma 16]) Suppose \(\tilde{x}\in {\mathbb {T}}^{d}\) is a minimizer of (8) and let \(\tilde{L}=D-W\circ \tilde{x}\ \tilde{x}^{*}\). If

$$\begin{aligned} \left\Vert \hat{L} - {\tilde{L}} \right\Vert _{F} < \frac{\tau _{G}}{1 + \sqrt{d}}, \end{aligned}$$

then \(\tilde{x} \tilde{x}^{*}\) is a minimizer of (11).

As the spectral gap \(\tau _{G}\) is typically rather small compared to the dimension d, tightness is guaranteed only for very small noise levels. In fact, our numerical simulations in Sect. 5 show that the SDP relaxation is indeed not tight in many cases. In contrast, the recovery guarantees for ER provided by Theorem 1 are applicable independently of the tightness of the relaxation, but are restricted to unweighted graphs.

3 Improved Error Bounds

The main contribution of this paper concerns recovery guarantees for weighted angular synchronization via eigenvector relaxation, which are often stronger than even the best known bounds for the unrelaxed problem and do not require any a priori bound for the error to ensure tightness of the relaxation. Along the way, we derive similar error bounds for the solution of the least squares problem, which are exactly analogous to those provided by Theorem 2 in the unweighted case. The superior scaling of our error bounds as compared to Theorem 3 is also confirmed by numerical simulations in Sect. 5. We first state our result in general form, before discussing three special cases of interest.

Theorem 4

Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V ,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8) and z be the minimizer of the ER (9). Set \(R\in {\mathbb {C}}^{d\times d}\) as \(R_{\ell ,j}=W_{\ell ,j}^{1/2}\). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x}- e^{i\theta } {x} \right\Vert _{2} \le 2\ \frac{ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}}{ \sqrt{ \tau _{G} } }, \end{aligned}$$
(12)

and

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2} \le 2\ \frac{ p_z \ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}}{ \sqrt{ \tau _{G} } }, \end{aligned}$$
(13)

with tightness penalty \(p_z := \sqrt{2 + 2 \left\Vert z \right\Vert _{\infty }^{2} }\).

The term \( \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}\) is presented in the form of a weighted difference of the true and measured pairwise differences. However, it also has an alternative interpretation as the value of the empirical least squares objective evaluated at the vector x, namely

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2} = 2 {x}^{*} \hat{L} {x}, \end{aligned}$$

which represents the gap between the value of the noise-free objective (which equals 0) and the noisy objective at the global minimum of the former. In addition, we note that \( \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}\) can be estimated from above as

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F} \le \sqrt{ \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{F} \ \left\Vert \hat{X} - {X} \right\Vert _{F} }. \end{aligned}$$
(14)
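Indeed, (14) is a direct consequence of the Cauchy–Schwarz inequality:

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}&= \sum _{(\ell ,j) \in E} \big ( w_{\ell ,j} |\hat{X}_{\ell ,j} - {X}_{\ell ,j}| \big ) \cdot |\hat{X}_{\ell ,j} - {X}_{\ell ,j}| \\&\le \Big ( \sum _{(\ell ,j) \in E} w_{\ell ,j}^{2} |\hat{X}_{\ell ,j} - {X}_{\ell ,j}|^{2} \Big )^{1/2} \Big ( \sum _{(\ell ,j) \in E} |\hat{X}_{\ell ,j} - {X}_{\ell ,j}|^{2} \Big )^{1/2} = \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{F} \ \left\Vert \hat{X} - {X} \right\Vert _{F}. \end{aligned}$$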

The tightness penalty \(p_z\) in the bound varies between 2 when the relaxation is tight and \(\sqrt{2+2d}\) in the worst case. In numerical experiments, however (see Fig. 3), we do not observe this difference, which suggests that this dimensional factor may be a proof artefact. For the random model (2), this dimension independence has been established in [30], but the proof techniques do not carry over to our setting in a straightforward way, which is why we leave this investigation to future work. For completeness, we note that Inequality (3.B) also carries over to ER, as stated in the following theorem. The proof is directly analogous to the one in [24], which is why we omit the details.

Theorem 5

Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with \(\tau _{G}>0\). Let z be the minimizer of the ER (9). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2} \le 8 \sqrt{d}\ \frac{ \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{2 \rightarrow 2}}{ \tau _{G} }. \end{aligned}$$
(15)

The first consequence of interest concerns the unweighted case, where the noise norm in Theorem 4 simplifies to \( \left\Vert \hat{X} - {X} \right\Vert _{F}\) as in Theorem 2.

For the LSP, Theorems 4 and 2 yield exactly the same bound; the bound in Theorem 4 for ER differs only by the tightness penalty term \(p_{z}\).

Corollary 1

Suppose \(G = (V,E)\) is an unweighted graph with \(\tau _{G}>0\). Let \(z \in \mathbb {C}^{d}\) be the minimizer of the ER (9). Then

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2} \le 2 \frac{ p_z \ \left\Vert {X} - \hat{X} \right\Vert _{F}}{\sqrt{\tau _{G}} }. \end{aligned}$$

Our next examples are related to the ptychography problem. We remind the reader that the goal of ptychography is to estimate a signal \(y \in \mathbb {C}^{d}\) from phaseless measurements through localized masks. A recent method for recovering the signal y from such observations is the BlockPR algorithm by Iwen et al. [14]; see [9, 22,23,24] for follow-up works developing this algorithm further that also rely on weighted angular synchronization. The BlockPR algorithm proceeds by combining neighboring masks to obtain estimates for the products of entries located close to each other. In mathematical terms, this procedure yields an approximation of the squared absolute values of the entries (so these can be assumed to be approximately known) and a noisy version of \(T_\delta ( y y^{*})\), where \(\delta \) is the size of the mask and \(T_\delta \) is the restriction operator mapping a matrix to its entries indexed by the set

$$\begin{aligned} E_\delta =\{ (\ell ,j) ~|~ \ell \ne j\in [d], \text { and } \left| \ell - j \right| < \delta \text { or } |\ell -j| > d - \delta \} \end{aligned}$$
(16)

corresponding to the \(2\delta -1\) central sub- and superdiagonals, excluding the entries on the main diagonal. Thus the resulting measurements exactly correspond to (5) for \(E=E_\delta \), which is why weighted angular synchronization is the natural method of choice. The weights in this problem are given by the matrix \(yy^{*}\) restricted to the index set E, which yields the setup of the following corollary. We note that in the next two statements Kronecker's delta \(\delta _{\ell ,j}\) is 1 when \(\ell =j\) and 0 otherwise.
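For concreteness, the index set \(E_\delta \) can be generated as a boolean adjacency mask as follows (a minimal NumPy sketch; the function name is ours).

```python
import numpy as np

def banded_edge_mask(d, delta):
    """Boolean mask of E_delta from (16): all pairs (l, j) with l != j whose
    circular distance min(|l - j|, d - |l - j|) is smaller than delta."""
    idx = np.arange(d)
    dist = np.abs(idx[:, None] - idx[None, :])
    dist = np.minimum(dist, d - dist)     # wrap-around distance
    return (dist > 0) & (dist < delta)    # excludes the main diagonal
```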

Corollary 2

Consider a weighted graph \(G=(V,E,W)\) whose weight matrix W is defined as follows. Let \(y\in {\mathbb {C}}^{d}\) with \({\text {sgn}}(y)=x\). Define matrices \(Y=(I+A_{G})\circ y\ y^*\) and \({X} = {\text {sgn}}({Y})\). Let M and \(\hat{X}\) be the matrices containing the perturbed magnitudes and phases of Y, respectively, so that \(M \approx |{Y}|\), N has unimodular entries and \(\hat{X} = {X} \circ N\), and set \(\hat{Y}= M \circ \hat{X}\). Consider the weight matrix W with entries given by \(w_{\ell ,j}=|\hat{Y}_{\ell ,j}|\ (1-\delta _{\ell ,j})\) and assume that \(\tau _{G}>0\). Let \(\tilde{x}\) be a minimizer of (8) and let z be the minimizer of (9). Then we have

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2} \le 2 \sqrt{2} \frac{ \sqrt{ \left\Vert \hat{Y} - {Y} \right\Vert _{F} \ \left\Vert \hat{X} - {X} \right\Vert _{F} }}{ \sqrt{ \tau _{G} } }, \end{aligned}$$

and

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2} \le 2 \sqrt{2} \frac{p_z \ \sqrt{ \left\Vert \hat{Y} - {Y} \right\Vert _{F} \ \left\Vert \hat{X} - {X} \right\Vert _{F} } }{ \sqrt{ \tau _{G} } }. \end{aligned}$$

In the next statement the set-up is analogous to the previous corollary but instead of having weights defined by \(|\hat{Y}_{\ell ,j}|\) we work with \(|\hat{Y}_{\ell ,j}|^{2}\).

Corollary 3

Consider a weighted graph \(G=(V,E,W)\) whose weight matrix W is defined as follows. Let \(y\in {\mathbb {C}}^{d}\) with \({\text {sgn}}(y)=x\). Define matrices \(Y=(I+A_{G})\circ y\ y^*\) and \({X} = {\text {sgn}}({Y})\). Let M and \(\hat{X}\) be the matrices containing the perturbed magnitudes and phases of Y, respectively, so that \(M \approx |{Y}|\), N has unimodular entries and \(\hat{X} = {X} \circ N\) and set \({\hat{Y}}= M \circ \hat{X}\). Consider the weight matrix W with entries given by \(w_{\ell ,j}=|\hat{Y}_{\ell ,j}|^{2}\ (1-\delta _{\ell ,j})\) and assume that \(\tau _{G}>0\). Let \(\tilde{x}\) be a minimizer of (8) and let z be the minimizer of (9). Then we have

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2} \le 4 \frac{ \left\Vert \hat{Y} - {Y} \right\Vert _{F}}{ \sqrt{ \tau _{G} } }, \end{aligned}$$

and

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2} \le 4 \frac{p_z \ \left\Vert \hat{Y} - {Y} \right\Vert _{F} }{ \sqrt{ \tau _{G} } }. \end{aligned}$$

The main benefit of Corollary 3 is the absence of the difference \(\hat{X} - {X}\) in the bound. This is especially handy in the ptychographic setup, where estimating the phase difference error is a complicated task, while an upper bound for \( \left\Vert \hat{Y} - {Y} \right\Vert _{F}\) is available.

We note that if the entry \(y_{j}\) is small, it will cause \(w_{\ell ,j}\) and \(w_{j,\ell }\) to be small, as well as \(\deg (j)\). This results in the node j of the graph G being poorly connected to the rest of the graph, and hence the spectral gap \(\tau _{G}\) will be close to zero, rendering our recovery guarantees useless. A possible cure for such scenarios is an additional preprocessing step, where nodes with small degrees are removed from the graph before the angular synchronization problem is solved (see the sketch below). This stabilizes the spectral gap of the pruned graph and allows recovering the phases of the "strongly" connected nodes. For the discarded nodes, the phase can be assigned either randomly or to some fixed value, as they have small impact on the data fidelity. Unfortunately, the phase error corresponding to such assigned entries cannot be bounded better than by the triangle inequality,

$$\begin{aligned} |{\tilde{x}}_{j} - e^{i\theta } x_{j}| \le 2. \end{aligned}$$

On the other hand, if the scaled norm error is considered, it would naturally reduce the impact of such crude bound by incorporating the magnitude information,

$$\begin{aligned} |y_{j}| |{\tilde{x}}_{j} - e^{i\theta } x_{j}| \le 2 |y_{j}|. \end{aligned}$$

Thus, recovery guarantees for the scaled norms are required for the "strongly" connected part of the graph, which we provide in the next statement.
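Before turning to that statement, we sketch the preprocessing step just described; the degree threshold tau and the function name are ours, and angular_sync_er refers to the ER sketch from Sect. 2.

```python
import numpy as np

def prune_and_synchronize(X_hat, W, tau):
    """Drop nodes of degree below tau, synchronize the remaining 'strongly'
    connected subgraph, and assign the fixed phase 1 to the discarded nodes."""
    keep = W.sum(axis=1) >= tau                 # 'strongly' connected nodes
    x = np.ones(W.shape[0], dtype=complex)      # fixed phase for dropped nodes
    sub = np.ix_(keep, keep)
    x[keep], _ = angular_sync_er(X_hat[sub], W[sub])
    return x
```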

Theorem 6

Let \({X}\) and \(\hat{X}\) be defined as in (7). Suppose that \(G = (V,E,W)\) is a weighted graph with Laplacian \(L_{G}\) and spectral gap \(\tau _{G}>0\). Let S be a \(d \times d\) diagonal matrix with \(S_{\ell ,\ell } >0\). Let \(\tilde{x} \in \mathbb {T}^{d}\) be the minimizer of the LSP (8) and z be the minimizer of the ER (9). Set \(R\in {\mathbb {C}}^{d\times d}\) as \(R_{\ell ,j}=W_{\ell ,j}^{1/2}\). Then,

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert S(\tilde{x}- e^{i\theta } {x}) \right\Vert _{2} \le 2 \frac{ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}}{ \sqrt{ \lambda _2( S^{-1} L_{G} S^{-1}) } }, \end{aligned}$$
(17)

and

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert S({\text {sgn}}(z) - e^{i\theta } {x}) \right\Vert _{2} \le 2 \frac{ \sqrt{{\text {tr}}S^{2}} }{ \left\Vert S z \right\Vert _{2}} \frac{ p_z \ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}}{ \sqrt{ \lambda _2( S^{-1} L_{G} S^{-1}) } }. \end{aligned}$$
(18)

Note that in addition to the penalty \(p_z\), one encounters a normalization factor \(\sqrt{{\text {tr}}S^{2}}/ \left\Vert S z \right\Vert _{2}\), which we suspect may be an artifact of the proof. The condition \(\tau _{G}>0\) is required to control the nullspace of the matrix \(S^{-1} L_{G} S^{-1}\), so that we can guarantee \(\lambda _2( S^{-1} L_{G} S^{-1})>0\).

4 Proofs

Proof of Theorem 4

We will proceed by establishing the following four inequalities.

$$\begin{aligned}&\min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2}^{2} \le \frac{4}{\tau _{G}}\sum _{(\ell ,j) \in E} w_{\ell ,j} | {x}_{\ell }^{*} z_{\ell } - {x}_{j}^{*} z_{j} |^{2} , \end{aligned}$$
(19)
$$\begin{aligned}&\sum _{(\ell ,j) \in E} w_{\ell ,j} | {x}_{\ell }^{*} z_{\ell } - {x}_{j}^{*} z_{j} |^{2} \le p_z^{2} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}, \end{aligned}$$
(20)
$$\begin{aligned}&\min _{\theta \in [0, 2 \pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2}^{2} \le \frac{1}{\tau _{G}}\sum _{(\ell ,j) \in E} w_{\ell ,j} |{x}_{\ell }^{*} \tilde{x}_{\ell } - {x}_{j}^{*} \tilde{x}_{j}|^{2}, \end{aligned}$$
(21)

and

$$\begin{aligned} \sum _{(\ell ,j) \in E} w_{\ell ,j} |{x}_{\ell }^{*} \tilde{x}_{\ell } - {x}_{j}^{*} \tilde{x}_{j}|^{2} \le 4 \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$
(22)

Note that Inequality (21) has been derived in [24]; we will nevertheless include a proof for completeness.

Inequality (12) then follows by combining (21) and (22), while Inequality (13) is obtained as a combination of (19) and (20).

It remains to prove the four inequalities. To that end, we recall that for \(\alpha ,\beta \in \mathbb {C}\) with \(|\beta | = 1\) we have

$$\begin{aligned} | {\text {sgn}}(\alpha ) - \beta |&\le |\alpha - \beta | + | {\text {sgn}}(\alpha ) - \alpha | = | \alpha - \beta | + | 1 - |\alpha | | \nonumber \\&= | \alpha - \beta | + | |\beta | - |\alpha | | \le 2| \alpha - \beta |. \end{aligned}$$
(23)

With help of this inequality we obtain that

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2}^{2}&= \min _{\theta \in [0, 2 \pi )} \sum _{\ell = 1}^{d} | {\text {sgn}}(z_{\ell }) - e^{i\theta } {x}_{\ell }|^{2} \nonumber \\&\le 4 \min _{\theta \in [0, 2 \pi )} \sum _{\ell = 1}^{d} | z_{\ell } - e^{i\theta } {x}_{\ell }|^{2} \nonumber \\&= 4 \min _{\theta \in [0, 2 \pi )} \left\Vert z - e^{i\theta } {x} \right\Vert _{2}^{2}. \end{aligned}$$
(24)

Moreover, since \( \left\Vert x \right\Vert _{2}^{2}=d\) and \( \left\Vert z \right\Vert _{2}^{2}=d\) we have that

$$\begin{aligned} \left\Vert z - e^{i\theta } {x} \right\Vert _{2}^{2} = \left\Vert z \right\Vert _{2}^{2} + \left\Vert e^{i\theta } {x} \right\Vert _{2}^{2} - 2 {\text {Re}}\left( e^{- i \theta } {x}^{*} z \right) = 2d - 2 {\text {Re}}\left( e^{- i \theta } {x}^{*} z \right) . \end{aligned}$$
(25)

The right hand side is minimal if \({\text {Re}}\left( e^{-i\theta } {x}^{*} z \right) \) is maximal and equal to \(| {x}^{*} z |\). Hence with \(e^{i \vartheta } := {\text {sgn}}({x}^{*} z)\) we arrive at

$$\begin{aligned} {\text {Re}}\big ( e^{- i \vartheta } {x}^{*} z \big ) = {\text {Re}}\big ( {\text {sgn}}({x}^{*} z)^{*} \cdot {\text {sgn}}({x}^{*} z) \cdot | {x}^{*} z | \big ) = | {x}^{*} z |, \end{aligned}$$
(26)

and thus

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert z - e^{i\theta } {x} \right\Vert _{2}^{2} = 2d - 2| {x}^{*} z |. \end{aligned}$$
(27)

The projection of \(e^{-i\vartheta }z\) onto the orthogonal complement of x is given by

$$\begin{aligned} q := e^{- i \vartheta } z - \big \langle e^{- i \vartheta } z, \frac{{x}}{ \left\Vert {x} \right\Vert _{2}} \big \rangle \ \frac{{x}}{ \left\Vert {x} \right\Vert _{2}} = e^{- i \vartheta } z - \frac{1}{d} |{x}^{*} z|\ x, \end{aligned}$$

where we used that by (26), the inner product is real. Consequently, as \(q\perp x\), one obtains by Pythagoras' theorem that

$$\begin{aligned} \left\Vert q \right\Vert _{2}^{2} = \left\Vert e^{- i \vartheta } z \right\Vert _{2}^{2} - \left\Vert \frac{1}{d} {x} |{x}^{*} z| \right\Vert _{2}^{2} = \left\Vert z \right\Vert _{2}^{2}- \frac{1}{d^{2}} \left\Vert {x} \right\Vert _{2}^{2} |{x}^{*} z|^{2} = d - \frac{1}{d} |{x}^{*} z|^{2}. \end{aligned}$$
(28)

With the Cauchy-Schwarz inequality and (27), this yields that

$$\begin{aligned} \left\Vert q \right\Vert _{2}^{2} = d - \frac{1}{d} |{x}^{*} z|^{2} \ge d - |{x}^{*} z| = \frac{1}{2} \left\Vert z - e^{i \vartheta } {x} \right\Vert _{2}^{2}. \end{aligned}$$

Recall that \(q \perp x\) and let us show that x spans the nullspace of the matrix \({L}\). Define the unitary matrix \(C={\text {diag}}\{x_1,\dots , x_d\}\), where \(x=(x_{j})_{j=1}^{d}\). Note that

$$\begin{aligned} W \circ X = W \circ (x x^{*}) = C W C^{*} \end{aligned}$$

and by unimodularity of the \(x_i\)

$$\begin{aligned} D = D C C^{*} = C D C^{*}, \end{aligned}$$

where we used commutativity of diagonal matrices. This results in

$$\begin{aligned} {L} = D - W \circ {X} = C D C ^{*} - C W C^{*} =C L_{G} C^{*}, \end{aligned}$$

which shows in particular that the eigenvalues of \({L}\) and \(L_{G}\) coincide.

By assumption \(\tau _{G}>0\) and hence the null space of \(L_{G}\) is spanned by \(\mathbb {1}\). Thus the null space of L is spanned by x.

By definition q is orthogonal to the null space of \({L}\) which implies that

$$\begin{aligned} q^{*} {L} q = (e^{- i \vartheta } z - \frac{1}{d} {x} \cdot |{x}^{*} z|)^{*} {L} (e^{- i \vartheta } z - \frac{1}{d} {x} \cdot |{x}^{*} z|) = z^{*} {L} z \end{aligned}$$

and

$$\begin{aligned} z^{*} {L} z = q^{*} {L} q \ge \lambda _2({L}) \left\Vert q \right\Vert _{2}^{2} \ge \frac{\lambda _2({L})}{2} \left\Vert z - e^{i \vartheta } {x} \right\Vert _{2}^{2}. \end{aligned}$$

Combining this with (24) and the fact that \(\lambda _2({L}) = \tau _{G}\) as well as the definition of L, we obtain both (21) and (19) by

$$\begin{aligned}&\min _{\theta \in [0, 2 \pi )} \left\Vert {\text {sgn}}(z) - e^{i\theta } {x} \right\Vert _{2}^{2} \le 4 \left\Vert z - e^{i \vartheta } {x} \right\Vert _{2}^{2} \le \frac{8}{\tau _{G}} z^{*} {L} z \\&\quad = \frac{8}{\tau _{G}} \frac{1}{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |z_{\ell } - {X}_{\ell ,j} z_{j} |^{2} = \frac{4}{\tau _{G}} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |z_{\ell } - {x}_{\ell } {x}_{j}^{*} z_{j} |^{2} \\&\quad = \frac{4}{\tau _{G}} \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {x}_{\ell }^{*} z_{\ell } - {x}_{j}^{*} z_{j} |^{2}. \end{aligned}$$

Indeed, (19) follows by comparing the first and the last item in this chain of inequalities, and (21) by comparing the second and the last item, applied with \({\tilde{x}}\) in place of z; this is admissible since the derivation only used \( \left\Vert z \right\Vert _{2}^{2}=d\), which also holds for \({\tilde{x}} \in \mathbb {T}^{d}\).

Now we will prove Inequalities (20) and (22), again with largely identical proofs.

For simplicity of notation, we introduce the following auxiliary variables

$$\begin{aligned} g_{\ell } := {x}_{\ell }^{*} \tilde{x}_{\ell }, \ h_{\ell } := {x}_{\ell }^{*} z_{\ell }, \text { and } \Lambda _{\ell ,j} := {X}_{\ell ,j}^{*} \hat{X}_{\ell ,j} . \end{aligned}$$

We start by using \(|\alpha + \beta |^{2} \le 2|\alpha |^{2} + 2 |\beta |^{2} \) to get

$$\begin{aligned} | h_{\ell } - h_{j} |^{2} =| h_{\ell } - \Lambda _{\ell ,j} h_{j} + \Lambda _{\ell ,j} h_{j} - h_{j} |^{2} \le 2| h_{\ell } - \Lambda _{\ell ,j} h_{j}|^{2} + 2|h_{j}|^{2} |\Lambda _{\ell ,j} - 1 |^{2}, \end{aligned}$$

and we further estimate

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} | h_{\ell } - h_{j} |^{2} \le 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} | h_{\ell } - \Lambda _{\ell ,j} h_{j}|^{2} + 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} |h_{j}|^{2} |\Lambda _{\ell ,j} - 1 |^{2}. \end{aligned}$$

For the first sum we observe that

$$\begin{aligned} | h_{\ell } - \Lambda _{\ell ,j} h_{j}| = | {x}_{\ell }^{*} z_{\ell } - {X}_{\ell ,j}^{*} \hat{X}_{\ell ,j} {x}_{j}^{*} z_{j}| = | {x}_{\ell }^{*} z_{\ell } - {x}_{\ell }^{*} {x}_{j} \hat{X}_{\ell ,j} {x}_{j}^{*} z_{j}| = | z_{\ell } - \hat{X}_{\ell ,j} z_{j}|, \end{aligned}$$

and obtain using (8) and the fact that z minimizes (9) that

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} | h_{\ell } - \Lambda _{\ell ,j} h_{j}|^{2}&= \sum _{ (\ell , j ) \in E} w_{\ell ,j} | z_{\ell } - \hat{X}_{\ell ,j} z_{j}|^{2} = 2 z^{*} \hat{L} z \le 2 {x}^{*} \hat{L} {x} \nonumber \\&= \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {x}_{\ell } - \hat{X}_{\ell ,j} {x}_{j}|^{2} = \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {x}_{\ell } {x}_{j}^{*} - \hat{X}_{\ell ,j} |^{2} \nonumber \\&= \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {X}_{\ell ,j} - \hat{X}_{\ell ,j} |^{2} = \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$
(29)

For the second sum we use that \(|h_{j}| = |{x}_{j}^{*} z_{j}| = |z_{j}|\) and obtain

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |h_{j}|^{2} |\Lambda _{\ell ,j} - 1 |^{2}&\le \max _{j \in [d] } |h_{j}|^{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |\Lambda _{\ell ,j} - 1 |^{2} \\&= \left\Vert z \right\Vert _{\infty }^{2} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |\Lambda _{\ell ,j} - 1 |^{2}. \end{aligned}$$

The last step is to notice that

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} |\Lambda _{\ell ,j} - 1 |^{2} = \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {X}^{*}_{\ell ,j} \hat{X}_{\ell ,j} - 1 |^{2} = \\ \quad = \sum _{ (\ell , j ) \in E} w_{\ell ,j} | {X}_{\ell ,j} - \hat{X}_{\ell ,j} |^{2} = \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$

Putting everything together we arrive at

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} | h_{\ell } - h_{j} |^{2}&\le 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} | h_{\ell } - \Lambda _{\ell ,j} h_{j}|^{2} + 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} |h_{j}|^{2} |\Lambda _{\ell ,j} - 1 |^{2} \\&\le 2 \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2} + 2 \left\Vert z \right\Vert _{\infty }^{2} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2} \\&= p_z^{2} \ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$

This concludes the proof of Inequality (20).

For Inequality (22) we proceed analogously, with \({\tilde{x}}\) taking the role of z; the only difference is that in (29) we use the fact that \({\tilde{x}}\) minimizes (8) rather than the fact that z minimizes (9). The bound for the second sum simplifies as compared to (20), as \( \left\Vert z \right\Vert _{\infty }\) is replaced by \( \left\Vert {\tilde{x}} \right\Vert _{\infty } = 1\).

The combined bound reads as

$$\begin{aligned} \sum _{ (\ell , j ) \in E} w_{\ell ,j} \ | g_{\ell } - g_{j} |^{2}&\le 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} \ |\Lambda _{\ell ,j} - 1 |^{2} + 2 \sum _{ (\ell , j ) \in E} w_{\ell ,j} \ | g_{\ell } - \Lambda _{\ell ,j} g_{j}|^{2} \\&\le 4 \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$

\(\square \)

Proof of Theorem 6

The proof of Theorem 6 is similar to that of Theorem 4, so we will only highlight the main differences. The following three inequalities replace Inequalities (19) and (20).

$$\begin{aligned}&\min _{\theta \in [0, 2 \pi )} \left\Vert S({\text {sgn}}(z) - e^{i\theta } {x}) \right\Vert _{2}^{2} \le 4 \min _{\theta \in [0, 2 \pi )} \left\Vert S \left( \frac{ \sqrt{{\text {tr}}S^{2}} }{ \left\Vert S z \right\Vert _{2} } z - e^{i\theta } {x} \right) \right\Vert _{2}^{2}\end{aligned}$$
(30)
$$\begin{aligned}&\min _{\theta \in [0, 2 \pi )} \left\Vert S \left( \frac{ \sqrt{{\text {tr}}S^{2}} }{ \left\Vert S z \right\Vert _{2} } z - e^{i\theta } {x} \right) \right\Vert _{2}^{2} \le \frac{{\text {tr}}S^{2} }{ \left\Vert S z \right\Vert _{2}^{2} } \frac{2 z^{*} {L} z}{\lambda _2(S^{-1} L_{G} S^{-1})}, \end{aligned}$$
(31)

and

$$\begin{aligned} 2 z^{*} {L} z \le 2 p_z^{2}\, x^{*} \hat{L} x = p_z^{2} \ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}. \end{aligned}$$
(32)

Combined, they yield Inequality (18). The first one is required to pass from the signs of z back to z using Inequality (23). The scaling factor guarantees that the vector \(\frac{ \sqrt{{\text {tr}}S^{2}} }{ \left\Vert S z \right\Vert _{2} } Sz\) has the same norm as Sx, namely \(\sqrt{{\text {tr}}S^{2}}\), analogously to z and x having the same norm in the proof of Theorem 4; this is crucial for the second inequality, which allows us to pass from the reconstruction error to the true objective function. Another difference to the proof of Theorem 4 is the appearance of the matrix \(S^{-1} L_{G} S^{-1}\) instead of \(L_{G}\). This change is important, as it makes the numerator free of the scaling matrix S. The spectral gap of the matrix \(S^{-1} {L} S^{-1}\) is bounded from below by

$$\begin{aligned} \lambda _2( S^{-1} L S^{-1}) \ge \lambda _2(L) \lambda _1^{2} (S^{-1}) = \tau _{G} \lambda _d^{-2}(S) = \tau _{G} \left( \max _{\ell \in [d]} S_{\ell ,\ell } \right) ^{-2} > 0, \end{aligned}$$

which implies that the nullspace of \(S^{-1} {L} S^{-1}\) is spanned by Sx. The last inequality introduces a data-dependent upper bound for the value of the true objective; it is precisely Inequality (20).

For the LSP bound (17), the norm of \(S {\tilde{x}}\) already equals \(\sqrt{{\text {tr}}S^{2}}\), and hence Inequality (30) can be omitted. Moreover, since \( \left\Vert {\tilde{x}} \right\Vert _{\infty } = 1\), the penalty factor \(p_z^{2}\) in Inequality (32) simplifies to the constant 4. This results in

$$\begin{aligned} \min _{\theta \in [0, 2 \pi )} \left\Vert S \left( {\tilde{x}} - e^{i\theta } {x} \right) \right\Vert _{2}^{2} \le 2 \frac{{\tilde{x}}^{*} {L} {\tilde{x}}}{\lambda _2(S^{-1} L_{G} S^{-1})} \le 4 \frac{ \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}}{\lambda _2(S^{-1} L_{G} S^{-1})}. \end{aligned}$$
(33)

\(\square \)

Proof of Corollary 1

For an unweighted graph G we immediately get

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2} = \sum _{ (\ell , j ) \in E} 1 \cdot | {X}_{\ell ,j} - \hat{X}_{\ell ,j} |^{2} = \left\Vert \hat{X} - {X} \right\Vert _{F}^{2}. \end{aligned}$$

\(\square \)

Proof of Corollary 2

Define an auxiliary weight matrix \(W_0\) by \((W_0)_{\ell ,j}=|Y_{\ell ,j}|\ (1-\delta _{\ell ,j})\). In view of Inequality (14), it suffices to bound \( \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{F}\). We obtain

$$\begin{aligned} \left\Vert W \circ (\hat{X} - {X}) \right\Vert _{F}&= \left\Vert W \circ \hat{X} - W \circ {X} \right\Vert _{F}\\&\le \left\Vert W \circ \hat{X} - W_0 \circ {X} \right\Vert _{F} + \left\Vert W_0 \circ {X} - W \circ {X} \right\Vert _{F}\\&\le \left\Vert \hat{Y} - Y \right\Vert _{F} + \left\Vert (W_0 - W) \circ {X} \right\Vert _{F}\\&= \left\Vert \hat{Y} - Y \right\Vert _{F} + \left\Vert W_0 - W \right\Vert _{F}\\&\le \left\Vert \hat{Y} - Y \right\Vert _{F} + \left\Vert \hat{Y} - Y \right\Vert _{F} = 2 \left\Vert \hat{Y} - {Y} \right\Vert _{F}, \end{aligned}$$

where in the third line we only increased the number of non-negative summands by adding the diagonal elements \(|\hat{Y}_{\ell ,\ell } - {Y}_{\ell ,\ell }|\), and in the last line we used the inequality \(||\alpha |-|\beta || \le |\alpha - \beta |\). \(\square \)

Proof of Corollary 3

We rewrite the right-hand side of the bound in Theorem 4 as

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}&= \sum _{ (\ell , j ) \in E} w_{\ell ,j} \ | {X}_{\ell ,j} - \hat{X}_{\ell ,j} |^{2} = \sum _{ (\ell , j ) \in E} |\hat{Y}_{\ell ,j}|^{2} \ | {X}_{\ell ,j} - \hat{X}_{\ell ,j} |^{2} \\&= \sum _{ (\ell , j ) \in E} |\hat{Y}_{\ell ,j}|^{2} \ | {\text {sgn}}( {Y}_{\ell ,j}) - \hat{X}_{\ell ,j} |^{2} \\&= \sum _{ (\ell , j ) \in E} |\hat{Y}_{\ell ,j}|^{2} \ \left| {\text {sgn}}\left( \frac{ {Y}_{\ell ,j}}{|\hat{Y}_{\ell ,j}|} \right) - \hat{X}_{\ell ,j} \right| ^{2}, \\ \end{aligned}$$

and apply the inequality (23) to get

$$\begin{aligned} \left\Vert R \circ (\hat{X} - {X}) \right\Vert _{F}^{2}&\le \sum _{ (\ell , j ) \in E} 4 |\hat{Y}_{\ell ,j}|^{2} \ \left| \frac{ {Y}_{\ell ,j}}{|\hat{Y}_{\ell ,j}|} - \hat{X}_{\ell ,j} \right| ^{2} \\&= \sum _{ (\ell , j ) \in E} 4 |\hat{Y}_{\ell ,j}|^{2} \ \left| \frac{ {Y}_{\ell ,j}}{|\hat{Y}_{\ell ,j}|} - \frac{\hat{Y}_{\ell ,j}}{|\hat{Y}_{\ell ,j}|}\right| ^{2} \\&= \sum _{ (\ell , j ) \in E} 4 \left| {Y}_{\ell ,j} - \hat{Y}_{\ell ,j} \right| ^{2} \le 4 \left\Vert {Y} - \hat{Y} \right\Vert _{F}^{2}.\\ \end{aligned}$$

\(\square \)

5 Numerical Evaluation

In this section we present a numerical comparison of the error bounds discussed above. Our goal is to illustrate that Theorem 4 indeed provides superior recovery guarantees for an important class of weighted angular synchronization problems, namely those appearing in the context of ptychography, as covered by Corollaries 2 and 3. In particular, we work with the edge set \(E_\delta \) as in (16), for some parameter \(\delta \in [ \lfloor (d+1)/2 \rfloor ] \), which determines the neighborhood of indices for which the pairwise phase differences are known.

In our numerical experiments, we consider measurements affected by angular noise, that is, the measurements are of the form (4), i.e.

$$\begin{aligned} \hat{X}_{\ell ,j}= {\left\{ \begin{array}{ll} e^{i (\varphi _{\ell } - \varphi _{j} + \eta _{\ell ,j}) }, &{} (\ell ,j) \in E_\delta ,\\ 0, &{} (\ell ,j) \notin E_\delta , \end{array}\right. } \end{aligned}$$

with the noise model that the \(\eta _{\ell ,j},\ (\ell ,j) \in E_\delta \), are independent random variables uniformly distributed on \([-\alpha ,\alpha ]\) for some parameter \(\alpha > 0\) representing the noise level. In the figures, we report the parameter \(\alpha \) in degrees rather than radians.

We consider signals \({y}\) drawn at random with coordinates \({y}_{\ell } = a_{\ell } + i b_{\ell }\), where \(a_{\ell }\) and \(b_{\ell }\) are independent identically distributed standard Gaussian random variables. We assume that the \(|y_{\ell }|\) are known, so the phases of the \({y}_{\ell }\) are our unknown ground truth entries \({x}_{\ell } = e^{i\varphi _{\ell }}\).

In most of the following examples, we fix the dimension to \(d = 64\) and the parameter \(\delta =16\), so that approximately half of the pairwise phase differences are known. For each point in the figures we generated 30 test signals and plot the average norm of the error. All experiments were performed on a laptop running Windows 10 Pro with an Intel(R) Core(TM) i7-8550U processor, 16 GB RAM and Matlab R2018b. Our ER implementation is based on the BlockPR software package [13]. For the SDP, we used the SeDuMi solver available at [25].
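A minimal NumPy sketch of this experimental setup reads as follows; the random seed and the displayed noise level are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed
d, delta, alpha_deg = 64, 16, 5.0         # dimension, mask size, noise level

# random signal with i.i.d. standard Gaussian real and imaginary parts
y = rng.standard_normal(d) + 1j * rng.standard_normal(d)
x = y / np.abs(y)                         # unknown ground-truth phase factors

# adjacency mask of E_delta as in (16)
dist = np.abs(np.arange(d)[:, None] - np.arange(d)[None, :])
dist = np.minimum(dist, d - dist)
mask = (dist > 0) & (dist < delta)

# Hermitian angular noise, uniform on [-alpha, alpha]
alpha = np.deg2rad(alpha_deg)
eta = np.triu(rng.uniform(-alpha, alpha, (d, d)), 1)
eta = eta - eta.T                         # eta_{j,l} = -eta_{l,j}

X_hat = mask * np.outer(x, x.conj()) * np.exp(1j * eta)   # measurements (4)
```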

We begin with a comparison of the recovery guarantees for the different weight matrices covered by Corollaries 1, 2, and 3 in terms of the angular noise level \(\alpha \) measured in degrees. To put the bounds into perspective, we include the empirical error of both SDP and ER.

In view of the fact that the coordinates of \(\tilde{x}\) and \({x}\) have modulus 1 and Inequality (27), a naïve bound for the phase error is given by

$$\begin{aligned} \min _{\theta \in [0, 2\pi )} \left\Vert \tilde{x} - e^{i\theta } {x} \right\Vert _{2}^{2} = \left\Vert \tilde{x} \right\Vert _{2}^{2} + \left\Vert {x} \right\Vert _{2}^{2} - 2 |\tilde{x}^{*} {x}| \le 2 d. \end{aligned}$$
(34)

Beyond this threshold, the error bounds provided by the statements are non-informative, which is why we indicate the threshold by a dashed black line in the plots.

Figure 1 shows the empirical performance of the SDP and the error bounds for the LSP. While for unweighted graphs Theorem 2 and Inequality (3.B) exhibit comparable behavior, for weighted graphs the established corollaries improve on Inequality (3.B). The crucial observation is that the SDP is not tight for weighted graphs even when the noise level is as small as \(10^{-3}\) degrees, and only tight for unweighted graphs when the noise level is below a few degrees. This implies that the error bounds for the LSP are no longer applicable to the SDP.

Turning to the ER (Fig. 2), we observe a similar behavior. The recovery guarantees for ER provided by Corollary 1 and Theorem 5 are close in the unweighted case, and Corollaries 2 and 3 improve on the bound provided by Theorem 5 for weighted graphs. As before, for noise above a few degrees the supremum norm of the ER solution deviates from 1, which indicates that the relaxation is not tight. In contrast to the SDP, the established recovery guarantees for the ER remain applicable for all noise levels.

Fig. 1: a–c Comparison of the recovery guarantees and true errors of SDP and LSP for angular synchronization in the context of the ptychography problem, \(d = 64\), \(\delta =16\). d Rank of the SDP solution as a measure for the tightness of the relaxation

Fig. 2: a–c Comparison of the recovery guarantees and true errors of ER for angular synchronization in the context of the ptychography problem, \(d = 64\), \(\delta =16\). d Supremum norm of the ER solution as a measure for the tightness of the relaxation

Fig. 3: Comparison of the recovery guarantees and true errors of LSP, SDP and ER for angular synchronization in the context of the ptychography problem, \(d = 64\), \(\delta =16\)

Fig. 4: Comparison of the recovery guarantees and true errors of LSP, SDP and ER for scaled norms, \(d = 64\), \(\delta =16\)

Both for unweighted and weighted graphs (Fig. 3), we observe that the empirical error behaves similarly for SDP and ER; there is no significant difference between the two methods in terms of the phase error. For low and medium noise levels, the phase error rises linearly with the angular noise level. Only for very high noise does it exhibit faster growth.

For all scenarios, the guarantees for ER more or less agree with the bounds for the least squares problem. This is remarkable because ER is faster than SDP (see Fig. 5a below). At the same time, the empirical errors differ from our error bounds by a factor of roughly \(\delta \), which suggests that an additional tightening of the \(\delta \)-dependence may be possible using refined techniques.

In Fig. 4, we explore the recovery guarantees for scaled norms provided by Theorem 6. In these experiments, the scaling matrix S has the magnitudes \(|y_{j}|\) on its main diagonal. Again, the empirical errors and recovery guarantees exhibit a behavior similar to the non-scaled errors in Fig. 3, except that for high noise levels the recovery guarantees for ER deviate from their LSP counterpart as a result of the additional normalization factor appearing in Inequality (18).

In terms of runtime (Fig. 5a), ER exhibits almost linear scaling in the dimension of the problem and clearly outperforms SDP, whose runtime exhibits quadratic scaling. This difference is to be expected, as SDP lifts the problem to a \(d \times d\)-dimensional matrix space and thus estimates \(d^{2}\) unknowns instead of d in the case of ER. In fact, the large runtime complexity is a crucial bottleneck for SDP relaxations in ptychography, where the dimensions are commonly high.

The last simulation (Fig. 5b) illustrates how both the empirical error and our error bounds depend on the size of the mask in ptychography (which in turn is related to the connectivity of the graph). Again, up to constants, we observe a similar decay pattern for the error between theory and experiment, with a fast decay for small \(\delta \) (up to \({\mathcal {O}}(\log d)\)) and slower decay for larger values of \(\delta \).

Fig. 5: Performance of the different methods and setups

Fig. 6: Performance of the different relaxations for all weight setups. Left: random angular noise model used in Sect. 5. Right: angular synchronization as a part of the ptychographic reconstruction in [14]; the noise decreases as the signal-to-noise ratio increases

6 Discussion and Future Work

The main focus of this paper is the eigenvector relaxation of the angular synchronization problem, for which we derived new flexible error bounds. Along the way, we established new recovery guarantees for the solution of the weighted least squares problem (3). Our numerical evaluation shows that the recovery guarantees we obtain are tighter than other results in the literature. As compared to semidefinite programming, eigenvector relaxation shows similar performance in terms of the empirical error and has a significantly shorter runtime; at the same time, our recovery guarantees are not subject to additional constraints corresponding to the tightness of the relaxation, as they appear for semidefinite programming.

Our numerical experiments are based on a simple random angular noise model, which likely does not correspond to the noise arising in applications. In Fig. 6, we observe that while for the simplified noise model unweighted angular synchronization seems most appropriate, for Gaussian noise applied directly to the phaseless measurements [14], weighted angular synchronization is the method of choice.

Another interesting direction of future work is to extend the generalized power method developed by Boumal in [4] to arbitrary sets E. Our current results can be considered as a first step, since the generalized power method uses the solution of the eigenvector relaxation problem as initialization, so good bounds on the error are crucial for estimating the quality of the initialization.