Classical variational simulation of the Quantum Approximate Optimization Algorithm

Medvidović, Matija; Carleo, Giuseppe

doi:10.1038/s41534-021-00440-z

Article
Open access
Published: 18 June 2021

npj Quantum Information volume 7, Article number: 101 (2021)

Abstract

A key open question in quantum computing is whether quantum algorithms can potentially offer a significant advantage over classical algorithms for tasks of practical interest. Understanding the limits of classical computing in simulating quantum systems is an important component of addressing this question. We introduce a method to simulate layered quantum circuits consisting of parametrized gates, an architecture behind many variational quantum algorithms suitable for near-term quantum computers. A neural-network parametrization of the many-qubit wavefunction is used, focusing on states relevant for the Quantum Approximate Optimization Algorithm (QAOA). For the largest circuits simulated, we reach 54 qubits at 4 QAOA layers, approximately implementing 324 RZZ gates and 216 RX gates without requiring large-scale computational resources. For larger systems, our approach can be used to provide accurate QAOA simulations at previously unexplored parameter values and to benchmark the next generation of experiments in the Noisy Intermediate-Scale Quantum (NISQ) era.

Introduction

The past decade has seen a fast development of quantum technologies and the achievement of an unprecedented level of control in quantum hardware¹, clearing the way for demonstrations of quantum computing applications for practical uses. However, near-term applications face some of the limitations intrinsic to the current generation of quantum computers, often referred to as Noisy Intermediate-Scale Quantum (NISQ) hardware². In this regime, a limited qubit count and absence of quantum error correction constrain the kind of applications that can be successfully realized. Despite these limitations, hybrid classical-quantum algorithms^3,4,5,6 have been identified as the ideal candidates to assess the first possible advantage of quantum computing in practical applications^7,8,9,10.

The Quantum Approximate Optimization Algorithm (QAOA)⁵ is a notable example of a variational quantum algorithm with prospects of quantum speedup on near-term devices. Devised to take advantage of quantum effects to solve combinatorial optimization problems, it has been extensively characterized theoretically^11,12,13,14,15,16 and also experimentally realized on state-of-the-art NISQ hardware¹⁷. While the general presence of quantum advantage in quantum optimization algorithms remains an open question^18,19,20,21, QAOA has gained popularity as a quantum hardware benchmark^22,23,24,25. As its desired output is essentially a classical state, the question arises whether a specialized classical algorithm can efficiently simulate it²⁶, at least near the variational optimum. In this paper, we use a variational parametrization of the many-qubit state based on Neural Network Quantum States (NQS)²⁷ and extend the method of ref. ²⁸ to simulate QAOA. This approach trades exact, brute-force, exponentially scaling classical simulation for an approximate, yet accurate, classical variational description of the quantum circuit. In turn, we obtain a heuristic classical method that can significantly expand the possibilities to simulate NISQ-era quantum optimization algorithms. We successfully simulate the Max-Cut QAOA circuit^5,11,17 for 54 qubits at depth p = 4 and use the method to perform a variational parameter sweep on a 1D cut of the parameter space. The method is contrasted with state-of-the-art classical simulations based on low-rank Clifford group decompositions²⁶, whose complexity is exponential in the number of non-Clifford gates, as well as with tensor-based approaches²⁹. Limitations of the approach are discussed in terms of the QAOA parameter space and its relation to different initializations of the stochastic optimization method used in this work.

Results

The Quantum Approximate Optimization Algorithm

The Quantum Approximate Optimization Algorithm (QAOA) is a variational quantum algorithm for approximately solving discrete combinatorial optimization problems. Since its inception in the seminal work of Farhi, Goldstone, and Gutmann^5,12, QAOA has been applied to Maximum Cut (Max-Cut) problems. With competing classical algorithms³⁰ offering exact performance bounds for all graphs, an open question remains—can QAOA perform better by increasing the number of free parameters?

In this work, we study a quadratic cost function^31,32 associated with a Max-Cut problem. If we consider a graph G = (V, E) with edges E and vertices V, the Max-Cut of the graph G is defined by the following operator:

$${\mathcal{C}}=\mathop{\sum}\limits_{i,j\in E}{w}_{ij}{Z}_{i}{Z}_{j}\!,$$

(1)

where w_ij are the edge weights and Z_i are Pauli operators. The classical bitstring ${\mathcal{B}}$ that minimizes $\left\langle {\mathcal{B}}\right|{\mathcal{C}}\left|{\mathcal{B}}\right\rangle$ is the graph partition with the maximum cut. QAOA approximates such a quantum state through a quantum circuit of predefined depth p:

$$\left|{\boldsymbol{\gamma }},{\boldsymbol{\beta }}\right\rangle ={U}_{B}({\beta }_{p}){U}_{C}({\gamma }_{p})\cdots {U}_{B}({\beta }_{1}){U}_{C}({\gamma }_{1})\left|+\right\rangle \!,$$

(2)

where $\left|+\right\rangle$ is a symmetric superposition of all computational basis states: $\left|+\right\rangle ={H}^{\otimes N}{\left|0\right\rangle }^{\otimes N}$ for N qubits. The set of 2p real numbers γ_i and β_i for i = 1…p define the variational parameters to be optimized over by an external classical optimizer. The unitary gates defining the parametrized quantum circuit read ${U}_{B}(\beta )={\prod }_{i\in V}{e}^{-i\beta {X}_{i}}$ and ${U}_{C}(\gamma )={e}^{-i\gamma {\mathcal{C}}}$.

Optimal variational parameters γ and β are then found through an outer-loop classical optimizer of the following quantum expectation value:

$$C({\boldsymbol{\gamma }},{\boldsymbol{\beta }})=\left\langle {\boldsymbol{\gamma }},{\boldsymbol{\beta }}\right|{\mathcal{C}}\left|{\boldsymbol{\gamma }},{\boldsymbol{\beta }}\right\rangle$$

(3)
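To make Eqs. (1)-(3) concrete, the following is a minimal brute-force state-vector sketch, not the variational method of this paper. It assumes a hypothetical toy instance (the complete graph K4, which is 3-regular, with unit weights); the names `cost`, `apply_mixer`, `qaoa_state` and `qaoa_cost` are illustrative choices.

```python
import cmath
import math

# Hypothetical toy instance: the complete graph K4 (3-regular, unit weights).
N = 4
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def bit(idx, q):
    """Value B_q of qubit q in the basis state labelled by the integer idx."""
    return (idx >> q) & 1

def cost(idx, edges=EDGES):
    """Diagonal element <B|C|B> of Eq. (1), using Z_i|B> = (-1)^{B_i}|B>."""
    return sum((-1) ** (bit(idx, i) ^ bit(idx, j)) for i, j in edges)

def apply_mixer(state, beta, n=N):
    """Apply U_B(beta) = prod_i exp(-i beta X_i) to a dense state vector."""
    c, s = math.cos(beta), -1j * math.sin(beta)
    for q in range(n):
        mask = 1 << q
        # exp(-i beta X) = cos(beta) 1 - i sin(beta) X on qubit q
        state = [c * state[idx] + s * state[idx ^ mask]
                 for idx in range(len(state))]
    return state

def qaoa_state(gammas, betas, n=N, edges=EDGES):
    """Build |gamma, beta> of Eq. (2), starting from the uniform |+> state."""
    dim = 2 ** n
    state = [1 / math.sqrt(dim)] * dim
    for gamma, beta in zip(gammas, betas):
        # U_C(gamma) is diagonal: each amplitude picks up exp(-i gamma C(B)).
        state = [amp * cmath.exp(-1j * gamma * cost(idx, edges))
                 for idx, amp in enumerate(state)]
        state = apply_mixer(state, beta, n)
    return state

def qaoa_cost(gammas, betas, n=N, edges=EDGES):
    """Expectation value C(gamma, beta) of Eq. (3)."""
    state = qaoa_state(gammas, betas, n, edges)
    return sum(abs(amp) ** 2 * cost(idx, edges)
               for idx, amp in enumerate(state))

# Classical optimum by exhaustive search, feasible only for tiny graphs.
best_classical = min(cost(idx) for idx in range(2 ** N))
```

The memory cost of this direct approach grows as 2^N, which is what the variational representation introduced below is meant to avoid.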

It is known that, for QAOA cost operators of the general form ${\mathcal{C}}={\sum }_{k}{{\mathcal{C}}}_{k}({Z}_{1},\ldots ,{Z}_{N})$, the optimal value asymptotically converges to the minimum value:

$$\mathop{\mathrm{lim}}\limits_{p\to \infty }{C}_{p}=\mathop{\min }\limits_{{\mathcal{B}}}\left\langle {\mathcal{B}}\right|{\mathcal{C}}\left|{\mathcal{B}}\right\rangle$$

(4)

where C_p is the optimal cost value at QAOA depth p and ${\mathcal{B}}$ are classical bit strings. With modern simulations and implementations still being restricted to lower p-values, it is unclear how large p has to get in practice before QAOA becomes comparable with its classical competition.

In this work we consider 3-regular graphs with all weights w_ij set to unity at QAOA depths of p = 1, 2, 4.

Classical variational simulation

Consider a quantum system consisting of N qubits. The Hilbert space is spanned by the computational basis $\{\left|{\mathcal{B}}\right\rangle :{\mathcal{B}}\in {\{0,1\}}^{N}\}$ of classical bit strings ${\mathcal{B}}=({B}_{1},\ldots ,{B}_{N})$. A general state can be expanded in this basis as $\left|\psi \right\rangle ={\sum }_{{\mathcal{B}}}\psi ({\mathcal{B}})\left|{\mathcal{B}}\right\rangle$. The convention ${Z}_{i}\left|{\mathcal{B}}\right\rangle ={(-1)}^{{B}_{i}}\left|{\mathcal{B}}\right\rangle$ is adopted. In order to perform approximate classical simulations of the QAOA quantum circuit, we use a neural-network representation of the many-body wavefunction $\psi ({\mathcal{B}})$ associated with this system, and specifically adopt a shallow network of the Restricted Boltzmann Machine (RBM) type^33,34,35:

$$\begin{array}{lll}&&\psi ({\mathcal{B}})\approx {\psi }_{\theta }({\mathcal{B}})\equiv \exp \left(\mathop{\sum }\limits_{j=1}^{N}{a}_{j}{B}_{j}\right) \cdot \mathop{\prod }\limits_{k=1}^{{N}_{\text{h}}}\left[1+\exp \left({b}_{k}+\mathop{\sum }\limits_{j=1}^{N}{W}_{jk}{B}_{j}\right)\right].\end{array}$$

(5)

The RBM provides a classical variational representation of the quantum state^27,36. It is parametrized by a set of complex parameters θ = {a, b, W}—visible biases a = (a₁, …, a_N), hidden biases ${\bf{b}}=({b}_{1},\ldots ,{b}_{{N}_{\text{h}}})$ and weights W = (W_j,k: j = 1…N, k = 1…N_h). The complex-valued ansatz given in Eq. (5) is, in general, not normalized.
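As an illustration of Eq. (5), a minimal sketch of how an unnormalized RBM amplitude could be evaluated for a single bit string is given below. The function name `rbm_amplitude` and the nested-list layout of `W` are assumptions made for this example only.

```python
import cmath
from itertools import product

def rbm_amplitude(bits, a, b, W):
    """Unnormalized RBM amplitude psi_theta(B) of Eq. (5).

    bits: sequence of N values in {0, 1}
    a: N visible biases, b: N_h hidden biases, W[j][k]: N x N_h weights
    (all entries may be complex).
    """
    n = len(bits)
    visible = cmath.exp(sum(a[j] * bits[j] for j in range(n)))
    hidden = 1.0
    for k in range(len(b)):
        theta_k = b[k] + sum(W[j][k] * bits[j] for j in range(n))
        hidden *= 1 + cmath.exp(theta_k)
    return visible * hidden

# With every parameter set to zero, each hidden unit contributes a factor 2,
# so the amplitude is the same constant (2 ** N_h) for all bit strings.
N, N_h = 3, 2
a0 = [0j] * N
b0 = [0j] * N_h
W0 = [[0j] * N_h for _ in range(N)]
amps = [rbm_amplitude(B, a0, b0, W0) for B in product((0, 1), repeat=N)]
```

Evaluating one amplitude costs O(N N_h) operations, so individual amplitudes remain cheap even when the full state vector is out of reach.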

We note that the N-qubit $\left|+\right\rangle$ state required for initializing QAOA can always be exactly implemented by setting all variational parameters to 0. That choice ensures that the wavefunction ansatz given in Eq. (5) is constant across all computational basis states, as required. The advantage of using the ansatz given in Eq. (5) as an N-qubit state is that a subset of one- and two-qubit gates can be exactly implemented as mappings between different sets of variational parameters $\theta \mapsto \theta ^{\prime}$. In general, such mapping corresponding to an abstract gate ${\mathcal{G}}$ is found as the solution of the following nonlinear equation:

$$\langle {\mathcal{B}}| {\psi }_{\theta ^{\prime} }\rangle =C\left\langle {\mathcal{B}}\right|{\mathcal{G}}\left|{\psi }_{\theta }\right\rangle,$$

(6)

for all bit strings ${\mathcal{B}}$ and any constant C, if a solution exists. For example, consider the Pauli Z gate acting on qubit i. In that case, Eq. (6) reads ${e}^{{a}_{i}^{\prime}{B}_{i}}=C{(-1)}^{{B}_{i}}{e}^{{a}_{i}{B}_{i}}$ after trivial simplification. The solution is ${a}_{i}^{\prime}={a}_{i}+i\pi$ for C = 1, with all other parameters remaining unchanged. In addition, one can exactly implement a subset of two-qubit gates by introducing an additional hidden unit coupled only to the two qubits in question. Labeling the new unit by c, we can implement the RZZ gate relevant for QAOA. The gate is given as $RZZ(\phi )={e}^{-i\frac{\phi }{2}{Z}_{i}{Z}_{j}}\propto \,\text{diag}\,(1,{e}^{i\phi },{e}^{i\phi },1)$ up to a global phase. The replacement rules read:

$$\begin{array}{lll}&&{W}_{ic}=-2{\mathcal{A}}(\phi ), \quad {W}_{jc}=2{\mathcal{A}}(\phi )\\ &&{a}_{i}\to {a}_{i}+{\mathcal{A}}(\phi ),\quad {a}_{j}\to {a}_{j}-{\mathcal{A}}(\phi ),\end{array}$$

(7)

where ${\mathcal{A}}(\phi )$ = Arccosh $\left({e}^{i\phi }\right)$ and C = 2. Derivations of replacement rules for these and other common one- and two-qubit gates can be found in the Methods section.
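The replacement rules of Eq. (7) can be checked numerically on the four values of (B_i, B_j). The sketch below assumes the new hidden unit c has zero bias and verifies that the extra factor acquired by the wavefunction equals C = 2 times the corresponding entry of diag(1, e^{iϕ}, e^{iϕ}, 1). The helper names are illustrative.

```python
import cmath

def rzz_factor(b_i, b_j, phi):
    """Factor multiplying psi_theta(B) after the update of Eq. (7).

    The new hidden unit c has zero bias and couples only to qubits i and j.
    """
    A = cmath.acosh(cmath.exp(1j * phi))
    w_ic, w_jc = -2 * A, 2 * A      # weights of the new hidden unit
    da_i, da_j = A, -A              # shifts of the two visible biases
    visible = cmath.exp(da_i * b_i + da_j * b_j)
    hidden = 1 + cmath.exp(w_ic * b_i + w_jc * b_j)
    return visible * hidden

def rzz_target(b_i, b_j, phi, C=2):
    """C times the diagonal entry of diag(1, e^{i phi}, e^{i phi}, 1)."""
    return C * cmath.exp(1j * phi * (b_i ^ b_j))

def max_rule_error(phi):
    """Largest deviation between Eq. (7) and the target over all (B_i, B_j)."""
    return max(abs(rzz_factor(bi, bj, phi) - rzz_target(bi, bj, phi))
               for bi in (0, 1) for bj in (0, 1))

errors = [max_rule_error(phi) for phi in (0.0, 0.3, 0.7, 1.9, 3.0)]
```

The check works because e^{A} + e^{-A} = 2 cosh(A) = 2 e^{iϕ} when the two bits differ, while the factor reduces to 2 when they agree.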

Not all gates can be applied through solving Eq. (6). Most notably, gates that form superpositions belong in this category, including ${U}_{B}(\beta )={\prod }_{i}{e}^{-i\beta {X}_{i}}$ required for running QAOA. This happens simply because a linear combination of two or more RBMs cannot be exactly represented by a single new RBM through a simple variational parameter change. To simulate those gates, we employ a variational stochastic optimization scheme.

We take ${\mathcal{D}}(\phi ,\psi )=1-F(\phi ,\psi )$ as a measure of distance between two arbitrary quantum states $\left|\phi \right\rangle$ and $\left|\psi \right\rangle$, where F(ϕ, ψ) is the usual quantum fidelity:

$$F(\phi ,\psi )=\frac{| \langle \phi | \psi \rangle {| }^{2}}{\langle \phi | \phi \rangle \langle \psi | \psi \rangle}.$$

(8)

In order to find variational parameters θ, which approximate a target state $\left|\phi \right\rangle$ well ($\left|{\psi }_{\theta }\right\rangle \approx \left|\phi \right\rangle$, up to a normalization constant), we minimize ${\mathcal{D}}({\psi }_{\theta },\phi )$ using a gradient-based optimizer. In this work we use the Stochastic Reconfiguration (SR)^37,38,39 algorithm to achieve that goal.

For larger p, the extra hidden units introduced when applying U_C(γ) at each layer can produce a large number of parameters to optimize over that are not strictly required for an accurate approximation of the output state. To keep the parameter count in check, we insert a model compression step that halves the number of hidden units immediately after the application of U_C has doubled it. Specifically, we create an RBM with fewer hidden units and fit it to the output distribution of the larger RBM (the output of U_C). The exact circuit placement of the compression steps is shown in Fig. 1, and details are provided in Methods. As a result of the compression step, we are able to keep the number of hidden units in our RBM ansatz constant, explicitly controlling the variational parameter count.

Simulation results for 20 qubits

In this section we present our simulation results for Max-Cut QAOA on random regular graphs of order N^40,41,42. In addition, we discuss model limitations and its relation to current state-of-the-art simulations.

QAOA angles γ, β are required as an input of our RBM-based simulator. At p = 1, we base our parameter choices on the position of global optimum that can be computed exactly (see Supplementary Note 1). For p > 1, we resort to direct numerical evaluation of the cost function as given in Eq. (1) from either the complete state vector of the system (number of qubits permitting) or from importance-sampling the output state as represented by a RBM. For all p, we find the optimal angles using Adam⁴³ with either exact gradients or their finite-difference approximations.

We begin by studying the performance of our approach on a 20-qubit system corresponding to the Max-Cut problem on a 3-regular graph of order N = 20. In that case, access to exact numerical wavefunctions is not yet severely restricted by the number of qubits. That makes it a suitable test-case. The results can be found in Fig. 2.

**Fig. 2: Benchmarking the cost function for 20 qubits.**

In Fig. 2, we present the cost function for several values of QAOA angles, as computed by the RBM-based simulator. Each panel shows cost functions from one typical random 3-regular graph instance. We observe that cost landscapes, optimal angles and algorithm performance do not change appreciably between different random graph instances. We can see that our approach reproduces variations in the cost landscape associated with different choices of QAOA angles at both p = 1 and p = 2. At p = 1, an exact formula (see Supplementary Note 1) is available for comparison of cost function values. We report that, at optimal angles, the overall final fidelity (overlap squared) is consistently above 94% for all random graph instances we simulate.

In addition to cost function values, we also benchmark our RBM-based approach by computing fidelities between our variational states and exact simulations. In Fig. 3 we show the dependence of fidelity on the number of qubits and circuit depth p. While, in general, it is hard to analytically predict the behavior of these fidelities, we nonetheless remark that with relatively small NQS we can already achieve fidelities in excess of 92% for all system sizes considered for exact benchmarks.

**Fig. 3: Benchmarking with exact fidelities.**

Simulation results for 54 qubits

Our approach can be readily extended to system sizes that are not easily amenable to exact classical simulation. To show this, in Fig. 4 we present the case of N = 54 qubits. This number of qubits corresponds, for example, to that of Google’s Sycamore processor, although our approach shares no other implementation details with that specific platform. For the system of N = 54 qubits, we closely reproduce the exact error curve (see Supplementary Note 1) at p = 1, implementing 81 RZZ ($e^{-i\gamma Z\otimes Z}$) gates exactly and 54 RX ($e^{-i\beta X}$) gates approximately, using the described optimization method. We also perform simulations at p = 2 and p = 4 and obtain corresponding approximate QAOA cost function values.

At p = 4, we exactly implement 324 RZZ gates and approximately implement 216 RX gates. This circuit size and depth is such that there is no available experimental or numerically exact result to compare against. The accuracy of our approach can nonetheless be quantified using intermediate variational fidelity estimates. These fidelities are exactly the cost functions (see Sec. Methods) we optimize, separately for each qubit. In Fig. 4 (panel b) we show the optimal variational fidelities (see Eq. (8)) found when approximating the action of RX gates with the RBM wavefunction. At optimal γ₄ (minimum of the p = 4 curve in Fig. 4, panel a), the lowest variational fidelity reached was above 98% for the typical random graph instance shown in Fig. 4. As noted earlier, exact final states of 54-qubit systems are intractable, so we are unable to report or estimate the full many-qubit fidelity benchmark results.

We remark that the stochastic optimization performance is sensitive to choices of QAOA angles away from the optimum (see Fig. 4 right). In general, we report that the fidelity between the RBM state (Eq. (5)) and the exact N-qubit state (Eq. (2)) decreases as one departs from the optimum by changing γ and β.

For larger values of QAOA angles, the associated optimization procedure is more difficult to perform, resulting in a lower fidelity (see the dark patch in Fig. 4, panel b). We find that optimal angles were always small enough not to be in the low-performance region. Therefore, this model is less accurate when studying QAOA states away from the variational optimum. However, even in regions with lowest fidelities, RBM-based QAOA states are able to approximate cost well, as can be seen in Figs. 2 and 4.

As an additional indication of the high quality of the variational approximation, we capture the QAOA approximation of the actual combinatorial optimum. A tight upper bound on that optimum was calculated to be C_opt = −69 for 54 qubits by directly optimizing an RBM to represent the ground state of the cost operator defined in Eq. (1).

Comparison with other methods

In modern sum-over-Cliffords/Metropolis simulators, computational complexity grows exponentially with the number of non-Clifford gates. With the RZZ gate being a non-Clifford operation, even our 20-qubit toy example, exactly implementing 60 RZZ gates at p = 2, is approaching the limit of what those simulators can do²⁶. In addition, that limit is greatly exceeded by the larger, 54-qubit system we study, which implements 162 RZZ gates at p = 2. State-of-the-art tensor-based approaches²⁹ have been used to simulate larger circuits but are ineffective in the case of nonplanar graphs.
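The gate counts quoted in this article follow from the fact that a 3-regular graph on N vertices has 3N/2 edges, with one RZZ gate per edge and one RX gate per qubit in every QAOA layer. A small sketch of this arithmetic, with the hypothetical helper name `qaoa_gate_counts`:

```python
def qaoa_gate_counts(n_qubits, depth, degree=3):
    """Number of RZZ and RX gates in a Max-Cut QAOA circuit on a regular graph."""
    if (n_qubits * degree) % 2:
        raise ValueError("n_qubits * degree must be even for a regular graph")
    n_edges = n_qubits * degree // 2  # each edge is shared by two vertices
    return {"RZZ": depth * n_edges, "RX": depth * n_qubits}

counts_20_p2 = qaoa_gate_counts(20, 2)   # 20-qubit example at p = 2
counts_54_p1 = qaoa_gate_counts(54, 1)
counts_54_p2 = qaoa_gate_counts(54, 2)
counts_54_p4 = qaoa_gate_counts(54, 4)
```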

Another very important tensor-based method is the Matrix Product State (MPS) variational representation of the many-qubit state. This is a low-entanglement representation of quantum states, whose accuracy is controlled by the so-called bond dimension. Routinely adopted to simulate ground states of one-dimensional systems with high accuracy^44,45,46, extensions of this approach to simulate challenging circuits have also been recently put forward⁴⁷. In Fig. 5, our approach is compared with an MPS ansatz. We establish that for small systems, MPS provides reliable results with relatively small bond dimensions. For larger systems, however, our approach significantly outperforms MPS-based circuit simulation methods both in terms of memory requirements (fewer parameters) and overall runtime. This is to be expected given the entanglement capacity of MPS wavefunctions, which are not specifically optimized to handle interaction graphs that are not one-dimensional, as is the case here.

**Fig. 5: Comparison with Matrix Product States.**

For a more direct comparison, we estimate the MPS bond dimension required for reaching RBM performance at p = 2 and 54 qubits to be ~10⁴ (see Fig. 5), amounting to ~10¹⁰ complex parameters (≈160 GB of storage) while our RBM approach uses ≈4500 parameters (≈70 kB of storage). In addition, we expect the MPS number of parameters to grow with depth p because of additional entanglement, while RBM sizes heuristically scale weakly (constant in our simulations) with p and can be controlled mid-simulation using our compression step. It should be noted that the output MPS bond dimension depends on the specific implementation of the MPS simulator, namely, qubit ordering and the number of “swap” gates applied to correct for the nonplanar nature of the underlying graph, and that more efficient implementations might be found. However, determining the optimal implementation is itself a difficult problem and, given the entanglement of a generic circuit we simulate, it would likely produce a model with orders of magnitude more parameters than an RBM-based approach.
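The orders of magnitude quoted above can be reproduced with simple arithmetic. The sketch below rests on three assumptions that are not stated in the text: double-precision complex numbers (16 bytes each), an RBM with one hidden unit per graph edge (81 for a 3-regular graph on 54 vertices), and an MPS with the same bond dimension on every site, giving N·d·χ² parameters.

```python
BYTES_PER_COMPLEX = 16  # assumption: double-precision complex numbers

def rbm_parameter_count(n_qubits, n_hidden):
    """Visible biases + hidden biases + weights of the ansatz in Eq. (5)."""
    return n_qubits + n_hidden + n_qubits * n_hidden

def mps_parameter_count(n_qubits, bond_dim, phys_dim=2):
    """Rough estimate assuming the same bond dimension on every site."""
    return n_qubits * phys_dim * bond_dim ** 2

# Assumption: one hidden unit per edge of a 3-regular graph on 54 vertices.
rbm_params = rbm_parameter_count(54, 81)
mps_params = mps_parameter_count(54, 10 ** 4)

rbm_kilobytes = rbm_params * BYTES_PER_COMPLEX / 1e3
mps_gigabytes = mps_params * BYTES_PER_COMPLEX / 1e9
```

Under these assumptions the RBM has 4509 parameters (about 72 kB) and the MPS roughly 10¹⁰ parameters (of order 100 GB), consistent with the figures in the text up to rounding.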

Discussion

In this work, we introduce a classical variational method for simulating QAOA, a hybrid quantum-classical approach for solving combinatorial optimizations with prospects of quantum speedup on near-term devices. We employ a self-contained approximate simulator based on NQS methods borrowed from many-body quantum physics, departing from the traditional exact simulations of this class of quantum circuits.

We successfully explore previously unreachable regions in the QAOA parameter space, owing to good performance of our method near optimal QAOA angles. Model limitations are discussed in terms of lower fidelities in quantum state reproduction away from said optimum. Because of this different area of applicability and its relatively low computational cost, the method is introduced as complementary to established numerical methods for the classical simulation of quantum circuits.

Classical variational simulations of quantum algorithms provide a natural way to both benchmark and understand the limitations of near-future quantum hardware. On the algorithmic side, our approach can help answer a fundamentally open question in the field, namely whether QAOA can outperform classical optimization algorithms or quantum-inspired classical algorithms based on artificial neural networks^48,49,50.

Methods

Exact application of one-qubit Pauli gates

As mentioned in the main text, some one-qubit gates can be applied exactly to the RBM ansatz given in Eq. (5). Here we discuss the specific case of Pauli gates. Parameter replacement rules we use to directly apply one-qubit gates can be obtained by solving Eq. (6) given in the main text. Consider for example the Pauli X_i or NOT_i gate acting on qubit i. It can be applied by satisfying the following system of equations:

$$\begin{array}{lll}&&{\mathrm{ln}}\,C+{a}_{i}^{\prime}{B}_{i}=(1-{B}_{i}){a}_{i}\\ &&{b}_{k}^{\prime}+{B}_{i}{W}_{ik}^{\prime}={b}_{k}+(1-{B}_{i}){W}_{ik}.\end{array}$$

(9)

for B_i = 0, 1. The solution is:

$$\begin{array}{lll}&&{\mathrm{ln}}\,C={a}_{i};\quad {a}_{i}^{\prime}=-{a}_{i};\\ &&{b}_{k}^{\prime}={b}_{k}+{W}_{ik};\quad {W}_{ik}^{\prime}=-{W}_{ik},\end{array}$$

(10)

with all other parameters remaining unchanged.
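The solution in Eq. (10) can be verified by brute force on a small RBM. The sketch below uses arbitrary (hypothetical) complex parameters and checks, in the convention of Eq. (9) where the constant C multiplies the updated amplitude, that C ψ_θ′(B) equals ψ_θ evaluated on B with bit i flipped.

```python
import cmath
from itertools import product

def rbm_amplitude(bits, a, b, W):
    """Unnormalized RBM amplitude of Eq. (5)."""
    n = len(bits)
    out = cmath.exp(sum(a[j] * bits[j] for j in range(n)))
    for k in range(len(b)):
        out *= 1 + cmath.exp(b[k] + sum(W[j][k] * bits[j] for j in range(n)))
    return out

def apply_pauli_x(i, a, b, W):
    """Return (a', b', W', C) from Eq. (10) for a Pauli X gate on qubit i."""
    a_new = list(a)
    a_new[i] = -a[i]
    b_new = [b[k] + W[i][k] for k in range(len(b))]
    W_new = [list(row) for row in W]
    W_new[i] = [-w for w in W[i]]
    return a_new, b_new, W_new, cmath.exp(a[i])

# Arbitrary test parameters (hypothetical values): N = 3 qubits, N_h = 2.
a = [0.2 + 0.1j, -0.3j, 0.5]
b = [0.1, -0.4 + 0.2j]
W = [[0.3, -0.2j], [0.1 + 0.1j, 0.4], [-0.5, 0.25j]]

i = 1
a2, b2, W2, C = apply_pauli_x(i, a, b, W)
errors = []
for B in product((0, 1), repeat=3):
    flipped = tuple(x ^ 1 if j == i else x for j, x in enumerate(B))
    lhs = C * rbm_amplitude(B, a2, b2, W2)
    rhs = rbm_amplitude(flipped, a, b, W)   # <B|X_i|psi_theta>
    errors.append(abs(lhs - rhs))
```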

A similar solution can be found for the Pauli Y gate:

$$\begin{array}{lll}&&{\mathrm{ln}}\,C={a}_{i}+\frac{i\pi }{2};\quad {a}_{i}^{\prime}=-{a}_{i}+i\pi ;\quad \\ &&{b}_{k}^{\prime}={b}_{k}+{W}_{ik} ; \quad {W}_{ik}^{\prime}=-{W}_{ik} ,\end{array}$$

(11)

with all other parameters remaining unchanged as well.

For the Pauli Z gate, as described in the main text, one needs to solve ${e}^{{a}_{i}^{\prime}{B}_{i}}={(-1)}^{{B}_{i}}{e}^{{a}_{i}{B}_{i}}$. The solution is simply

$${a}_{i}^{\prime}={a}_{i}+i\pi.$$

(12)

More generally, it is possible to apply exactly an arbitrary Z rotation gate, as given in matrix form as:

$$RZ(\varphi )={e}^{-i\frac{\varphi }{2}Z}\propto \left(\begin{array}{ll}1&0\\ 0&{e}^{i\varphi }\end{array}\right)$$

(13)

where the proportionality is up to a global phase factor. Similar to the Pauli Z_i gate, this gate can be implemented on qubit i by solving ${e}^{{a}_{i}^{\prime}{B}_{i}}={e}^{i\varphi {B}_{i}}{e}^{{a}_{i}{B}_{i}}$. The solution is simply:

$${a}_{i}^{\prime}={a}_{i}+i\varphi,$$

(14)

with all other parameters besides a_i remaining unchanged. This expression reduces to the Pauli Z gate replacement rules for φ = π as required.
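A similar brute-force check applies to Eq. (14): shifting a single visible bias by iφ multiplies every amplitude by e^{iφB_i}, and φ = π reproduces the Pauli Z rule. The parameter values below are arbitrary and purely illustrative.

```python
import cmath
from itertools import product

def rbm_amplitude(bits, a, b, W):
    """Unnormalized RBM amplitude of Eq. (5)."""
    n = len(bits)
    out = cmath.exp(sum(a[j] * bits[j] for j in range(n)))
    for k in range(len(b)):
        out *= 1 + cmath.exp(b[k] + sum(W[j][k] * bits[j] for j in range(n)))
    return out

def apply_rz(i, varphi, a):
    """Eq. (14): only the visible bias of qubit i changes."""
    a_new = list(a)
    a_new[i] = a[i] + 1j * varphi
    return a_new

# Arbitrary test parameters (hypothetical values): N = 3 qubits, N_h = 2.
a = [0.2 + 0.1j, -0.3j, 0.5]
b = [0.1, -0.4 + 0.2j]
W = [[0.3, -0.2j], [0.1 + 0.1j, 0.4], [-0.5, 0.25j]]
basis = list(product((0, 1), repeat=3))

i, varphi = 2, 0.9
a_rz = apply_rz(i, varphi, a)
rz_error = max(
    abs(rbm_amplitude(B, a_rz, b, W)
        - cmath.exp(1j * varphi * B[i]) * rbm_amplitude(B, a, b, W))
    for B in basis)

# varphi = pi reduces to the Pauli Z rule of Eq. (12).
a_z = apply_rz(i, cmath.pi, a)
z_error = max(
    abs(rbm_amplitude(B, a_z, b, W) - (-1) ** B[i] * rbm_amplitude(B, a, b, W))
    for B in basis)
```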

Exact application of two-qubit gates

We apply two-qubit gates between qubits k and l by adding an additional hidden unit (labeled by c) to the RBM before solving Eq. (6) from the main text. The extra hidden unit couples only to the qubits in question, leaving all previously existing parameters unchanged. In that special case, the equation reduces to

$${e}^{{{\Delta }}{a}_{k}{B}_{k}+{{\Delta }}{a}_{l}{B}_{l}}\left(1+{e}^{{W}_{kc}{B}_{k}+{W}_{lc}{B}_{l}}\right){\psi }_{\theta }({\mathcal{B}})=C\left\langle {\mathcal{B}}\right|{\mathcal{G}}\left|{\psi }_{\theta }\right\rangle.$$

(15)

An important class of two-qubit gates that we can apply exactly is ZZ rotations. The RZZ gate is key to implementing the first step of the QAOA algorithm. It is defined as:

$$RZZ(\varphi )={e}^{-i\frac{\varphi }{2}Z\otimes Z}\propto \left(\begin{array}{*{20}{l}}1&0&0&0\\ 0&{e}^{i\varphi }&0&0\\ 0&0&{e}^{i\varphi }&0\\ 0&0&0&1\end{array}\right)\ ,$$

(16)

where the proportionality factor is again a global phase. The related matrix element for a RZZ_kl gate between qubits k and l is $\left\langle {B}_{k}^{\prime}{B}_{l}^{\prime}\right|RZ{Z}_{kl}(\varphi )\left|{B}_{k}{B}_{l}\right\rangle ={e}^{i\varphi {B}_{k}\oplus {B}_{l}}$ where ⊕ stands for the classical exclusive or (XOR) operation. Then, one solution to Eq. (15) reads:

$$\begin{array}{lll}&&{W}_{kc}=-2{\mathcal{A}}(\varphi );\quad {W}_{lc}=2{\mathcal{A}}(\varphi )\\ &&{a}_{k}^{\prime}={a}_{k}+{\mathcal{A}}(\varphi );\quad {a}_{l}^{\prime}={a}_{l}-{\mathcal{A}}(\varphi ),\end{array}$$

(17)

where ${\mathcal{A}}(\varphi )$ = Arccosh $\left({e}^{i\varphi }\right)$ and C = 2.
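Applying the rule of Eq. (17) once per edge yields an exact RBM representation of U_C(γ)|+⟩. The sketch below builds such an RBM for a hypothetical toy graph (K4 with unit weights) and compares amplitude ratios with the exact phases e^{−iγC(B)}. It uses φ = 2γ, since e^{−iγZ⊗Z} = RZZ(2γ) in the convention of Eq. (16); the function names are illustrative.

```python
import cmath
from itertools import product

# Hypothetical toy instance: the complete graph K4 with unit weights.
N = 4
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def cost(bits, edges=EDGES):
    """Diagonal element <B|C|B> of the Max-Cut cost operator."""
    return sum((-1) ** (bits[i] ^ bits[j]) for i, j in edges)

def uc_on_plus_as_rbm(gamma, n=N, edges=EDGES):
    """RBM parameters (a, W) for U_C(gamma)|+>, one hidden unit per edge.

    All hidden biases are zero; each edge applies Eq. (17) with
    varphi = 2 * gamma.
    """
    A = cmath.acosh(cmath.exp(2j * gamma))
    a = [0j] * n
    W = [[0j] * len(edges) for _ in range(n)]
    for c, (k, l) in enumerate(edges):
        W[k][c], W[l][c] = -2 * A, 2 * A
        a[k] += A
        a[l] -= A
    return a, W

def amplitude(bits, a, W):
    """RBM amplitude of Eq. (5) with all hidden biases equal to zero."""
    n = len(bits)
    out = cmath.exp(sum(a[j] * bits[j] for j in range(n)))
    for c in range(len(W[0])):
        out *= 1 + cmath.exp(sum(W[j][c] * bits[j] for j in range(n)))
    return out

gamma = 0.3
a, W = uc_on_plus_as_rbm(gamma)
zero = (0,) * N
ref = amplitude(zero, a, W)
ratio_errors = [
    abs(amplitude(B, a, W) / ref
        - cmath.exp(-1j * gamma * (cost(B) - cost(zero))))
    for B in product((0, 1), repeat=N)
]
```

Comparing ratios rather than raw amplitudes removes the arbitrary constant C and the global phase, neither of which affects the represented state.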

Approximate gate application

Here we provide model details and show how to approximately apply quantum gates that cannot be implemented through methods described in sec. Exact application of one-qubit Pauli gates. In this work we use the Stochastic Reconfiguration (SR)³⁷ algorithm to approximately apply quantum gates to the RBM ansatz. To that end, we write the “infidelity” between our RBM ansatz and the target state ϕ, ${\mathcal{D}}({\psi }_{\theta },\phi )=1-F({\psi }_{\theta },\phi )$, as an expectation value of an effective hamiltonian operator ${H}_{\,\text{eff}\,}^{\phi }$:

$${\mathcal{D}}({\psi }_{\theta },\phi )=\frac{\left\langle {\psi }_{\theta }\right|{H}_{\,\text{eff}\,}^{\phi }\left|{\psi }_{\theta }\right\rangle }{\langle {\psi }_{\theta }| {\psi }_{\theta }\rangle }\ \to \ {H}_{\,\text{eff}\,}^{\phi }={\mathbb{1}}-\frac{\left|\phi \right\rangle \left\langle \phi \right|}{\langle \phi | \phi \rangle }$$

(18)

We call the hermitian operator given in Eq. (18) a “hamiltonian” only because the target quantum state $\left|\phi \right\rangle$ is encoded into it as the eigenstate corresponding to the smallest eigenvalue. Our optimization scheme focuses on finding small parameter updates Δ_k that locally approximate the action of the imaginary time evolution operator associated with ${H}_{\,\text{eff}\,}^{\phi }$, thus filtering out the target state:

$$\left|{\psi }_{\theta +{{\Delta }}}\right\rangle =C\ {e}^{-\eta {H}_{\,\text{eff}\,}^{\phi }}\left|{\psi }_{\theta }\right\rangle,$$

(19)

where C is an arbitrary constant included because our variational states (Eq. (5), main text) are not normalized. Choosing both η and Δ to be small, one can expand both sides to linear order in those variables and solve the resulting linear system for all components of Δ, after eliminating C first. After some simplification, one arrives at the following parameter update at each loop iteration (indexed with t):

$${\theta }_{k}^{(t+1)}={\theta }_{k}^{(t)}-\eta \mathop{\sum}\limits_{l}{S}_{kl}^{-1}\ \frac{\partial {\mathcal{D}}}{\partial {\theta }_{l}^{* }},$$

(20)

where stochastic estimations of gradients of the cost function ${\mathcal{D}}({\psi }_{\theta },\phi )$ can be obtained through samples from ∣ψ_θ∣² at each loop iteration through:

$$\frac{\partial {\mathcal{D}}}{\partial {\theta }_{k}^{* }}={\left\langle {{\mathcal{O}}}_{k}^{\dagger }{H}_{\text{eff}}^{\phi }\right\rangle }_{{\psi }_{\theta }}-{\left\langle {{\mathcal{O}}}_{k}^{\dagger }\right\rangle }_{{\psi }_{\theta }}{\left\langle {H}_{\text{eff}}^{\phi }\right\rangle }_{{\psi }_{\theta }}.$$

(21)

Here, ${{\mathcal{O}}}_{k}$ is defined as a diagonal operator in the computational basis such that $\left\langle {\mathcal{B}}^{\prime}\right|{{\mathcal{O}}}_{k}\left|{\mathcal{B}}\right\rangle =\frac{\partial {\mathrm{ln}}\,{\psi }_{\theta }}{\partial {\theta }_{k}}\ {\delta }_{{\mathcal{B}}^{\prime} {\mathcal{B}}}$. Averages over ψ are commonly defined as ${\langle \cdot \rangle }_{\psi }\equiv \left\langle \psi \right|\cdot \left|\psi \right\rangle /\langle \psi | \psi \rangle$. Furthermore, the S-matrix appearing in Eq. (20) reads:

$${S}_{kl}={\left\langle {{\mathcal{O}}}_{k}^{\dagger }{{\mathcal{O}}}_{l}\right\rangle }_{{\psi }_{\theta }}-{\left\langle {{\mathcal{O}}}_{k}^{\dagger }\right\rangle }_{{\psi }_{\theta }}{\left\langle {{\mathcal{O}}}_{l}\right\rangle }_{{\psi }_{\theta }},$$

(22)

and corresponds to the Quantum Geometric Tensor or Quantum Fisher Information (also see ref. ⁵¹ for a detailed description and connection with the natural gradient method in classical machine learning⁵²).

Exact computations of averages over N-qubit states ψ_θ and ϕ at each optimization step range from impractical to intractable, even for moderate N. Therefore, we evaluate those averages by importance-sampling the probability distributions associated with the variational ansatz ∣ψ_θ∣² and the target state ∣ϕ∣² at each optimization step t. All of the above expectation values are evaluated using Markov Chain Monte Carlo (MCMC)^38,39 sampling with basic single-spin flip local updates. An overview of the sampling method can be found in ref. ⁵³. In order to use those techniques, we rewrite Eq. (21) as:

$$\frac{\partial {\mathcal{D}}}{\partial {\theta }_{k}^{* }}={\left\langle \frac{\phi }{{\psi }_{\theta }}\right\rangle }_{{\psi }_{\theta }}{\left\langle \frac{{\psi }_{\theta }}{\phi }\right\rangle }_{\phi }\left[{\left\langle {{\mathcal{O}}}_{k}^{* }\right\rangle }_{{\psi }_{\theta }}-\frac{{\left\langle \frac{\phi }{{\psi }_{\theta }}{{\mathcal{O}}}_{k}^{* }\right\rangle }_{{\psi }_{\theta }}}{{\left\langle \frac{\phi }{{\psi }_{\theta }}\right\rangle }_{{\psi }_{\theta }}}\right].$$

(23)

In our experiments with fewer than 20 qubits, we take 8000 MCMC samples from four independent chains (totaling 32,000 samples) for gradient evaluation. Between each two recorded samples, we take N MCMC steps (for N qubits). For the 54-qubit experiment, we take 2000 MCMC samples from four independent chains because of the increased computational difficulty of sampling. Eq. (23) is manifestly invariant under rescaling of ψ_θ and ϕ, removing the need to ever compute normalization constants. We remark that the prefactor in Eq. (23) is identically equal to the fidelity given in Eq. (8) in the main text:

$$F(\psi ,\phi )=\frac{| \langle \phi | \psi \rangle {| }^{2}}{\langle \phi | \phi \rangle \langle \psi | \psi \rangle }={\left\langle \frac{\phi }{\psi }\right\rangle }_{\psi }{\left\langle \frac{\psi }{\phi }\right\rangle }_{\phi }\ ,$$

(24)

allowing us to keep track of cost function values during optimization with no additional computational cost.
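The sampling-based fidelity estimate of Eq. (24) can be sketched as follows. This is a simplified illustration rather than the setup used in this work: it uses a single chain without burn-in, and two hypothetical three-qubit product-state wavefunctions `psi` and `phi` stand in for the RBM and the target state. The estimate is compared against the exact value from Eq. (8).

```python
import cmath
import random
from itertools import product

N = 3
A_PSI = [0.2 + 0.3j, -0.1, 0.4j]          # hypothetical toy parameters
A_PHI = [0.3 + 0.2j, -0.2, 0.1 + 0.5j]    # hypothetical toy parameters

def psi(bits):
    return cmath.exp(sum(a * x for a, x in zip(A_PSI, bits)))

def phi(bits):
    return cmath.exp(sum(a * x for a, x in zip(A_PHI, bits)))

def metropolis_samples(amp, n_qubits, n_samples, rng):
    """Sample bit strings from |amp|^2 with single-bit-flip Metropolis moves."""
    state = [rng.randint(0, 1) for _ in range(n_qubits)]
    p_old = abs(amp(tuple(state))) ** 2
    samples = []
    for _ in range(n_samples):
        for _ in range(n_qubits):          # N steps between recorded samples
            q = rng.randrange(n_qubits)
            state[q] ^= 1                  # propose flipping one bit
            p_new = abs(amp(tuple(state))) ** 2
            if p_old == 0 or rng.random() < p_new / p_old:
                p_old = p_new              # accept the move
            else:
                state[q] ^= 1              # reject: undo the flip
        samples.append(tuple(state))
    return samples

def fidelity_mcmc(psi, phi, n_qubits, n_samples, rng):
    """Estimate F = <phi/psi>_psi * <psi/phi>_phi as in Eq. (24)."""
    from_psi = metropolis_samples(psi, n_qubits, n_samples, rng)
    from_phi = metropolis_samples(phi, n_qubits, n_samples, rng)
    r1 = sum(phi(B) / psi(B) for B in from_psi) / n_samples
    r2 = sum(psi(B) / phi(B) for B in from_phi) / n_samples
    return (r1 * r2).real

def fidelity_exact(psi, phi, n_qubits):
    """Eq. (8) by full enumeration, for comparison on small systems."""
    basis = list(product((0, 1), repeat=n_qubits))
    overlap = sum(phi(B).conjugate() * psi(B) for B in basis)
    norm_psi = sum(abs(psi(B)) ** 2 for B in basis)
    norm_phi = sum(abs(phi(B)) ** 2 for B in basis)
    return abs(overlap) ** 2 / (norm_psi * norm_phi)

rng = random.Random(0)
f_est = fidelity_mcmc(psi, phi, N, 4000, rng)
f_exact = fidelity_exact(psi, phi, N)
```

Only ratios of amplitudes enter the estimator, which is why no normalization constant is ever needed.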

The second step consists of multiplying the variational derivative with the inverse of the S-matrix (Eq. (22)) corresponding to a stochastic estimation of a metric tensor on the hermitian parameter manifold. Thereby, the usual gradient is transformed into the natural gradient on that manifold. However, the S-matrix is stochastically estimated and it can happen that it is singular. To regularize it, we replace S with S + $\epsilon {\mathbb{1}}$, ensuring that the resulting linear system has a unique solution. We choose ϵ = 10⁻³ throughout. The optimization procedure is summarized in Supplementary Note 2.
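The update of Eq. (20), with the gradient of Eq. (23) and the regularized S-matrix of Eq. (22), can be sketched on a very small system where all averages are computed by exact enumeration instead of MCMC. Everything specific below is an assumption for illustration: the two-qubit RBM with one hidden unit, the initial parameters, the target amplitudes `PHI`, the step size η = 0.05 and the hand-written linear solver.

```python
import cmath
from itertools import product

N, N_H = 2, 1
BASIS = list(product((0, 1), repeat=N))

def unpack(theta):
    """Split the flat parameter list into (a, b, W) with W[j][k]."""
    a, b = theta[:N], theta[N:N + N_H]
    W = [theta[N + N_H + j * N_H: N + N_H + (j + 1) * N_H] for j in range(N)]
    return a, b, W

def hidden_angles(bits, theta):
    _, b, W = unpack(theta)
    return [b[k] + sum(W[j][k] * bits[j] for j in range(N)) for k in range(N_H)]

def amplitude(bits, theta):
    a = unpack(theta)[0]
    out = cmath.exp(sum(a[j] * bits[j] for j in range(N)))
    for x in hidden_angles(bits, theta):
        out *= 1 + cmath.exp(x)
    return out

def log_derivatives(bits, theta):
    """O_k(B) = d ln psi_theta(B) / d theta_k, ordered as (a, b, W)."""
    sig = [cmath.exp(x) / (1 + cmath.exp(x)) for x in hidden_angles(bits, theta)]
    return ([complex(x) for x in bits] + sig
            + [bits[j] * sig[k] for j in range(N) for k in range(N_H)])

def infidelity(theta, phi):
    psi = [amplitude(B, theta) for B in BASIS]
    overlap = sum(p.conjugate() * s for p, s in zip(phi, psi))
    norms = sum(abs(p) ** 2 for p in phi) * sum(abs(s) ** 2 for s in psi)
    return 1 - abs(overlap) ** 2 / norms

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[r]) + [rhs[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def sr_step(theta, phi, eta=0.05, eps=1e-3):
    """One update of Eq. (20) with exact averages over |psi_theta|^2."""
    psi = [amplitude(B, theta) for B in BASIS]
    norm = sum(abs(s) ** 2 for s in psi)
    prob = [abs(s) ** 2 / norm for s in psi]
    O = [log_derivatives(B, theta) for B in BASIS]
    ratio = [p / s for p, s in zip(phi, psi)]              # phi(B) / psi(B)
    mean_ratio = sum(w * r for w, r in zip(prob, ratio))   # <phi/psi>_psi
    fid = 1 - infidelity(theta, phi)                       # prefactor of Eq. (23)
    n_par = len(theta)
    mean_O = [sum(w * O[x][k] for x, w in enumerate(prob)) for k in range(n_par)]
    grad = [fid * (mean_O[k].conjugate()
                   - sum(w * r * O[x][k].conjugate()
                         for x, (w, r) in enumerate(zip(prob, ratio))) / mean_ratio)
            for k in range(n_par)]
    # Eq. (22) plus the regularization S -> S + eps * identity.
    S = [[sum(w * O[x][k].conjugate() * O[x][l] for x, w in enumerate(prob))
          - mean_O[k].conjugate() * mean_O[l] + (eps if k == l else 0)
          for l in range(n_par)] for k in range(n_par)]
    delta = solve(S, grad)
    return [t - eta * d for t, d in zip(theta, delta)]

PHI = [1.0 + 0j, 0.5j, -0.3 + 0.2j, 0.8 + 0j]             # hypothetical target
theta = [0.1 + 0.2j, -0.2 + 0j, 0.05j, 0.3 + 0j, -0.1 + 0.1j]
history = [infidelity(theta, PHI)]
for _ in range(100):
    theta = sr_step(theta, PHI)
    history.append(infidelity(theta, PHI))
```

Because the regularized S-matrix is Hermitian and positive definite, a sufficiently small step along −S⁻¹∂D/∂θ* lowers the infidelity to first order, which `history` makes visible.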

In order to keep the number of hidden units reasonable, we employ a compression step at each QAOA layer (after the first). Immediately after applying the U_C(γ_k) gate in layer k to the RBM ψ_θ (and thereby introducing unwanted parameters), we go through the following steps:

(1) Construct a new RBM ${\widetilde{\psi }}_{\theta }$.
(2) Initialize ${\widetilde{\psi }}_{\theta }$ to exactly represent the state ${U}_{C}\left(\frac{1}{k}{\sum }_{j\le k}{\gamma }_{j}\right)\left|+\right\rangle$. Doing this introduces half the number of hidden units that are already present in ψ_θ.
(3) Stochastically optimize ${\widetilde{\psi }}_{\theta }$ to approximate ψ_θ (using the algorithm in Supplementary Note 2) with ϕ → ψ_θ and $\psi \to {\widetilde{\psi }}_{\theta }$.

In essence, we use the optimization algorithm with the “larger” ψ_θ as the target state ϕ. The optimization results in a new RBM state with fewer hidden units that closely approximates the old RBM, with fidelity > 0.98 in all our tests. We then proceed to simulate the rest of the QAOA circuit and apply the same compression procedure whenever the number of parameters increases again. The exact schedule of applying this procedure in the context of different QAOA layers is shown in Fig. 1.

We choose the initial state for the optimization as an exactly reproducible RBM state that has non-zero overlap with the target (larger) RBM. In principle, any other such state would work, but we heuristically find this one to be a reliable choice across all values of p studied. Alternatively, one can initialize ${\widetilde{\psi }}_{\theta }$ to ${U}_{C}\left(\gamma ^{\prime} \right)\left|+\right\rangle$ with $\gamma ^{\prime} ={\text{argmax}}_{\gamma }\ F\left({\psi }_{\theta },{U}_{C}(\gamma )\left|+\right\rangle \right)$, using an efficient 1D optimizer to solve for $\gamma ^{\prime}$ before starting to optimize the full RBM.
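The two initialization choices can be compared on a toy problem. The sketch below uses dense state vectors on a 6-qubit, 3-regular graph instead of RBMs, so it illustrates only the choice of γ and not the compression itself. The graph, the angles, the conventions U_C(γ) = exp(−iγ Σ Z_i Z_j) and exp(−iβX) for the mixer, and the grid-plus-bounded-search strategy are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def cost_diagonal(n, edges):
    """Diagonal of C = sum over edges of Z_i Z_j in the computational basis."""
    bits = (np.arange(2**n)[:, None] >> np.arange(n)) & 1
    z = 1 - 2 * bits                 # Z eigenvalue: +1 for bit 0, -1 for bit 1
    return sum(z[:, i] * z[:, j] for i, j in edges)

def apply_mixer(state, beta, n):
    """Apply exp(-i beta X) to every qubit of a dense state vector."""
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    psi = state.reshape((2,) * n)
    for q in range(n):
        psi = np.moveaxis(np.tensordot(rx, psi, axes=([1], [q])), 0, q)
    return psi.reshape(-1)

def fidelity(a, b):
    """Normalization-independent fidelity between two state vectors."""
    return abs(np.vdot(a, b))**2 / (np.vdot(a, a).real * np.vdot(b, b).real)

n = 6
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (0, 3), (1, 4), (2, 5)]
c = cost_diagonal(n, edges)
plus = np.full(2**n, 2.0**(-n / 2), dtype=complex)

def u_c_plus(gamma):
    """The state U_C(gamma)|+> as a dense vector."""
    return np.exp(-1j * gamma * c) * plus

# Target: the state right after U_C(gamma_2) in layer k = 2 (hypothetical angles)
gammas, beta_1 = [0.3, 0.5], 0.2
target = np.exp(-1j * gammas[1] * c) * apply_mixer(u_c_plus(gammas[0]), beta_1, n)

# Choice 1: averaged angle, as in step (2) of the compression procedure
f_mean = fidelity(target, u_c_plus(np.mean(gammas)))

# Choice 2: gamma' = argmax_gamma F, via a coarse scan plus a bounded 1D search.
# For this unit-weight graph all eigenvalues of C are odd, so F is pi-periodic.
grid = np.linspace(0.0, np.pi, 201)
f_grid = np.array([fidelity(target, u_c_plus(g)) for g in grid])
k = int(np.argmax(f_grid))
lo, hi = grid[max(k - 1, 0)], grid[min(k + 1, len(grid) - 1)]
res = minimize_scalar(lambda g: -fidelity(target, u_c_plus(g)),
                      bounds=(lo, hi), method="bounded")
gamma_opt, f_opt = max([(grid[k], f_grid[k]), (res.x, -res.fun)],
                       key=lambda t: t[1])
print(f_mean, gamma_opt, f_opt)
```

The coarse scan guards against the bounded search settling in a poor local maximum, since the fidelity is generally multimodal in γ. In the RBM setting the fidelity would be estimated by sampling rather than computed exactly.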

Data availability

The authors declare that the data supporting the findings of this study are available within the paper.

Code availability

Python code to reproduce the results presented in this paper is available on GitHub at github.com/Matematija/QubitRBM.

References

1. Arute, F. et al. Quantum supremacy using a programmable superconducting processor. Nature 574, 505–510 (2019).
2. Preskill, J. Quantum computing in the NISQ era and beyond. Quantum 2, 79 (2018).
3. Peruzzo, A. et al. A variational eigenvalue solver on a photonic quantum processor. Nat. Commun. 5, 1–7 (2014).
4. Farhi, E. & Neven, H. Classification with quantum neural networks on near term processors. Preprint at https://arxiv.org/abs/1802.06002 (2018).
5. Farhi, E., Goldstone, J. & Gutmann, S. A Quantum Approximate Optimization Algorithm. Preprint at https://arxiv.org/abs/1411.4028 (2014).
6. Grant, E. et al. Hierarchical quantum classifiers. npj Quantum Inf. 4, 1–8 (2018).
7. Aspuru-Guzik, A., Dutoi, A. D., Love, P. J. & Head-Gordon, M. Chemistry: simulated quantum computation of molecular energies. Science 309, 1704–1707 (2005).
8. O’Malley, P. J. et al. Scalable quantum simulation of molecular energies. Phys. Rev. X 6, 031007 (2016).
9. Biamonte, J. et al. Quantum machine learning. Nature 549, 195–202 (2017).
10. Lloyd, S. Universal quantum simulators. Science 273, 1073–1078 (1996).
11. Wang, Z., Hadfield, S., Jiang, Z. & Rieffel, E. G. Quantum approximate optimization algorithm for MaxCut: a fermionic view. Phys. Rev. A 97, 022304 (2018).
12. Farhi, E., Goldstone, J. & Gutmann, S. A Quantum Approximate Optimization Algorithm applied to a bounded occurrence constraint problem. Preprint at https://arxiv.org/abs/1412.6062 (2014).
13. Lloyd, S. Quantum approximate optimization is computationally universal. Preprint at https://arxiv.org/abs/1812.11075 (2018).
14. Jiang, Z., Rieffel, E. G. & Wang, Z. Near-optimal quantum circuit for Grover’s unstructured search using a transverse field. Phys. Rev. A 95, 062317 (2017).
15. Hadfield, S. et al. From the quantum approximate optimization algorithm to a quantum alternating operator ansatz. Algorithms 12, 34 (2019).
16. Zhou, L., Wang, S. T., Choi, S., Pichler, H. & Lukin, M. D. Quantum Approximate Optimization Algorithm: performance, mechanism, and implementation on near-term devices. Phys. Rev. X 10, 021067 (2020).
17. Harrigan, M. P. et al. Quantum approximate optimization of non-planar graph problems on a planar superconducting processor. Nat. Phys. 17, 332–336 (2021).
18. Santoro, G. E., Martoňák, R., Tosatti, E. & Car, R. Theory of quantum annealing of an Ising spin glass. Science 295, 2427–2430 (2002).
19. Rønnow, T. F. et al. Defining and detecting quantum speedup. Science 345, 420–424 (2014).
20. Guerreschi, G. G. & Matsuura, A. Y. QAOA for Max-Cut requires hundreds of qubits for quantum speed-up. Sci. Rep. 9, 1–7 (2019).
21. Bravyi, S., Kliesch, A., Koenig, R. & Tang, E. Obstacles to variational quantum optimization from symmetry protection. Phys. Rev. Lett. 125, 260505 (2020).
22. Pagano, G. et al. Quantum approximate optimization of the long-range Ising model with a trapped-ion quantum simulator. Proc. Natl Acad. Sci. USA 117, 25396–25401 (2020).
23. Bengtsson, A. et al. Improved success probability with greater circuit depth for the Quantum Approximate Optimization Algorithm. Phys. Rev. Appl. 14, 034010 (2020).
24. Willsch, M., Willsch, D., Jin, F., De Raedt, H. & Michielsen, K. Benchmarking the quantum approximate optimization algorithm. Quantum Inf. Process. 19, 1–24 (2020).
25. Otterbach, J. S. et al. Unsupervised machine learning on a hybrid quantum computer. Preprint at https://arxiv.org/abs/1712.05771 (2017).
26. Bravyi, S. et al. Simulation of quantum circuits by low-rank stabilizer decompositions. Quantum 3, 181 (2019).
27. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
28. Jónsson, B., Bauer, B. & Carleo, G. Neural-network states for the classical simulation of quantum computing. Preprint at https://arxiv.org/abs/1808.05232 (2018).
29. Villalonga, B. et al. Establishing the quantum supremacy frontier with a 281 Pflop/s simulation. Quantum Sci. Technol. 5, 034003 (2020).
30. Goemans, M. X. & Williamson, D. P. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42, 1115–1145 (1995).
31. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2, 1–14 (2014).
32. Barahona, F. On the computational complexity of Ising spin glass models. J. Phys. A Math. Gen. 15, 3241–3253 (1982).
33. Hinton, G. E. Training products of experts by minimizing contrastive divergence. Neural Comput. 14, 1771–1800 (2002).
34. Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
35. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
36. Melko, R. G., Carleo, G., Carrasquilla, J. & Cirac, J. I. Restricted Boltzmann machines in quantum physics. Nat. Phys. 15, 887–892 (2019).
37. Sorella, S. Green function Monte Carlo with stochastic reconfiguration. Phys. Rev. Lett. 80, 4558–4561 (1998).
38. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953).
39. Hastings, W. K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109 (1970).
40. Steger, A. & Wormald, N. C. Generating random regular graphs quickly. Comb. Probab. Comput. 8, 377–396 (1999).
41. Kim, J. H. & Vu, V. H. Generating random regular graphs. in Proc. of the 35th annual ACM symposium on Theory of computing 213–222 (Association for Computing Machinery, 2003).
42. Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. in 7th Python Sci. Conf. (SciPy 2008), 11–15 (Pasadena, CA USA, 2008) https://networkx.org/documentation/stable/citing.html.
43. Kingma, D. P. & Ba, J. L. Adam: a method for stochastic optimization. in 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc. (San Diego, CA, USA, 2015) https://dblp.org/db/conf/iclr/iclr2015.html.
44. White, S. R. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 69, 2863–2866 (1992).
45. Vidal, G. Efficient classical simulation of slightly entangled quantum computations. Phys. Rev. Lett. 91, 147902 (2003).
46. Vidal, G. Efficient simulation of one-dimensional quantum many-body systems. Phys. Rev. Lett. 93, 040502 (2004).
47. Zhou, Y., Stoudenmire, E. M. & Waintal, X. What Limits the Simulation of Quantum Computers? Phys. Rev. X 10, 041038 (2020).
48. Gomes, J., Eastman, P., McKiernan, K. A. & Pande, V. S. Classical quantum optimization with neural network quantum states. Preprint at https://arxiv.org/abs/1910.10675 (2019).
49. Zhao, T., Carleo, G., Stokes, J. & Veerapaneni, S. Natural evolution strategies and variational Monte Carlo. Mach. Learn. Sci. Technol. 2, 2–3 (2020).
50. Hibat-Allah, M., Inack, E. M., Wiersema, R., Melko, R. G. & Carrasquilla, J. Variational Neural Annealing. Preprint at https://arxiv.org/abs/2101.10154 (2021).
51. Stokes, J., Izaac, J., Killoran, N. & Carleo, G. Quantum natural gradient. Quantum 4, 269 (2020).
52. Amari, S. I. Natural gradient works efficiently in learning. Neural Comput. 10, 251–276 (1998).
53. Newman, M. E. J. & Barkema, G. T. Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999).
54. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
55. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
56. Gidney, C., Bacon, D. & The Cirq Developers. quantumlib/Cirq: A python framework for creating, editing, and invoking Noisy Intermediate Scale Quantum (NISQ) circuits. https://github.com/quantumlib/Cirq (2018).
57. Torlai, G. & Fishman, M. PastaQ.jl: Package for Simulation, Tomography and Analysis of Quantum Computers. https://github.com/GTorlai/PastaQ.jl (2020).
58. Fishman, M., White, S. R. & Stoudenmire, E. M. The ITensor software library for tensor network calculations. Preprint at https://arxiv.org/abs/2007.14822 (2020).
59. Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 99–104 (2007).

Acknowledgements

We thank S. Bravyi for enlightening discussions and M. Fishman for insights into MPS simulations. Numerical simulations were performed using NumPy⁵⁴, SciPy⁵⁵, Google Cirq⁵⁶, and PastaQ^57,58 for MPS simulations. Random graph generation was done with NetworkX^40,42. Plots were generated using Matplotlib⁵⁹. M.M. acknowledges support from the CCQ graduate fellowship in computational quantum physics. The Flatiron Institute is a division of the Simons Foundation.

Author information

Authors and Affiliations

Center for Computational Quantum Physics, Flatiron Institute, New York, NY, USA
Matija Medvidović
Department of Physics, Columbia University, New York, NY, USA
Matija Medvidović
Institute of Physics, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Giuseppe Carleo

Contributions

G.C. conceived the main idea and co-wrote the manuscript. M.M. developed the idea further, wrote the computer code, executed the numerical simulations, and co-wrote the manuscript.

Corresponding author

Correspondence to Giuseppe Carleo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Medvidović, M., Carleo, G. Classical variational simulation of the Quantum Approximate Optimization Algorithm. npj Quantum Inf 7, 101 (2021). https://doi.org/10.1038/s41534-021-00440-z

Received: 27 November 2020
Accepted: 13 May 2021
Published: 18 June 2021
DOI: https://doi.org/10.1038/s41534-021-00440-z
