L-Sweeps: A scalable, parallel preconditioner for the high-frequency Helmholtz equation

https://doi.org/10.1016/j.jcp.2020.109706

Highlights

  • First fast parallel solver for the high-frequency Helmholtz equation.

  • Optimal O(N/p) parallel scaling, up to logarithmic factors.

  • The scaling holds in a distributed memory environment.

  • Sweeping preconditioners on a checkerboard domain decomposition.

  • Numerical examples in 2D and 3D (including practical benchmark wave speeds) are considered and corroborate all claims.

Abstract

We present the first fast solver for the high-frequency Helmholtz equation that scales optimally in parallel for a single right-hand side. The L-sweeps approach achieves this scalability by departing from the usual propagation pattern, in which information flows in a 180 degree cone from interfaces in a layered decomposition. Instead, with L-sweeps, information propagates in 90 degree cones induced by a Cartesian domain decomposition (CDD). We extend the notion of accurate transmission conditions to CDDs and introduce a new sweeping strategy to efficiently track the wave fronts as they propagate through the CDD. The new approach decouples the subdomains at each wave front, so that they can be processed in parallel, resulting in better parallel scalability than previously demonstrated in the literature. The method has an overall O((N/p) log ω) empirical run-time for N = n^d total degrees-of-freedom in a d-dimensional problem, frequency ω, and p = O(n) processors. We introduce the algorithm and provide a complexity analysis for our parallel implementation of the solver. We corroborate all claims in several two- and three-dimensional numerical examples involving constant, smooth, and discontinuous wave speeds.

Introduction

In this manuscript we consider the Helmholtz equation with variable wave speed and constant density on an open domain Ωbulk with absorbing boundary conditions,

−Δu − ω²mu = f  in Ωbulk,    + A.B.C. on ∂Ωbulk,    (1)

where, for x ∈ Ωbulk, m(x) = 1/c²(x) is the squared slowness for the p-wave speed c, u is the solution wavefield, ω is the characteristic frequency, and f is the source density. For simplicity, we consider Ωbulk to be a square or a cube. While we consider only problems of this form in the ensuing developments, the method we propose is a framework that can readily be applied to other time-harmonic formulations that model more complex physics.

The Helmholtz equation arises in modeling many physical phenomena, including electromagnetic and subsurface wave propagation, and it is the cornerstone of many modern algorithmic pipelines of inverse problems, especially in exploration geophysics [87], [68]. Within those pipelines, the finest recoverable detail is determined by the highest frequency for which (1) can be solved. Thus, the efficient solution of time-harmonic wave equations at high-frequency is extremely important in scientific and industrial applications. Consequently, we consider discretizations of (1) in the high-frequency regime, which means that the characteristic mesh-size of the spatial discretization has to scale as 1/ω.
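To make this scaling concrete, the following minimal sketch illustrates how the number of grid points per dimension grows linearly with ω when the mesh size scales as 1/ω. The choices of ten points per wavelength, unit minimum wave speed, and unit domain are illustrative assumptions, not values taken from this paper.

```python
# Illustrative sketch: with a fixed number of grid points per wavelength, the mesh
# size h scales like 1/omega, so n grows linearly with omega and N = n^d like omega^d.
import numpy as np

def points_per_dimension(omega, points_per_wavelength=10, c_min=1.0, length=1.0):
    lambda_min = 2 * np.pi * c_min / omega      # shortest wavelength in the domain
    h = lambda_min / points_per_wavelength      # characteristic mesh size, h ~ 1/omega
    return int(np.ceil(length / h))

for omega in [50, 100, 200, 400]:
    n = points_per_dimension(omega)
    print(f"omega = {omega:4d}:  n = {n:5d},  N = n^2 = {n**2:>10d}")
```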

Solving the high-frequency Helmholtz equation is notoriously difficult, mainly due to two issues:

  • the accuracy of the numerical solution deteriorates as the frequency increases. To address this issue, usually called pollution error, one needs either to over-sample the discretization or to use specialized discretization schemes [44], [45], [60];

  • the spectrum of the resulting matrix deteriorates as the frequency increases [61], [74], thus rendering the numerical solution of the linear system prohibitively expensive for large frequencies.

These issues are a primary driver of current research in this area [29], [58], [82]. In this manuscript we focus on the second issue and, for clarity and conciseness in the development, we present our new method using a simple low-order finite difference discretization. However, to address the first issue, our method can easily be extended to discretizations that mitigate pollution error [30], [82], [93], increasing the overall cost only by polylogarithmic factors.

A number of methods for solving the resulting linear systems are available, including direct methods (e.g., [88]), domain decomposition methods (e.g., [84]), and preconditioned iterative methods, including multigrid and shifted-Laplacian methods (e.g., [72], [39]). We consider a solver or a preconditioner to be sequentially scalable if, up to logarithmic factors, it can compute a solution in O(N) run-time in a sequential computational environment, where N is the total number of degrees-of-freedom in the discrete problem. We consider a solver or a preconditioner to be parallel scalable if, up to logarithmic factors, it can compute a solution in O(N/p) run-time in a parallel computational environment with p = O(n) processors for a single right-hand side. Here, n is the number of degrees-of-freedom in one spatial direction, i.e., n = O(N^{1/d}), where d is the problem dimension. In this paper, we present what we believe to be the first parallel scalable preconditioner for the high-frequency Helmholtz equation.

Currently, no scalable direct method, sequential or parallel, is available for the high-frequency regime. Standard domain decomposition methods (DDMs) localize the problem to subdomains and transfer information between subdomains. Domain decomposition methods can be applied as a direct solver (e.g., in the context of optimal Schwarz methods [35], [33]), but the resulting solver is not scalable. Domain decomposition methods can yield scalable preconditioners, both in parallel and sequentially. However, the resulting preconditioned iterative solver is not scalable because it requires O(ω) iterations. This highlights an important aspect of constructing solvers based on preconditioning: not only must the preconditioner be scalable, but the resulting iterative method must also converge with limited growth in the number of iterations as ω increases. In this regard, we call a preconditioner effective if the resulting preconditioned system can be solved in O(log ω) iterations. Classical DDMs exhibit sub-optimal behavior for two reasons. First, artificial and spurious reflections are induced by imprecise information transfer between subdomains. Second, long-range wave-material interactions are not tracked consistently.

Sweeping preconditioners were introduced [26], [86], [16], [91], [32] to alleviate these drawbacks, while preserving the advantages of classical DDMs. Sweeping preconditioners make use of layered domain decompositions, accurate transmission conditions, and a layer-by-layer sweeping strategy. In particular, the layered domain decomposition provides scalability by controlling growth in computational cost and memory footprint. The accurate transmission conditions allow information to flow between the subdomains without numerical artifacts, e.g., artificial reflections. Finally, the sweeping strategy consistently tracks and propagates long-range wave-material interactions. The resulting approach can be interpreted as an approximate block-LU factorization, where the blocks correspond to the local problem in each subdomain. In particular, using sparse direct solvers on blocks that arise from sufficiently thin layers yields a method with quasi-linear (i.e., linear with poly-logarithmic factors) sequential complexity for a single right-hand side [26], [25], [92], [55], [77], [16], [86], [59]. In the presence of many right-hand sides, the layered domain decomposition allows for optimal parallelization via pipelining [95], [86].

Efforts to improve the performance of sweeping preconditioners have focused on obtaining better and more accurate factorizations [86], reducing the overall cost by restricting the unknowns to the interfaces [91], [92], and accelerating the computation on local subproblems (blocks in the approximate factorization) with compression or parallelization [67]. Several approaches aim specifically to sparsify those blocks, thus decreasing the sequential costs [55], [92]. However, while leveraging parallelism to accelerate the solve for local blocks decreases the run-times, it does not result in a parallel scalable solver for a single right-hand side.

The main bottleneck of current sweeping preconditioners is a lack of parallel scalability for a single right-hand side. The difficulty arises due to the inherently sequential nature of the Helmholtz problem, which is independent of the domain decomposition strategy. This sequential nature is dictated by the hyperbolic behavior of wave equations and is manifested in the need to accurately resolve the long-range interactions. The key to accurately resolving these interactions is a consistent information transfer between subdomains, which has precluded more general domain decompositions (i.e., beyond layered subdomains) and consequently inhibited parallelization.

In this work, we address these issues by departing from standard layered domain decompositions and introducing a Cartesian domain decomposition (CDD) along with a new sweeping strategy that is only viable on such domain decompositions. Our novel sweeping preconditioner consistently and efficiently tracks the wavefield across subdomains. The preconditioner can be interpreted as an approximate LU factorization, where parallelism arises because the diagonal blocks themselves have a block-diagonal structure. For a single right-hand side, the resulting algorithm has an O(N/p) run-time, up to logarithmic factors, where, independent of geometric dimension, p = O(n) is the number of processors. The new algorithm therefore provides the first parallel scalable solver for the Helmholtz equation at high-frequency for a single right-hand side.

Our approach can be applied to two- and three-dimensional problems. In two dimensions, the algorithm exhibits good weak parallel scalability. That is, as the frequency increases, and consequently the problem is refined, we refine the CDD so that the local problems in each subdomain have constant size. In three dimensions, we apply the same strategy as for the two-dimensional case in two of the spatial dimensions and extend the subdomains along the third spatial dimension, resulting in beam-shaped quasi-one-dimensional local problems. In either case, we employ off-the-shelf direct solvers to solve the local problems and obtain parallel complexities of O(n log ω) and O(n² log ω) in two and three dimensions, respectively. The 3D complexity can be reduced to O(n log ω) by employing existing parallel direct solvers [69], [52] for the quasi-one-dimensional problems in each subdomain, but we do not exploit these tools here.

The new algorithm therefore results in an O(n log ω) complexity for a single right-hand side, regardless of the geometric dimension. This new algorithm requires sweeps across the CDD in both cardinal and diagonal directions, and we can exploit parallelism in all directions orthogonal to the current sweep direction. While this new strategy increases parallelization, the sweeps are still inherently sequential and cannot be parallelized. Thus, we do not anticipate further reduction in complexity for a single right-hand side due exclusively to modification of the domain decomposition. In the presence of O(n) right-hand sides, it has been shown, however, that the sequential nature of the sweeps can be mitigated by pipelining multiple right-hand sides [95]. Using pipelining, solutions for all O(n) right-hand sides can be computed with O(n log ω) parallel complexity, i.e., the average parallel complexity per right-hand side is O(log ω).

As is customary in the development of new preconditioners or solution strategies, we focus on the parallel scalability of the algorithm and we assume that the accuracy of the solution stems from the chosen discretization. However, we give further exposition on achieving a parallel scalable and accurate algorithm in Section 6.

Our method is inspired by the method of polarized traces, which is well-established in the literature and, in contrast to other proposed preconditioners, has been proven applicable for problems with high-order discretizations [93], [78] and highly heterogeneous (and even discontinuous) wave speed distributions [92].

Standard linear algebra techniques such as nested dissection [40] and multi-frontal solvers [24], [89], coupled with H-matrices [9], have been applied to the Helmholtz problem [42], [19], [88], [1]. While these methods take advantage of compressed linear algebra to gain more efficiency (e.g., [6]), in the high-frequency regime they still suffer from the same sub-optimal asymptotic complexity as standard multi-frontal methods (e.g., [22], [2], [18]).

Multigrid methods (e.g., [50], [13], [28], [72], [62], [3]), once thought to be inefficient for the Helmholtz problem, have been successfully applied [14], [47], [76]. Approaches stemming from the complex-shifted Laplacian [28] can be advantageous if properly tuned. However, in general, they either require an expensive solver for the shifted problem or require a large number of iterations to reach convergence, depending on the scaling between the complex shift and the frequency [39]. Although these algorithms do not result in a lower computational complexity, they are highly parallelizable, resulting in low run-times.

Within the geophysical community, the analytic incomplete LU (AILU) method was explored in [65], [64]. A variant of Kaczmarz preconditioners [43] has been studied and applied to time-harmonic wave equations by [53]. Another class of methods, called hybrid direct-iterative methods, has been explored by [73]. Although these solvers have, in general, relatively low memory consumption, they tend to require many iterations to converge, thus hindering practical run-times.

Domain decomposition methods for solving partial differential equations (PDEs) have a long history [71], [54]. The first application of domain decomposition to the Helmholtz problem was proposed by [23], which inspired the development of various domain decomposition algorithms now classified as Schwarz algorithms. However, the convergence rate of such algorithms is strongly dependent on the boundary conditions prescribed at the interfaces between subdomains [35]. The subsequent introduction of the optimized Schwarz framework in [33], which uses optimized boundary conditions to obtain good convergence, has inspired several competing approaches, including, but not limited to, [34], [11], [37], [38], [36].

Absorbing boundary conditions for domain decomposition schemes for elliptic problems are introduced by [27] and the first application of such techniques to the Helmholtz problem traces back to the AILU factorization [31]. The sweeping preconditioner, introduced in [25], [26], was the first algorithm to show that those ideas could yield algorithms with quasi-linear complexity, leading to several related algorithms with similar claims such as the source transfer preconditioner [16], the rapidly converging domain decomposition [75] and its extensions [77], the double sweep preconditioner [86], [63], and the method of polarized traces [91]. For an extensive review on sweeping-type methods we direct the reader to [32].

To our knowledge, all domain decomposition methods and sweeping preconditioners have been based on layered domain decompositions, with one exception. In [51], a q×q CDD is considered in the context of the source transfer method. Their proposed strategy requires p=q2 processors to obtain the sub-optimal parallel complexity of O((N/p)logN), in both 2D and 3D. The primary difference between this method and our method is the sweeping strategy. The strategy employed in [51] is inspired by classical domain decomposition methods, while our method is inspired by sweeping preconditioners. Consequently, our method inherits a superior sweeping strategy.

We consider problems formulated within the framework of (1), where we choose to model absorbing boundary conditions using perfectly matched layers (PMLs) [7], [48]. To preserve the solution in all of Ωbulk, using a PML requires the domain to be extended, which in turn implies that the solution, material property, and source spaces must also be extended. The extended domain, Ωextended, contains all of Ωbulk, as illustrated in Fig. 1a, and a new boundary value problem is formulated on Ωextended,

−div(Λ∇u) − ω²mu = f  in Ωextended,    u = 0  on ∂Ωextended,    (2)

where, for brevity, we re-use the symbols u, m, and f to represent the extended solution wavefield, slowness, and source distribution. As is customary for PMLs in domains with varying wave speed, we assume that the source density, f, is extended by zero into Ωextended and that the squared slowness, m, is extended into Ωextended in a constant fashion along the normal direction of ∂Ωbulk, as illustrated in Fig. 1b. The PML-extended Helmholtz equation (2) reduces to (1) in Ωbulk by imposing that Λ is the identity matrix in Ωbulk. In the PML region, Ωextended\Ωbulk, Λ is a complex-valued diagonal matrix which depends on the PML formulation, and m and f are complex-valued functions obtained from imposing the PML. Details of the precise formulation used in our developments are provided in Appendix A. Equation (2) is effectively a complex-valued boundary-value problem in Ωextended with homogeneous Dirichlet boundary conditions. In the remainder of the discussion, we consider (2) to be the canonical problem and therefore, for simplicity of notation, denote Ωextended as Ω.

When (2) is discretized, we obtain the linear algebraic system

Au = f,    (3)

where A is the model-dependent system matrix, u is the solution vector, and f is the vector of the source density f. In this paper we restrict our discussion of absorbing boundary conditions to PMLs and of discretizations to finite-difference methods. These restrictions are merely to simplify the exposition: other transparent boundary conditions, such as absorbing layers or sponge layers, and other discretizations, such as higher-order finite differences and those derived from finite element methods, may be used in this framework. For 2D problems we use standard 5-point finite difference stencils and in 3D, we use 9-point stencils. As a result, the discretization has N global degrees-of-freedom and n = N^{1/2} and n = N^{1/3} degrees-of-freedom in each spatial dimension for two- and three-dimensional problems, respectively.
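For illustration, the sketch below assembles such a system matrix for a 2D problem with the 5-point stencil and homogeneous Dirichlet conditions. It is a minimal stand-in, not the paper's implementation, and it omits the PML stretching Λ of (2) (see Appendix A for the actual formulation).

```python
# Minimal sketch (not the paper's code): assemble A for  -Delta u - omega^2 m u = f
# on an n x n interior grid with the standard 5-point stencil and homogeneous
# Dirichlet boundary conditions; the PML stretching Lambda is omitted for brevity.
import numpy as np
import scipy.sparse as sp

def helmholtz_2d_5pt(n, omega, m):
    """m is an (n, n) array of squared slowness sampled at the interior grid points."""
    h = 1.0 / (n + 1)
    I = sp.identity(n, format="csr")
    D = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr") / h**2
    laplacian = sp.kron(I, D) + sp.kron(D, I)   # discrete -Delta (5-point stencil)
    return (laplacian - omega**2 * sp.diags(m.ravel())).tocsr()

A = helmholtz_2d_5pt(n=50, omega=20.0, m=np.ones((50, 50)))
print(A.shape, A.nnz)
```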

For the PMLs we use a cubic PML profile function. As we increase the frequency of the problem, we do not increase the width of the PML. Instead, we choose the PML width so that the number of wavelengths is constant in the PML region, and increase the absorption constant logarithmically with the frequency. This is motivated by an analysis of the PML [12], where this choice is rigorously justified for a more complex PML-profile. In our work, we employ the same strategy for the cubic PML-profile and obtain satisfactory results. Details on the construction of the linear system and the PMLs are given in Appendix A.
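As a rough illustration of these choices, the sketch below implements a cubic absorption profile whose width holds a fixed number of wavelengths and whose strength grows like log ω. The functional form, the constants, and the simple coordinate-stretching factor are assumptions made for illustration only; the precise profile used in this work is given in Appendix A.

```python
# Sketch of a cubic PML absorption profile with the scalings described above
# (assumed functional form and constants; the actual profile is in Appendix A).
import numpy as np

def pml_sigma(x, omega, n_wavelengths=1.0, C=10.0, a=0.0, b=1.0):
    """Cubic profile sigma(x) on [a, b]; nonzero only inside the PML collar."""
    width = n_wavelengths * 2 * np.pi / omega      # width ~ fixed number of wavelengths
    sigma_max = C * np.log(omega)                  # absorption constant ~ log(omega)
    depth = np.maximum(np.maximum(a + width - x, x - (b - width)), 0.0)
    return sigma_max * (depth / width) ** 3

# One common convention (assumed here) builds the complex diagonal entries of Lambda
# from stretching factors s(x) = 1 / (1 + i sigma(x) / omega).
x = np.linspace(0.0, 1.0, 201)
s = 1.0 / (1.0 + 1j * pml_sigma(x, omega=100.0) / 100.0)
```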

The method of polarized traces was introduced as a solver [91] and then as a preconditioner [94] for the linear system (3). At its core, the method of polarized traces spatially subdivides the discrete degrees-of-freedom in Ω into layers and computes an approximate solution to the global wavefield by sweeping over the layers and solving a local discrete half-space problem in each layer. Following [91], the solutions of the half-space problems are called polarized wavefields. In this work, we make extensive use of this concept and therefore provide a brief review, in the continuous setting, in this section. In Section 1.4, we present a similar treatment in the discrete context.

Consider the boundary-value problem (2) with a source density function f supported only in Ωbulk. Let Γ be an interface dividing Ω into two regions, Ω1 and Ω2, such that the support of f lies entirely within Ω1. For example, Γ could be a straight line (Fig. 2a) or an L-shaped line (Fig. 2b).

The wavefield in Ω2 can be computed from the representation formula

u(x) = ∫_Γ μ(y) G(x,y) ds_y − ∫_Γ λ(y) [n(y) · (Λ(y) ∇_y G(x,y))] ds_y,    x ∈ Ω2,    (4)

where

λ = u|_Γ    and    μ = [n · (Λ∇u)]|_Γ

are the Dirichlet and Neumann traces on Γ, and G(x,y) is the Green's function corresponding to the problem (2), i.e., for x ∈ Ω,

−div_y(Λ(y) ∇_y G(y,x)) − ω²m(y) G(y,x) = δ(x − y)  for y ∈ Ω,    G(y,x) = 0  for y ∈ ∂Ω.

This formula follows directly from the divergence theorem and properties of the Green's function, assuming that Λ is a diagonal matrix.

Equation (4) requires knowledge of the Dirichlet and Neumann traces of the solution almost everywhere on Γ. Thus, corners in Γ are admissible, as in the quarter-space problem, even though the normal is not uniquely defined. Using (4), the solution u can then be computed on Ω2. Extending equation (4) to all of Ω,

U(x) = ∫_Γ μ(y) G(x,y) ds_y − ∫_Γ λ(y) [n(y) · (Λ(y) ∇_y G(x,y))] ds_y,    x ∈ Ω,

it can be shown that U vanishes on Ω1,

U(x) = 0    for x ∈ Ω1.    (6)

Following the terminology of [91], we call (6) the annihilation condition, Γ the polarization interface, and U a polarized wavefield. A proof of the annihilation condition (6) is provided in Appendix B.

A discrete counterpart of the polarized wavefield U can be derived by constructing a discrete solution that satisfies the discrete analogue to the annihilation condition in (6). Consider the discretization points corresponding to degrees-of-freedom in (3), as well as an interface Γ which does not intersect with any discretization point, as illustrated in Fig. 3 for both the half-space and quarter-space subdomains. Given a discretization-dependent distance δ (e.g., δ=1 for a classical 5-point finite difference stencil in 2D), such an interface divides the degrees-of-freedom into four sets, as labeled in Fig. 3:

  1. Γ1, the set of all degrees-of-freedom physically contained in Ω1 and δ-adjacent to Γ;

  2. Ω1, the set of all degrees-of-freedom physically contained in Ω1, excluding Γ1;

  3. Γ2, the set of all degrees-of-freedom physically contained in Ω2 and δ-adjacent to Γ;

  4. Ω2, the set of all degrees-of-freedom physically contained in Ω2, excluding Γ2.

Upon reordering the degrees-of-freedom with respect to these four sets, the discrete system (3) can be rewritten as

\begin{pmatrix}
A_{\Omega_1\Omega_1} & A_{\Omega_1\Gamma_1} & 0 & 0 \\
A_{\Gamma_1\Omega_1} & A_{\Gamma_1\Gamma_1} & A_{\Gamma_1\Gamma_2} & 0 \\
0 & A_{\Gamma_2\Gamma_1} & A_{\Gamma_2\Gamma_2} & A_{\Gamma_2\Omega_2} \\
0 & 0 & A_{\Omega_2\Gamma_2} & A_{\Omega_2\Omega_2}
\end{pmatrix}
\begin{pmatrix} u_{\Omega_1} \\ u_{\Gamma_1} \\ u_{\Gamma_2} \\ u_{\Omega_2} \end{pmatrix}
=
\begin{pmatrix} f_{\Omega_1} \\ f_{\Gamma_1} \\ f_{\Gamma_2} \\ f_{\Omega_2} \end{pmatrix},    (7)

where AΓ1Ω1 denotes the slice of the matrix A generated from the rows corresponding to Γ1 and the columns corresponding to Ω1. The remaining submatrices are similarly generated. In the same way, the vector uΩ2 (and similarly all other vectors) denotes the slice of the vector u with respect to Ω2. As in the continuous case, we consider the case without sources in Ω2, i.e., fΩ2 = 0 and fΓ2 = 0. Then, due to the invertibility of A, it is easy to see that the solution U of the linear system

\begin{pmatrix}
A_{\Omega_1\Omega_1} & A_{\Omega_1\Gamma_1} & 0 & 0 \\
A_{\Gamma_1\Omega_1} & A_{\Gamma_1\Gamma_1} & A_{\Gamma_1\Gamma_2} & 0 \\
0 & A_{\Gamma_2\Gamma_1} & A_{\Gamma_2\Gamma_2} & A_{\Gamma_2\Omega_2} \\
0 & 0 & A_{\Omega_2\Gamma_2} & A_{\Omega_2\Omega_2}
\end{pmatrix}
U
=
\begin{pmatrix} 0 \\ A_{\Gamma_1\Gamma_2} u_{\Gamma_2} \\ -A_{\Gamma_2\Gamma_1} u_{\Gamma_1} \\ 0 \end{pmatrix}    (8)

satisfies

U = \begin{pmatrix} 0 \\ 0 \\ u_{\Gamma_2} \\ u_{\Omega_2} \end{pmatrix}.

Consequently, the discrete counterpart of the polarized wavefield U is

U =
\begin{pmatrix}
A_{\Omega_1\Omega_1} & A_{\Omega_1\Gamma_1} & 0 & 0 \\
A_{\Gamma_1\Omega_1} & A_{\Gamma_1\Gamma_1} & A_{\Gamma_1\Gamma_2} & 0 \\
0 & A_{\Gamma_2\Gamma_1} & A_{\Gamma_2\Gamma_2} & A_{\Gamma_2\Omega_2} \\
0 & 0 & A_{\Omega_2\Gamma_2} & A_{\Omega_2\Omega_2}
\end{pmatrix}^{-1}
\begin{pmatrix} 0 \\ A_{\Gamma_1\Gamma_2} u_{\Gamma_2} \\ -A_{\Gamma_2\Gamma_1} u_{\Gamma_1} \\ 0 \end{pmatrix}.

From the right-hand side of (8), one can easily see that knowledge of both uΓ1 and uΓ2 is required in order to compute U. By construction, these sets contain information about the discrete wavefield and its normal derivative in the vicinity of Γ. Thus, as with the continuous case, the discrete case also requires information about the Dirichlet and Neumann traces to compute the polarized wavefield U in Ω2. In fact, [91] demonstrates, using similar techniques, that a discrete counterpart of the representation formula (4) can be derived.
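The construction above can be checked numerically. The sketch below, a toy 1D analogue and not the paper's code, builds a small complex-shifted Helmholtz-type system, splits its unknowns into Ω1, Γ1, Γ2, and Ω2 around an interface with δ = 1, forms the right-hand side of (8) from the traces uΓ1 and uΓ2, and verifies that the resulting polarized wavefield U vanishes on Ω1 and Γ1 while reproducing u on Γ2 and Ω2.

```python
# Toy 1D check of the discrete annihilation property (not the paper's code).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, omega = 200, 40.0
h = 1.0 / (n + 1)
# complex-shifted 1D Helmholtz-type operator so that the system is invertible
A = (sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
     - (omega**2 + 5j * omega) * sp.identity(n)).tocsc()

f = np.zeros(n, dtype=complex)
f[20] = 1.0 / h                       # source supported well inside Omega_1
u = spla.spsolve(A, f)

k = 99                                # interface: Gamma_1 = {k}, Gamma_2 = {k+1}, delta = 1
b = np.zeros(n, dtype=complex)
b[k] = A[k, k + 1] * u[k + 1]         # +A_{Gamma_1 Gamma_2} u_{Gamma_2}
b[k + 1] = -A[k + 1, k] * u[k]        # -A_{Gamma_2 Gamma_1} u_{Gamma_1}
U = spla.spsolve(A, b)

print(np.abs(U[:k + 1]).max())               # ~ 0: annihilation on Omega_1 and Gamma_1
print(np.abs(U[k + 1:] - u[k + 1:]).max())   # ~ 0: U matches u on Gamma_2 and Omega_2
```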

Finally, our technique is easily extended to more general discretization techniques, as long as they allow for a reordering of the degrees-of-freedom into the block-tridiagonal system (7). Depending on the discretization, the selection of the sets Γ1 and Γ2 may need to change. Similar schemes have been applied to many different discretizations, such as high-order finite difference methods [95], finite element methods [90], enriched finite element methods [30], discontinuous Galerkin methods [79], and integral representations [93]. For example, for higher-order finite difference methods the stencils centered at the points in Ω2 (respectively Γ2, Γ1, or Ω1) cannot involve discretization points in Γ1 (respectively Ω1, Ω2, or Γ2). This is easily enforced by defining an appropriate δ, e.g., δ = 2 for a 9-point stencil with a 5×5 footprint in 2D. Similarly, in finite element or discontinuous Galerkin methods, the sparsity of the system matrices can be exploited in order to obtain suitable sets of degrees-of-freedom.

In Section 2, we introduce the algorithm to compute an approximate global solution u of the boundary value problem (2) with constant squared slowness m. We introduce the algorithm first on the continuous level in Section 2.1 and then extend it to the discrete level in Section 2.2. In Section 3, we show how the algorithm can be used to precondition the linear system (3). This opens the possibility of using the algorithm as part of a solver with optimal parallel scaling based on a preconditioned GMRES method. We conclude Section 3 with a complexity analysis of this solver with regard to computational and communication effort. In Section 4, we analyze the effects of heterogeneous wave speeds on the effectiveness of the preconditioner. In Section 5, we provide several numerical examples in two and three dimensions for constant and non-constant wave speeds to corroborate all claims. The paper concludes with a discussion of possible extensions of the method and a summary of the results. Additional proofs and implementation details are provided in the Appendices. Pseudo-code for the proposed algorithms is included as supplementary material, a C++ implementation of the 2D algorithm is available at [80], and experimental setups for the 2D examples are available at [81].

Section snippets

L-sweeps: reconstruction of wavefields

In this section, we introduce the algorithm to compute the global solution u of the boundary value problem (2). This solution is obtained from local solutions of local problems defined over a CDD. The concept of polarization introduced in Section 1 plays a crucial role in this procedure. We introduce the algorithm for constant wave speeds at the continuous and the discrete level in Sections 2.1 and 2.2. The procedure can be applied to problems with non-constant wave speeds in an analogous

Implementation and complexity

The procedure introduced in Section 2 produces an approximation of a global solution to (3). This approach therefore defines the approximate solution operator A˜⁻¹, such that A˜⁻¹(f) ≈ A⁻¹f. Thus, we use A˜⁻¹ to precondition (3),

A˜⁻¹(Au) = A˜⁻¹(f),    (15)

and use a Krylov subspace method, such as GMRES [70] or BiCG-stab [85], to solve (15).
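The sketch below shows how such an approximate inverse is typically wired into GMRES as a preconditioner callback. Here apply_approximate_inverse is a hypothetical placeholder, a crude ILU factorization of a 1D toy problem; it stands in for the action of A˜⁻¹ and does not implement the L-sweeps preconditioner proposed in this paper.

```python
# Sketch of wiring an approximate inverse into GMRES as a preconditioner.
# `apply_approximate_inverse` is a placeholder (ILU of a toy 1D problem), not L-sweeps.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, omega = 400, 30.0
h = 1.0 / (n + 1)
A = (sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
     - (omega**2 + 1j * omega) * sp.identity(n)).tocsc()
f = np.random.default_rng(0).standard_normal(n).astype(complex)

apply_approximate_inverse = spla.spilu(A).solve          # placeholder for  A~^{-1}
M = spla.LinearOperator(A.shape, matvec=apply_approximate_inverse, dtype=complex)

u, info = spla.gmres(A, f, M=M, restart=50, maxiter=200)
print(info, np.linalg.norm(A @ u - f) / np.linalg.norm(f))
```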

The resulting preconditioned iterative solver has the following properties:

  • the preconditioner A˜⁻¹ can be applied with optimal parallel complexity, O(N/p), and

  • Krylov

Heterogeneities

As developed in Section 2, the wavefield computed by the proposed algorithm can be interpreted as a sum of global solutions, each of which is induced by a global source density supported in a single subdomain only. We denote a source density supported in the subdomain Ωij by fij and the global wavefield induced by this source density as ufij. Accordingly, the global wavefield u can be written as

u = Σ_{i=1}^{q} Σ_{j=1}^{r} ufij.

To understand the effect of heterogeneities on the effectiveness of the

Numerical examples

In this section, we consider several numerical examples to corroborate the claims of this paper. All numerical examples are constructed using variations of a standard setup. The problems are posed on the unit square (d = 2) or the unit cube (d = 3) and the wave speed is scaled such that the squared slowness satisfies m(x) ∈ [m0, 1] for all x ∈ Ω, for some m0 > 0. For the characteristic frequency ω, the minimum wavelength is 2π/ω and the maximum wavelength is 2π/(√m0 ω). All problems are discretized using a uniform

Discussion

We have introduced a parallel scalable algorithm to solve the high-frequency Helmholtz equation. To keep the length of the paper reasonable, we have omitted some details, including the analysis of the continuum version of the algorithm in constant media, where the method essentially reduces to a direct solver, and the effect of pollution errors and the associated refined discretizations on the proposed algorithm. In this section we comment on both cases.

Summary

We have introduced the first solver for the high-frequency Helmholtz equation that scales as O((N/p)logω) in a distributed-memory parallel computational environment, for a single right-hand side. The new solver constructs a preconditioner based on a CDD. This approach reveals parallelism which is exploited to obtain the optimal parallel complexity. The performance of the preconditioner is similar to the well-established method of polarized traces, i.e., the number of iterations grows as O(logω

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors thank Total SA for support and for permission to release the example code. LD is also supported by AFOSR grant FA9550-17-1-0316. The BP model is provided courtesy of BP and Fréderic Billette. We thank NERSC for computation resources. MT thanks the Institute of Applied Mathematics at Graz University of Technology for hosting him for part of this research.

References (95)

  • L. Zepeda-Núñez et al., The method of polarized traces for the 2D Helmholtz equation, J. Comput. Phys. (2016)
  • P. Amestoy et al., Fast 3D frequency-domain full-waveform inversion with a parallel block low-rank multifrontal direct solver: application to OBC data from the North Sea, Geophysics (2016)
  • P.R. Amestoy et al., A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM J. Matrix Anal. Appl. (2001)
  • D. Aruliah et al., Multigrid preconditioning for Krylov methods for time-harmonic Maxwell's equations in three dimensions, SIAM J. Sci. Comput. (2002)
  • A.V. Astaneh et al., A two-level domain decomposition method with accurate interface conditions for the Helmholtz problem, Int. J. Numer. Methods Eng. (2016)
  • G. Ballard et al., Minimizing communication in numerical linear algebra, SIAM J. Matrix Anal. Appl. (2011)
  • M. Bebendorf, Hierarchical Matrices: A Means to Efficiently Solve Elliptic Boundary Value Problems (2008)
  • F. Billette et al., The 2004 BP Velocity Benchmark (2005)
  • S. Boerm et al., Hierarchical Matrices (2006)
  • J. Bramble et al., Analysis of a finite PML approximation for the three dimensional time-harmonic Maxwell and acoustic scattering problems, Math. Comput. (2007)
  • A. Brandt et al., Wave-ray multigrid method for standing wave equations, Electron. Trans. Numer. Anal. (1997)
  • H. Calandra et al., An improved two-grid preconditioner for the solution of three-dimensional Helmholtz problems in heterogeneous media, Numer. Linear Algebra Appl. (2013)
  • T.F. Chan et al., Domain decomposition algorithms, Acta Numer. (1994)
  • Z. Chen et al., A source transfer domain decomposition method for Helmholtz equations in unbounded domain, SIAM J. Numer. Anal. (2013)
  • T.A. Davis, Algorithm 832: UMFPACK v4.3—an unsymmetric-pattern multifrontal method, ACM Trans. Math. Softw. (2004)
  • M.V. de Hoop et al., On 3D modeling of seismic wave propagation via a structured parallel multifrontal direct Helmholtz solver, Geophys. Prospect. (2011)
  • A. de La Bourdonnaye et al., A non-overlapping domain decomposition method for the exterior Helmholtz problem, Contemp. Math. (1998)
  • J. Demmel et al., Communication avoiding rank revealing QR factorization with column pivoting (2013)
  • J.W. Demmel et al., A supernodal approach to sparse partial pivoting, SIAM J. Matrix Anal. Appl. (1999)
  • B. Després, Décomposition de domaine et problème de Helmholtz, C. R. Séances Acad. Sci., Sér. 1 Math. (1990)
  • I.S. Duff et al., The multifrontal solution of indefinite sparse symmetric linear equations, ACM Trans. Math. Softw. (1983)
  • B. Engquist et al., Sweeping preconditioner for the Helmholtz equation: hierarchical matrix representation, Commun. Pure Appl. Math. (2011)
  • B. Engquist et al., Sweeping preconditioner for the Helmholtz equation: moving perfectly matched layers, Multiscale Model. Simul. (2011)
  • Y.A. Erlangga et al., A novel multigrid based preconditioner for heterogeneous Helmholtz problems, SIAM J. Sci. Comput. (2006)
  • O.G. Ernst et al., Why it is difficult to solve Helmholtz problems with classical iterative methods
  • J. Fang et al., Learning dominant wave directions for plane wave methods for high-frequency Helmholtz equations, Res. Math. Sci. (2017)
  • M. Gander et al., A class of iterative solvers for the Helmholtz equation: factorizations, sweeping preconditioners, source transfer, single layer potentials, polarized traces, and optimized Schwarz methods, SIAM Rev. (2019)
  • M.J. Gander, Optimized Schwarz methods, SIAM J. Numer. Anal. (2006)
  • M.J. Gander et al., Optimal interface conditions for an arbitrary decomposition into subdomains
  • M.J. Gander et al., Optimized Schwarz methods without overlap for the Helmholtz equation, SIAM J. Sci. Comput. (2002)
  • M.J. Gander et al., Optimized Schwarz method with two-sided transmission conditions in an unsymmetric domain decomposition
  • M.J. Gander et al., Domain decomposition methods for the Helmholtz equation: a numerical investigation
  • M.J. Gander et al., Optimized Schwarz methods with overlap for the Helmholtz equation
  • M.J. Gander et al., Applying GMRES to the Helmholtz equation with shifted Laplacian preconditioning: what is the largest shift for which wavenumber-independent convergence is guaranteed?, Numer. Math. (2015)
  • A. George, Nested dissection of a regular finite element mesh, SIAM J. Numer. Anal. (1973)
  • S. Ghanemi, A domain decomposition method for Helmholtz scattering problems
  • A. Gillman et al., A spectrally accurate direct solution technique for frequency-domain scattering problems with variable media, BIT Numer. Math. (2015)