Introduction

For decades, density functional theory (DFT) has been the workhorse of first-principles materials science. Immense efforts have gone into the development of improved exchange–correlation (xc)-functionals and today hundreds of different types exist, including the generalized gradient approximations (GGA), meta GGAs, (screened) hybrid functionals, Hubbard corrected functionals (local density approximation (LDA)/GGA+U), and the non-local van der Waals (vdW) density functionals. Typically, these contain several parameters that have been optimized for a particular type of problem or class of materials. Moreover, they rely on error cancellation between the approximations used for exchange and correlation, which is highly unsatisfactory from a fundamental point of view, since the exact expression for the exchange part is well known. This limits the universality and predictive power of commonly applied xc-functionals and the accuracy is often highly system dependent.

At the highest rung of the current hierarchy of xc-functionals lie those based directly on the adiabatic-connection fluctuation-dissipation theorem (ACFDT). The ACFDT provides an exact expression for the electronic correlation energy in terms of the interacting density response function.1,2 An attractive feature of the ACFDT is that it provides the pure correlation energy, which should then be combined with the exact exchange energy. This clear division removes the reliance on error cancellation between the exchange and correlation terms, which is significant (and uncontrolled) in the lower rung xc-functionals. A further advantage of the ACFDT is that, even in its simplest form, it captures dispersive interactions very accurately through the non-locality of the response function.

The simplest approximation to the response function beyond the non-interacting one is the random phase approximation (RPA). The RPA generally provides an excellent account of long-range screening and it cures the pathological divergence of second-order perturbation theory for the homogeneous electron gas (HEG). However, an important shortcoming of the RPA response function is that the local (\(r\) close to \(r^{\prime}\)) correlation hole derived from it, via the ACFDT, is much too deep leading to a drastic overestimation of the absolute correlation energy by several tenths of an eV per electron. This key observation is responsible for most of the failures of the RPA and GW schemes to be discussed in this review. It occurs because the RPA response function only accounts for the Hartree component of the induced potentials. The neglected xc-component of the induced potential is short ranged in nature and therefore mainly influences the local shape of the correlation hole.

Early work3,4,5 applied the ACFDT-RPA to compute the dissociation energies of small molecules, finding a systematic tendency of the RPA to underbind and generally have a lower accuracy than the GGA. It was also demonstrated that RPA accounts well for strong static correlation and correctly describes the dissociation curve of the N\({}_{2}\) molecule. Around 10 years later, the RPA was applied to calculate cohesive energies of solids,6 again finding that RPA performs significantly worse than GGA with a systematic tendency to underbind. In contrast, RPA was found to produce excellent results for structural parameters of solids7,8 as well as bond energies in vdW systems like graphite9 and noble gas solids,10 which are poorly described by semilocal approximations. In addition, for the case of graphene adsorbed on metal surfaces, where dispersive and covalent interactions are equally important, the RPA seems to be the only non-empirical method capable of providing correct potential energy curves.11,12 While the RPA method has many attractive features, it is clear that its poor description of short-range correlations, which results in overestimation of absolute correlation energies and underestimation of covalent bond strengths, disqualifies it as the highly desired fully ab initio and universally accurate total energy method.

One approach to improving the RPA total energy is based on the idea of correcting the RPA self-correlation energy by including higher-order exchange terms in many-body perturbation theory (MBPT). Second-order exchange is the leading (of order \({e}^{4}\)) exchange correction to the RPA correlation energy, and within the coupled-cluster doubles formalism, Grüneis et al.13 constructed second-order screened exchange (SOSEX). SOSEX can be interpreted within the renormalized RPA, as an expansion of the correlation energy in powers of the xc-kernel. It can also be expressed in terms of different approximations to the exchange kernel, such as the AXK kernel of Bates and Furche,14 the NEO kernel of Bates et al.,15 or the version of Hellgren et al.16 In addition, SOSEX can be treated within the range-separation scheme using a fixed admixture of local correlation.17 Ángyán et al.18 constructed adiabatic-connection SOSEX or ACSOSEX, which brings in higher-order exchange terms and Hümmel et al. have proposed an adjacent pairs exchange correction, which goes beyond ACSOSEX19 by introducing more than one antisymmetrized line in terms of Goldstone diagrams. In general, the SOSEX correlation energy vanishes for one-electron systems and improves the accuracy of covalent bonds slightly compared to RPA. However, it causes a deterioration in the good description of static correlation and barrier heights in the RPA.13,20 In addition, SOSEX scales as \({N}^{5}\) with system size and therefore comprises a significant computational challenge compared to RPA, which scales as \({N}^{4}\).

A different strategy is based on the use of time-dependent density functional theory (TDDFT) to construct better response functions. The crucial ingredient in this theory is the xc-kernel, \({f}_{{\mathrm{xc}}}({\bf{r}},{\bf{r}}^{\prime} ,\omega )\), which is the functional derivative of the time-dependent xc-potential with respect to the density. In particular, the inclusion of a time-dependent exact exchange (EXX) kernel eliminates the self-correlation energy in RPA and is exact for any one-electron problem. In ref., 21 it was shown that this approach significantly improves the RPA for small molecules. Moreover, the RPA with EXX is so far the only single-reference method that can correctly describe the dissociation limit of both H\({}_{2}\) and H\({}_{2}^{+}\)22. More recently, perturbation theory along the adiabatic connection was shown to comprise a systematic approach to improving the accuracy of correlation energies in the context of the ACFDT.23,24 However, the kernels derived in this framework are inherently frequency dependent, they are not given by a closed expression involving orbitals, and the method is rather computationally demanding.

In ref. 25 it was argued, based on ACFDT calculations for the HEG, that the frequency dependence of the xc-kernel is of minor importance for the correlation energy, while the spatial non-locality is crucial. Moreover, it has been shown that any local approximation to the xc-kernel produces a correlation hole, which diverges at the origin.26 As a consequence, the use of local xc-kernels typically results in correlation energies that are worse than those obtained with the RPA. Several non-local approximations to the xc-kernel of the HEG have been proposed. On a scale given by the error of RPA, they seem to perform very similarly for the ground state correlation energy, and therefore the present review will focus on one specific form, namely, the renormalized adiabatic kernels of refs., 27,28,29 rAX, where X refers to a ground state xc-functional.

In the present review, we show that the use of non-local (but frequency-independent) kernels largely fixes the erroneous RPA correlation hole and provides a much better description of short-range correlations—at least for weakly correlated materials. This implies that absolute correlation energies are much better and thus the reliance on error cancellation when forming energy differences is lifted. Specifically, covalent bond energies are greatly improved while the good performance of RPA for dispersive interactions is preserved. The performance of the rAX kernels is further discussed for structural parameters, atomization energies of molecules, cohesive energies of solids, formation energies of metal oxides, surface and adsorption energies, molecular dissociation curves, static correlation, and structural phase transitions.

In addition to total energy calculations, the renormalized kernels have also been used to incorporate vertex corrections into self-energy-based quasiparticle (QP) band structure calculations.30 The formal basis of such calculations is constituted by Hedin’s equations, a coupled set of equations for the key quantities of a perturbative treatment of the single-particle Green’s function, \(G\), in terms of the screened Coulomb interaction, \(W\). Within the widely used GW approximation, the vertex corrections are completely ignored. Despite this omission, the GW approximation typically yields good results for the QP band gap.31,32,33,34,35,36 Vertex corrections evaluated from the SOSEX diagram have been found to yield some improvement of band gaps and, in particular, ionization potentials of solids.37 This is clearly very satisfactory from a theoretical point of view. The drawback of this approach, however, is the high complexity of the formalism, the concomitant loss of physical transparency, and the significant computational overhead as compared to the conventional GW method. Just like the two-point xc-kernels from TDDFT provide a computationally tractable strategy to improve total energies, they can also be used to approximate vertex corrections in the electron self-energy.38 As for the ground state energy, the local xc-kernels perform rather badly.30 Instead, the renormalized kernels yield a major improvement over the GW method when it comes to ionization potentials and electron affinities, i.e., absolute band energies, as a result of the superior description of short-range correlations. This can be understood as a direct consequence of the systematic underestimation of the absolute correlation energy by the RPA. In contrast, the QP gap is only slightly affected by the vertex because it is mainly governed by long-range correlations. This in fact explains the success of the GW approximation in describing QP band gaps.

An important common feature of the ACFDT and Hedin’s equations is that they are typically implemented non-self-consistently starting from some mean field Hamiltonian. This means that the results of such calculations acquire a starting point dependence. While the LDA and GGAs are the most widely used starting points, other xc-functionals, such as exact exchange hybrids39,40 and GGA+U,41,42,43 have also been employed. In general, it has been found, for both GW and RPA, that the results are quite insensitive to the starting point; in particular, they are much less sensitive to the initial mean field than the mean field itself, e.g., the band gap obtained with GW@GGA+U and the lattice constants, and energy differences determined from RPA@GGA+U vary much less with U as compared to the GGA+U result itself.41,42,44 This is clearly a desirable effect but does not remove the fundamentally disturbing starting point dependence. The problem arises because the natural starting point for GW and RPA would be the Hartree mean field solution, which is notoriously bad. The situation is somewhat improved by the rAX kernels because their consistent starting point would resemble a DFT Hamiltonian with the X-functional (or more precisely a weighted density approximation to the X-functional, see supplementary of ref. 29). We note in passing that ideally the calculations should be performed self-consistently. However, this is rarely done in practice and involves other problems such as overestimated band gaps and smeared out spectral features in GW and technical difficulties associated with the self-consistent determination of the RPA optimized effective potential.

Here we focus on the theory, implementation, and implications of physics beyond the RPA and GW methods as described by static non-local xc-kernels from TDDFT. Consequently, we will not dwell on the RPA and GW methods themselves but refer the interested reader to one of the existing reviews on these topics.20,45,46,47,48 The paper is organized as follows. In section “Theory,” we present the basic theory of ground state and QP energy calculations based on the adiabatic connection fluctuation dissipation theorem and Hedin’s equations, respectively. We introduce several non-local xc-kernels for the HEG and describe a renormalization procedure for constructing non-local xc-kernels from (semi)local xc-functionals. In section “Implementation,” we describe the numerical implementation of non-local xc-kernels including different strategies for generalizing HEG kernels to inhomogeneous densities and some aspects of \(k\)-point and basis set convergence. In section “Results,” we present a series of results serving to illustrate the effect and importance of the xc-kernels for both total energies and QP band structures. Specifically, we assess the performance of the renormalized adiabatic local density approximation (rALDA) and renormalized adiabatic Perdew–Burke–Ernzerhof (rAPBE) xc-kernels for structural parameters of solids, atomization energies of covalently bonded solids and molecules, oxide formation energies, vdW bonding, dissociation of statically correlated atomic dimers, surface and chemisorption energies, structural phase transitions, and QP energies of bulk and two-dimensional semiconductors. Finally, our conclusions and outlook are provided in the last section.

Theory

The adiabatic connection fluctuation dissipation theorem

In the Kohn–Sham (KS) scheme, a non-interacting Hamiltonian is constructed such that it has a ground state Slater determinant \(\left|{\varphi }_{0}\right\rangle\), which yields the same ground state density as the true ground state wavefunction \(\left|{\psi }_{0}\right\rangle\). The adiabatic connection denotes a generalization of this scheme where the Coulomb interaction \({v}_{{\mathrm{c}}}\) is rescaled by \(\lambda\), such that the ground state wavefunction \(\left|{\psi }_{0}^{\lambda }\right\rangle\) reproduces the electronic density of \(\left|{\psi }_{0}\right\rangle\). The procedure can be accomplished by modifying the external potential \({v}_{{\mathrm{KS}}}^{\lambda }({\bf{r}})\) and we have \(\left|{\psi }_{0}^{\lambda =1}\right\rangle =\left|{\psi }_{0}\right\rangle\) and \(\left|{\psi }_{0}^{\lambda =0}\right\rangle =\left|{\varphi }_{0}\right\rangle\).

The adiabatic connection allows one to obtain a highly useful expression for the correlation energy. To begin with, the Hartree xc (Hxc) energy can be written as49

$$\begin{array}{rcl}{E}_{{\mathrm{Hxc}}}={E}_{{\mathrm{tot}}}-{T}_{{\mathrm{KS}}}-{v}_{{\mathrm{ext}}}\\=\langle {\psi }_{0}| \hat{T}+{\hat{v}}_{{\mathrm{c}}}| {\psi }_{0}\rangle -\langle {\varphi }_{0}| \hat{T}| {\varphi }_{0}\rangle \\=\langle {\psi }_{0}^{\lambda }| \hat{T}+\lambda {\hat{v}}_{{\mathrm{c}}}| {\psi }_{0}^{\lambda }\rangle {\left.\right|}_{0}^{1}\\={\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda \frac{{\mathrm{d}}}{{\mathrm{d}}\lambda }\langle {\psi }_{0}^{\lambda }| \hat{T}+\lambda {\hat{v}}_{{\mathrm{c}}}| {\psi }_{0}^{\lambda }\rangle \\={\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda \langle {\psi }_{0}^{\lambda }| {\hat{v}}_{{\mathrm{c}}}| {\psi }_{0}^{\lambda }\rangle ,\end{array}$$
(1)

where \({E}_{{\mathrm{tot}}}\) is the total electronic ground state energy, \({T}_{{\mathrm{KS}}}\) is KS kinetic energy, and \({v}_{{\mathrm{ext}}}\) is the expectation value of the external potential. In the last quality, we used the Hellmann–Feynman theorem and the fact that \({\psi }_{0}^{\lambda }\) is defined as the state that minimizes the expectation value of \(T+\lambda {v}_{{\mathrm{c}}}\). Inserting the second quantized form of the Coulomb interaction, the expression becomes

$$\begin{array}{rcl}{E}_{{\mathrm{Hxc}}}&=&\frac{1}{2}{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda \int \frac{{\mathrm{d}}{\bf{r}}{\mathrm{d}}{\bf{r}}^{\prime} }{| {\bf{r}}-{\bf{r}}^{\prime} | }{\langle {\hat{\Psi }}^{\dagger }({\bf{r}}){\hat{\Psi }}^{\dagger }({\bf{r}}^{\prime} )\hat{\Psi }({\bf{r}}^{\prime} )\hat{\Psi }({\bf{r}})\rangle }_{\lambda }\\ &=&\frac{1}{2}{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda \int \frac{{\mathrm{d}}{\bf{r}}{\mathrm{d}}{\bf{r}}^{\prime} }{| {\bf{r}}-{\bf{r}}^{\prime} | }\left[{\langle {\hat{\Psi }}^{\dagger }({\bf{r}})\hat{\Psi }({\bf{r}}){\hat{\Psi }}^{\dagger }({\bf{r}}^{\prime} )\hat{\Psi }({\bf{r}}^{\prime} )\rangle }_{\lambda }-\delta ({\bf{r}}-{\bf{r}}^{\prime} ){\langle \hat{n}({\bf{r}})\rangle }_{\lambda }\right].\end{array}$$
(2)

where \(\hat{n}({\bf{r}})={\Psi }^{\dagger }({\bf{r}})\Psi ({\bf{r}})\) and we have introduced the notation \({\langle \ldots \rangle }_{\lambda }=\langle {\psi }_{0}^{\lambda }| \ldots | {\psi }_{0}^{\lambda }\rangle\). Since the last term is independent of \(\lambda\), we can get rid of it by subtracting the Hartree exchange energy \({E}_{{\mathrm{Hx}}}\), which is given by a similar expression with \(\left|{\psi }_{0}^{\lambda }\right\rangle\) replaced by \(\left|{\varphi }_{0}\right\rangle\). We then have

$${E}_{{\mathrm{c}}}=\frac{1}{2}{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda \int \frac{{\mathrm{d}}{\bf{r}}{\mathrm{d}}{\bf{r}}^{\prime} }{| {\bf{r}}-{\bf{r}}^{\prime} | }\left[{\langle \hat{n}({\bf{r}})\hat{n}({\bf{r}}^{\prime} )\rangle }_{\lambda }-{\langle \hat{n}({\bf{r}})\hat{n}({\bf{r}}^{\prime} )\rangle }_{0}\right]$$
(3)

The density–density correlation function is closely related to the density–density response function. The retarded response at vanishing temperature is defined by

$${\chi }_{\lambda }({\bf{r}},{\bf{r}}^{\prime} ;t,t^{\prime} )=-i\theta (t-t^{\prime} ){\langle [\hat{n}({\bf{r}},t),\hat{n}({\bf{r}}^{\prime} ,t^{\prime} )]\rangle }_{\lambda },$$
(4)

where the expectation value is with respect to the ground state. In the frequency domain, it becomes

$${\chi }_{\lambda }({\bf{r}},{\bf{r}}^{\prime} ;\omega )=\sum _{m\ne 0}\left[\frac{{n}_{0m}^{\lambda }({\bf{r}}){n}_{m0}^{\lambda }({\bf{r}}^{\prime} )}{\omega -{E}_{m0}+i\eta }-\frac{{n}_{0m}^{\lambda }({\bf{r}}^{\prime} ){n}_{m0}^{\lambda }({\bf{r}})}{\omega +{E}_{m0}+i\eta }\right],$$
(5)

where \({n}_{0m}({\bf{r}})=\langle {\psi }_{0}^{\lambda }| \hat{n}({\bf{r}})| {\psi }_{m}^{\lambda }\rangle\), \({E}_{m0}^{\lambda }={E}_{m}^{\lambda }-{E}_{0}^{\lambda }\) are the eigenvalue differences, and \(\eta\) is a positive infinitesimal. It is then clear that

$$\begin{array}{rcl}\frac{-1}{\pi }{\int\nolimits_{0}^{\infty}}{\mathrm{d}}\omega {\rm{Im}}{\chi }_{\lambda }({\bf{r}},{\bf{r}}^{\prime} ;\omega )&=&{\sum \limits_{m\ne 0}}{n}_{0m}^{\lambda }({\bf{r}}){n}_{m0}^{\lambda }({\bf{r}}^{\prime} )\\ &=&{\langle \hat{n}({\bf{r}})\hat{n}({\bf{r}}^{\prime} )\rangle }_{\lambda }-n({\bf{r}})n({\bf{r}}^{\prime} )\\ &=&{\langle \delta \hat{n}({\bf{r}})\delta \hat{n}({\bf{r}}^{\prime} )\rangle }_{\lambda },\end{array}$$
(6)

with \(\delta \hat{n}({\bf{r}})\equiv \hat{n}({\bf{r}})-n({\bf{r}})\). The equality is an example of a fluctuation dissipation theorem, since it relates the imaginary (dissipative) part of the density response to the correlation between density fluctuations.

The retarded response function only has poles in the negative imaginary half-plane and its frequency integral on a closed loop in the upper right quarter of the complex plane vanishes since \(\chi \sim 1/| \omega {| }^{2}\) for \(| \omega | \to \infty\). We can thus switch the integration path to the positive imaginary axis where the frequency dependence is smooth. Noting that \({\chi }^{* }({\bf{r}},{\bf{r}}^{\prime} ;i\omega )=\chi ({\bf{r}}^{\prime} ,{\bf{r}};i\omega )\), we obtain

$${E}_{{\mathrm{c}}}=-{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda {\int\nolimits_{0}^{\infty} }\frac{{\mathrm{d}}\omega }{2\pi }\langle \langle {v}_{c}{\chi }_{\lambda }(i\omega )-{v}_{c}{\chi }_{0}(i\omega )\rangle \rangle .$$
(7)

where \(\langle \langle \ldots \rangle \rangle\) indicates the trace of the two-point functions involved in the adiabatic-connection integrand.

The problem of calculating the correlation energy has thus been rephrased into finding a good approximation for the density–density response function. The simplest non-trivial approximation is the RPA, which can be obtained from MBPT by assuming a non-interacting irreducible response function. Alternatively, the full (reducible) response function can be obtained from TDDFT, where it can be shown to satisfy the Dyson equation

$${\chi }^{\lambda }(\omega )={\chi }_{{\mathrm{KS}}}(\omega )+{\chi }_{{\mathrm{KS}}}(\omega )\left[\lambda {v}_{{\mathrm{c}}}+{f}_{{\mathrm{xc}}}^{\lambda }(\omega )\right]{\chi }^{\lambda }(\omega ),$$
(8)

where all quantities are functions of two position variables and the products are shorthand notation for \(fg\equiv \int {\mathrm{d}}{\bf{r}}^{\prime} f({\bf{r}},{\bf{r}}^{\prime} )g({\bf{r}}^{\prime} ,{\bf{r}}^{\prime\prime} )\). \({f}_{{\mathrm{xc}}}(\omega )\) is the temporal Fourier transform of the xc-kernel

$${f}_{{\mathrm{xc}}}({\bf{r}},{\bf{r}}^{\prime} ,t-t^{\prime} )=\frac{\delta {v}_{{\mathrm{xc}}}({\bf{r}},t)}{\delta n({\bf{r}}^{\prime} ,t^{\prime} )}$$
(9)

and any approximation to \({f}_{\text{xc}}\) thus implies an approximation for the ground state correlation energy in the framework of the adiabatic connection combined with the fluctuation dissipation theorem. In the context of TDDFT, the RPA is simply obtained by neglecting the xc-kernel when solving Eq. (8).

In order to calculate correlation energies from Eqs. (7) and (8), it is necessary to generalize the kernel to an arbitrary coupling strength \(\lambda\). In ref., 25 it was shown that \({f}_{{\mathrm{xc}}}^{\lambda }\) can be obtained from \({f}_{{\mathrm{xc}}}\) by the rescaling

$${f}_{{\mathrm{xc}}}^{\lambda }(n,q,\omega )={\lambda }^{-1}{f}_{{\mathrm{xc}}}(n/{\lambda }^{3},q/\lambda ,\omega /{\lambda }^{2}).$$
(10)

In particular, it is straightforward to show that any bare exchange kernel satisfies \({f}_{{\mathrm{x}}}^{\lambda }=\lambda {f}_{{\mathrm{x}}}\).

RPA renormalization

While the Dyson Eq. (8) provides an exact representation of \(\chi\) for a given kernel, the solution of the equation may exhibit pathological behavior related to electronic instabilities.26,50,51 The simplest examples are for the low-density HEG and stretched diatomics, where Colonna et al.51 demonstrated that the exact-exchange kernel leads to a divergence in the density–density response function computed via Eq. (8). To avoid this problem, an exact refactorization of Eq. (8) was introduced by Bates and Furche in 2013,14

$${\chi }_{\lambda }(\omega )={\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega )+{\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega ){f}_{{\mathrm{xc}}}^{\lambda }(\omega ){\chi }_{\lambda }(\omega )\ ,$$
(11)
$$={\left[{\chi }_{\lambda ,{\mathrm{RPA}}}^{-1}(\omega )-{f}_{{\mathrm{xc}}}^{\lambda }(\omega )\right]}^{-1}.$$
(12)

The series expansion of \(\chi (\omega )\) in powers of \({\chi }_{{\mathrm{KS}}}(\omega )\) generates an unscreened perturbation theory that is equivalent to Görling–Levy Perturbation theory.52 Since the bare KS orbital energy differences appear in the denominator of the non-interacting response function, the unscreened perturbation series diverges for small gap or metallic systems.53 This divergence can be eliminated by expanding in powers of the RPA response function,

$${\chi }^{{\mathrm{RPA}}}(\omega )={\left[{\chi }_{{\mathrm{KS}}}^{-1}(\omega )-{v}_{{\mathrm{c}}}\right]}^{-1},$$
(13)

which leads to the following series

$${\chi }_{\lambda }(\omega )\approx {\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega )+{\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega ){f}_{{\mathrm{xc}}}^{\lambda }(\omega ){\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega )+\ldots$$
(14)

In addition to eliminating the divergences related to the non-interacting response function, this expansion also eliminates the electronic instabilities resulting from the kernel since inversions are never needed directly involving the xc-kernel.14,51,54

The decomposition in Eq. (11) also naturally leads to a simple partition of the correlation energy into two pieces

$${E}_{{\mathrm{c}}}[{f}_{{\mathrm{xc}}}]={E}_{{\mathrm{c}}}^{{\mathrm{RPA}}}+\Delta {E}_{{\mathrm{c}}}^{{\mathrm{bRPA}}}[{f}_{{\mathrm{xc}}}].$$
(15)

The beyond-RPA (bRPA) piece incorporates all of the terms in Eq. (14) beyond the “bare” RPA response function, which can be collected and exactly expressed as54

$$\Delta {E}_{{\mathrm{c}}}^{{\mathrm{bRPA}}}=-{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda {\int\nolimits_{0}^{\infty}}\frac{{\mathrm{d}}\omega }{2\pi }\left\langle \left\langle {v}_{{\mathrm{c}}}{\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega ){f}_{{\mathrm{xc}}}^{\lambda }(\omega ){\chi }_{\lambda }(\omega )\right\rangle \right\rangle .$$
(16)

By truncating \({\chi }_{\lambda }\) to a low order in \({\chi }_{\lambda }^{{\mathrm{RPA}}}\), one hopes to faithfully reproduce the infinite-order correlation energy while avoiding the need to invert a function that directly contains \({f}_{{\mathrm{xc}}}\). We stress that this approximation scheme can never exceed the accuracy of the infinite-order approach for energy differences and material properties, but it does guarantee the stability of the scheme to compute the correlation energy.15 Furthermore, this division naturally separates the long-range and short-ranged contributions to the correlation energy, enabling approximations for \(\Delta {E}_{{\mathrm{c}}}^{{\mathrm{bRPA}}}\) to be added directly on top of the already robust RPA.

The first-order approximation derived from RPA renormalization, RPAr1, recovers a significant part (\(\sim\)90%) of the total bRPA correlation energy for a given kernel15,54

$$\Delta {E}_{{\mathrm{c}}}^{{\mathrm{RPAr1}}}=-{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda {\int\nolimits_{0}^{\infty}}\frac{{\mathrm{d}}\omega }{2\pi }\left\langle \left\langle {v}_{{\mathrm{c}}}{\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega ){f}_{{\mathrm{xc}}}^{\lambda }(\omega ){\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega )\right\rangle \right\rangle .$$
(17)

This approximation has several key features: it recovers the exact, second-order correlation energy given the exact-exchange kernel, the coupling strength integral can be performed analytically for exchange-like kernels leading to efficient implementations,14,15,55 and it reduces properly to RPA for stretched bonds unlike other second-order schemes, such as SOSEX.56 In fact, within the adiabatic-connection framework, SOSEX can be obtained directly from RPAr114,15,54 through the replacement of one \({\chi }^{{\mathrm{RPA}}}\) with \({\chi }_{{\mathrm{KS}}}\) in Eq. (17)

$$\Delta {E}_{{\mathrm{c}}}^{{\mathrm{ACSOSEX}}}=-{\int\nolimits_{0}^{1}}{\mathrm{d}}\lambda {\int\nolimits_{0}^{\infty}}\frac{{\mathrm{d}}\omega }{2\pi }\left\langle \left\langle {v}_{{\mathrm{c}}}{\chi }_{\lambda }^{{\mathrm{RPA}}}(\omega ){f}_{{\mathrm{xc}}}^{\lambda }(\omega ){\chi }_{{\mathrm{KS}}}(\omega )\right\rangle \right\rangle .$$
(18)

This approximation was shown to be less consistent than RPAr1 due to the reintroduction of the KS response function for molecular energy differences14 and structural properties of simple solids.15

To recover the remaining \(\sim\)10% of the bRPA correlation energy, corrections beyond RPAr1 to the response function can be systematically added order-by-order until convergence to Eq. (8). Rather than compute these terms exactly, a simple approximation can be introduced to eliminate the coupling-strength integration and utilize information from second order to estimate third- and higher-order terms in the RPA renormalized expansion. This approximation method was termed the Higher-Order Terms (HOT) approximation57 and is obtained through a rescaling of the second-order RPAr correction at \(\lambda =1\). The HOT approximation usually reproduces the total correlation energy to within 1–2%, and, consequently, accurately reproduces the performance of a given kernel for chemical or physical properties of molecules and materials.

xc-kernels from the HEG

Although Eq. (9) provides a definition of the kernel \({f}_{{\rm{xc}}}\), the absence of an exact expression for the xc potential \({v}_{{\rm{xc}}}\) requires that an approximate form of \({f}_{{\rm{xc}}}\) must be used in practical calculations. The HEG provides a valuable testing ground for approximations of \({f}_{{\rm{xc}}}\) and allows the kernel’s limiting behavior to be studied. The analog of Eq. (8) for the HEG is

$${\chi }^{{\rm{HEG}}}(q,\omega )={\chi }_{0}(q,\omega )+{\chi }_{0}(q,\omega )\left[{v}_{{\rm{c}}}(q)+{f}_{{\rm{xc}}}^{{\rm{HEG}}}(q,\omega )\right]{\chi }^{{\rm{HEG}}}(q,\omega ),$$

where the dependence on the wavevector \(q\) has now been made explicit. \({\chi }_{0}\) is the textbook Lindhard function,58 which coincides with \({\chi }_{{\rm{KS}}}\) when Eq. (8) is applied to the HEG.59 A quantity commonly found in the HEG literature is the local field factor \(G(q,\omega )\), which is closely related to \({f}_{{\rm{xc}}}^{{\rm{HEG}}}\) as \({f}_{{\rm{xc}}}^{{\rm{HEG}}}(q,\omega )=-{v}_{{\rm{c}}}(q)G(q,\omega )\).

Theoretical work on \(G\) (and thus \({f}_{{\rm{xc}}}^{{\rm{HEG}}}\)) can be traced at least as far back as Hubbard (see Sec. IIIC of ref. 60 for a review), and exact limits have been derived for a number of cases. First, the long wavelength and static limit (\(q\to 0\), \(\omega =0\)) actually corresponds to the ALDA commonly employed in TDDFT,

$${f}_{{\rm{xc}}}^{{\rm{HEG}}}(q\to 0,\omega =0)={f}_{{\rm{xc}}}^{{\rm{ALDA}}}\equiv -\frac{4\pi A}{{k}_{{\rm{F}}}^{2}}$$
(19)

where

$$A=\frac{1}{4}-\frac{{k}_{{\rm{F}}}^{2}}{4\pi }\frac{{d}^{2}(n{E}_{{\rm{C}}})}{d{n}^{2}},$$
(20)

\({k}_{{\rm{F}}}={(3{\pi }^{2}n)}^{1/3}\) is the Fermi wavevector for the HEG of density \(n\), and \({E}_{{\rm{C}}}\) is the correlation energy per electron. The two terms in Eq. (20) correspond to exchange and correlation contributions. Eq. (19) is intuitive in stating that the ALDA should be exact in describing the HEG response to a uniform, static perturbation61 and is more formally derived from the compressibility sum rule.60 Next, the short wavelength and static limit has the form62,63

$${f}_{{\rm{xc}}}^{{\rm{HEG}}}(q\to \infty ,\omega =0)=-\frac{4\pi B}{{q}^{2}}-\frac{4\pi C}{{k}_{{\rm{F}}}^{2}},$$
(21)

while the long wavelength, high frequency limit has the form64

$${f}_{{\rm{xc}}}^{{\rm{HEG}}}(q=0,\omega \to \infty )=-\frac{4\pi D}{{k}_{{\rm{F}}}^{2}}.$$
(22)

The parameters \(A\), \(B\), \(C\), and \(D\) depend on the HEG density, which in turn can be written in terms of the Fermi wavevector or Wigner radius \({r}_{s}={(3/4\pi n)}^{1/3}\). Practically, \(A\), \(C\), and \(D\) can be obtained from a parameterization of the correlation energy \({E}_{{\rm{C}}}\) for a HEG of density \(n\), whereas \(B\) requires additional knowledge of the momentum distribution of the HEG.65

For intermediate \(q\) values, it is necessary to turn to diffusion Monte Carlo calculations. The study of ref. 66 investigated the \(q\)-dependence of the static kernel \({f}_{{\rm{xc}}}^{{\rm{HEG}}}(q,\omega =0)\) for a range of densities. A key conclusion of that work was that for wavevectors \(q\le 2{k}_{{\rm{F}}}\), \({f}_{{\rm{xc}}}^{{\rm{HEG}}}(q,\omega =0)\) remains close to its \(q=0\) value, (=\({f}_{{\rm{xc}}}^{{\rm{ALDA}}}\), Eq. (19)), while for \(q\ >\ 2{k}_{{\rm{F}}}\), the kernel can be reasonably well described by the short wavelength limit (Eq. (21)).

In the context of approximations to \({f}_{{\rm{xc}}}^{{\rm{HEG}}}\), it is worth stressing a point discussed in ref.: 59 there is no particular reason why approximate, frequency-independent kernels should display the same limiting behavior as the exact, frequency-dependent kernel evaluated at \(\omega =0\). Indeed, having a frequency-independent kernel, which is finite at large \(q\) (obeying Eq. (21)), will in fact lead to a pair-distribution function, which is singular at the origin.26,67,68

In Fig. 1, we show some approximate forms for \({f}_{{\rm{xc}}}^{{\rm{HEG}}}\), which have been proposed in the literature.27,69,70,71 The “rALDA” kernel will be discussed in some detail in the following sections. Briefly describing the other kernels, “CDOP” refers to the frequency-independent kernel proposed by Corradini, Del Sole, Onida, and Palummo,69 which has the same limiting behavior as the exact static kernel (Eqs. (19) and (21)). “CDOPs” refers to the kernel introduced in ref., 70 which modifies CDOP so that it vanishes at large \(q\). “CPd” refers to the dynamical kernel proposed by Constantin and Pitarke,71 which satisfies the long wavelength static and high frequency limits (Eqs. (19) and (22)). The frequency-independent “CP” kernel corresponds to the CPd kernel at \(\omega =0\).

Fig. 1
figure 1

Various approximations for the exchange-correlation kernels applied to the homogeneous electron gas as a function of wavevector, evaluated at a density corresponding to \({r}_{s}=2.0\) Bohr radii. The dynamical CPd kernel was evaluated at \(\omega\) = 2 Hartrees and the JGMs kernel was evaluated at a band gap of 3.4 eV. The trivial RPA case (\({f}_{xc}^{{\rm{HEG}}}=0\)) is also shown. Reprinted from ref., 77 with the permission of AIP Publishing

Although the HEG carries the advantage of being a very well-studied system, it is worth remembering that fundamentally it is metallic. The xc-kernel of a periodic insulator is known to display different limiting behavior to that of a metal, diverging as \(1/{q}^{2}\) in the \(q\to 0\) limit.72,73 This aspect is especially important in TDDFT calculations of optical spectra, including excitonic effects.74,75,76 This consideration led to the development of the frequency-independent jellium-with-gap model (JGM) kernel, which has the \(1/{q}^{2}\) divergence.74 The slightly simpler “JGMs” kernel shown in Fig. 1 is described in ref. 77 Here the band gap \({E}_{{\rm{g}}}\) enters parametrically. In the limit \({E}_{{\rm{g}}}\to \infty\), the correlation energy disappears, while \({E}_{{\rm{g}}}\to 0\) the metallic CP kernel is recovered.

Reference 77 provides a more thorough discussion of all of the kernels shown in Fig. 1, including the expressions used to evaluate them and their forms in real space. In the current work, we focus our attention on the renormalized adiabatic kernels rALDA and rAPBE, although a comparison of all the different xc-kernels for the structural parameters of solids will be presented in the section “Structural parameters.” It should also be emphasized that we will focus on the performance for total energy calculations in the following. The quality of commonly applied xc-kernels for excitations of the HEG have been studied by Tatarczyk et al.78

The rALDA kernel

Ideally, one should aim at obtaining a general approximation to \({f}_{{\rm{xc}}}\) that can reproduce various physical quantities such as optical absorption spectra and ground state electronic correlation energies. However, finding good approximations for \({f}_{{\rm{xc}}}\) is highly challenging and it is often necessary to limit the approximation to a given application. As mentioned previously, it is crucial to use a form of \({f}_{{\rm{xc}}}\) that has the correct \(1/{q}^{2}\) behavior in the long wavelength limit in order to capture excitonic effects in absorption spectra. On the other hand, ground state correlation energies involve \(q\)-space integrals making it extremely important to obtain a good approximation at large values of \(q\), whereas the long wavelength limit is less important. In the following, we will focus on obtaining an approximation that provides accurate ground state correlation energies.

The correlation energy per electron is directly related to the integral of the coupling constant averaged correlation hole \({\bar{n}}_{{\rm{c}}}(r)\)25

$${E}_{{\rm{c}}}=2\pi {\int\nolimits_{0}^{\infty}}{\rm{d}}rr{\bar{n}}_{{\rm{c}}}(r)=\frac{1}{\pi }{\int\nolimits_{0}^{\infty}}{\rm{d}}q{\bar{n}}_{{\rm{c}}}(q).$$
(23)

where \({\bar{n}}_{{\rm{c}}}(q)\) is the Fourier transform of \({\bar{n}}_{{\rm{c}}}(r)\). A parametrization of the exact \({n}_{{\rm{c}}}(q)\) has been provided by Perdew and Wang79 based on quantum Monte Carlo simulations of the HEG at various densities. Approximations for \({n}_{{\rm{c}}}(q)\) can be obtained from Eqs. (7) and (8) using the Lindhard function for \({\chi }_{0}(q,\omega )\).

The correlation hole in \(q\)-space is shown in Fig. 2 calculated with RPA and ALDA for two different densities. Compared to the exact parametrization, it is clear that RPA severely overestimates the magnitude of the correlation hole and the RPA will predict a correlation energy that is 0.5 eV too low per electron for a wide range of densities. The ALDA on the other hand straddles the exact parametrization for a wide range of \(q\)-values but decays too slowly at large \(q\) compared to the exact results. This is a consequence of the locality of the approximation, which translates into an independence of \(q\). At large \(q\), the xc-kernel will thus dominate the Coulomb kernel and fail to reproduce the exact limit (Eq. (21)). Since the total energy involves a \(q\)-space integral over all space, the slow decay of the correlation hole introduces significant errors and overestimates the correlation energy by \(\sim\hskip -2pt0.3\) eV per electron.

Fig. 2
figure 2

Correlation hole of the homogeneous electron gas in \(q\)-space at \({r}_{s}=1\) (left) and \({r}_{s}=10\) (right). Black lines are the exact results, blue lines are RPA, and green lines are ALDA. Reproduced with permission from ref., 27 copyright American Physical Society, 2012

The ALDA\({}_{{\rm{x}}}\) kernel provides a good approximation to the exact one for both low \({r}_{s}=1\) and high \({r}_{s}=10\) densities for \(q\ <\ 2{k}_{{\rm{F}}}\), where the correlation hole has a zero point in \(q\)-space. However, for \(q\ >\ 2{k}_{{\rm{F}}}\) the exact correlation hole largely vanishes and we expect to obtain a better approximation for the correlation energy if we simply truncate the \(q\)-integration at \(2{k}_{{\rm{F}}}\) when evaluating Eq. (23) using the ALDA\({}_{{\rm{x}}}\) approximation. We will refer to this scheme as renormalized ALDA\({}_{{\rm{x}}}\) (rALDA), since the truncation preserves the integral of the correlation hole in real space. The correlation energy per electron evaluated in this scheme is shown in Fig. 3 obtained with RPA, ALDA\({}_{{\rm{x}}}\), and rALDA. Evidently, the errors in the correlation energy obtained with rALDA are much smaller than both RPA and ALDA\({}_{{\rm{x}}}\). A similar analysis was carried out in ref. 77 for all the kernels shown in Fig. 1. All the kernels performed significantly better than RPA, but none was found to be as accurate as the rALDA.

Fig. 3
figure 3

Correlation energy per electron of the homogeneous electron gas evaluated with RPA, ALDA\({}_{{\rm{x}}}\), and rALDA\({}_{{\rm{x}}}\). Reproduced with permission from ref., 27 copyright American Physical Society, 2012

For the HEG, the truncation is equivalent to using the Hxc-kernel

$${f}_{{\rm{Hxc}}}^{{\rm{rALDA}}}[n](q)=\theta \left(2{k}_{{\rm{F}}}-q\right){f}_{{\rm{Hx}}}^{{\rm{ALDA}}}[n].$$
(24)

Fourier transforming this expression yields

$$\begin{array}{rcl}{f}_{{\rm{Hxc}}}^{{\rm{rALDA}}}[n](r)&=&{\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}[n](r)+{v}^{r}[n](r),\\ {\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}[n](r)&=&\frac{{f}_{{\rm{x}}}^{{\rm{ALDA}}}[n]}{2{\pi }^{2}{r}^{3}}\left[\sin (2{k}_{{\rm{F}}}r)-2{k}_{{\rm{F}}}r\cos (2{k}_{{\rm{F}}}r)\right],\\ {v}^{r}[n](r)&=&\frac{1}{r}\frac{2}{\pi }{\int\nolimits_{0}^{2{k}_{{\rm{F}}}r}}\frac{\sin x}{x}{\rm{d}}x.\end{array}$$
(25)

Since \({k}_{{\rm{F}}}\) is related to the density, one can attempt to generalize this scheme to inhomogeneous systems. We then take \(r\to | {\bf{r}}-{\bf{r}}^{\prime} |\) and \({k}_{{\rm{F}}}\to {(3{\pi }^{2}n({\bf{r}},{\bf{r}}^{\prime} ))}^{1/3}\), but there is no unique way to define the two-point density \(n({\bf{r}},{\bf{r}}^{\prime} )\). A natural choice is to take25\(n({\bf{r}},{\bf{r}}^{\prime} )=(n({\bf{r}})+n({\bf{r}}^{\prime} ))/2\), but other choices are possible, which will be discussed in section “Implementation.”

The simple truncation procedure has thus led to a non-local rALDA kernel that does not contain any free parameters and significantly improves correlation energies for homogeneous systems. We have split the Hxc-kernel into a renormalized xc part \({f}_{{\rm{xc}}}^{{\rm{rALDA}}}[n]\) and a renormalized Coulomb part \({v}^{r}[n]\). However, both terms depend on the density and contain xc effects. The \({\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}\) part can be regarded as an ALDA\({}_{{\rm{x}}}\) kernel where the delta function has acquired a density-dependent broadening, whereas \({v}^{r}\) is the Coulomb interaction reduced by a density and distance-dependent factor that approaches unity for large densities or distances. In fact, at large separation \({f}_{{\rm{Hxc}}}^{{\rm{rALDA}}}\) reduces to the pure Coulomb kernel and it is expected to retain the accurate description of long-range vdW interactions within RPA. For example, in a jellium with \({r}_{s}=2.0\) two points separated by 5 Å gives a renormalized Coulomb interaction \({v}^{r}[{r}_{s}=2](r)=0.97v(r)\) and the magnitude is a factor of 30 times larger than \({\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}\). Interestingly, both \({v}^{r}\) and \({\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}\) becomes finite at the origin giving

$${v}^{r}[n](r\to 0)=\frac{4{k}_{{\rm{F}}}}{\pi }-\frac{8{k}_{{\rm{F}}}^{3}{r}^{2}}{9\pi },$$
(26)
$${\widetilde{f}}_{{\rm{xc}}}^{{\rm{rALDA}}}[n](r\to 0)=\left[\frac{4{k}_{{\rm{F}}}^{3}}{3{\pi }^{2}}-\frac{32{k}_{{\rm{F}}}^{5}{r}^{2}}{15{\pi }^{2}}\right]{f}_{{\rm{x}}}^{{\rm{ALDA}}}[n],$$
(27)

which implies \({f}_{{\rm{Hxc}}}^{{\rm{rALDA}}}[n](r=0)=0\). This property is related to the fact that the position-weighted correlation hole entering the first integral in Eq. (23) vanishes at the origin28 and is highly convenient for numerical real-space evaluation of the kernel.

It is often more convenient to separate the Hxc-kernel into the exact Coulomb kernel and an xc-kernel and one is then led to define

$${f}_{{\rm{x}}}^{{\rm{rALDA}}}[n](r)={\widetilde{f}}_{{\rm{x}}}^{{\rm{rALDA}}}[n](r)+{v}^{r}[n](r)-v(r).$$
(28)

This expression is typically more useful for applications to periodic systems since \({v}^{r}[n](r)-v(r)\) is short ranged (\([{v}^{r}[n](r)-v(r)]\to \sin (2{k}_{{\rm{F}}}r)/r\) for \(r\to \infty\)), whereas both \({v}^{r}[n](r)\) and \(v(r)\) are long ranged.

Generalized truncation scheme

The truncation scheme defined above is easily generalized to any adiabatic semilocal local kernel. The correlation hole of the HEG calculated with ALDA\({}_{{\rm{x}}}\) has a zero point exactly at \(2{k}_{{\rm{F}}}\). This is not true in general, but for any adiabatic kernel the correlation hole becomes zero at the point where the Hxc-kernel vanishes. This leads to the zero-point wavevector

$${q}_{0}[n]=\sqrt{\frac{-4\pi }{{f}_{{\rm{xc}}}^{{\rm{A}}}[n]}},$$
(29)

where \({f}_{{\rm{xc}}}^{{\rm{A}}}[n]\) is the spatial Fourier transform of the adiabatic kernel

$${f}_{{\rm{xc}}}^{{\rm{A}}}[n]({\bf{r}}-{\bf{r}}^{\prime} )=\frac{\delta {v}_{{\rm{xc}}}({\bf{r}})}{\delta n({\bf{r}}^{\prime} )}\delta ({\bf{r}}-{\bf{r}}^{\prime} ),$$
(30)

corresponding to a particular semilocal approximation. Renormalized kernels for any semilocal approximation for the xc functional can then be defined by replacing to \(2{k}_{{\rm{F}}}\) by \({q}_{0}\) in Eq. (25) and the kernel is again generalized to inhomogeneous systems by taking \(r\to | {\bf{r}}-{\bf{r}}^{\prime} |\) and \(n\to n({\bf{r}},{\bf{r}}^{\prime} )\) in addition to a scheme that defines \(n({\bf{r}},{\bf{r}}^{\prime} )\) in terms of \(n({\bf{r}})\). For the generalized gradient-corrected functionals, \({q}_{0}\) will depend on the gradient of the density as well, which may lead to positive values of \({f}_{{\rm{xc}}}^{A}\) at certain points. At those points, we set \({q}_{0}[n]({\bf{r}},{\bf{r}}^{\prime} )=0\) in order to maintain a well-defined kernel. Below we will only consider rALDA and rAPBE.

Spin

The inclusion of spin degrees of freedom in RPA is almost trivial since the correlation energy involves the sum over all spin components \({\chi }_{\sigma \sigma ^{\prime} }\), which is obtained by the simple substitution \({\chi }^{0}\to {\chi }_{\uparrow \uparrow }^{0}+{\chi }_{\downarrow \downarrow }^{0}\) in Eq. (8). This is due to the fact that \({F}_{{\rm{Hxc}}}\) is independent of spin in RPA. In general, however, it is not straightforward to generalize a kernel for spin-paired systems to the spin-polarized case.

In the case of exchange, one can resort to the spin dependence of the exchange energy. In particular, one has

$${E}_{{\rm{x}}}[{n}_{\uparrow },{n}_{\downarrow }]=\frac{{E}_{{\rm{x}}}[2{n}_{\uparrow }]+{E}_{{\rm{x}}}[2{n}_{\downarrow }]}{2},$$
(31)

which yields

$${f}_{{\rm{x}},\sigma \sigma ^{\prime} }[{n}_{\uparrow },{n}_{\downarrow }]=2{f}_{{\rm{x}}}[2{n}_{\sigma }]{\delta }_{\sigma \sigma ^{\prime} }.$$
(32)

It is possible to enforce this condition on the rALDA kernel as well, but we have found that it makes the correlation energy difficult to converge. The reason is that the off-diagonal (in spin) components of the Hxc-kernel involves a bare Coulomb interaction, whereas the diagonal components lack a long-range cancellation between \(v(r)\) and \({v}^{r}[n]\).

This failure is clearly a limitation of the rALDA scheme and an additional approximation is required to maintain the accuracy of rALDA for spin-polarized systems. To this end, we start with the Dyson equation with explicit spin dependence

$${\chi }_{\sigma \sigma ^{\prime} }={\chi }_{\sigma }^{{\rm{KS}}}{\delta }_{\sigma \sigma ^{\prime} }+\sum _{\sigma ^{\prime\prime} }{\chi }_{\sigma }^{{\rm{KS}}}{f}_{\sigma \sigma ^{\prime\prime} }^{{\rm{Hxc}}}[{n}_{\uparrow },{n}_{\downarrow }]{\chi }_{\sigma ^{\prime\prime} \sigma ^{\prime} },$$
(33)

where it was used that \({\chi }^{KS}\) is diagonal in spin. For the spin-paired case, one has that

$$\frac{1}{4}\sum _{\sigma \sigma ^{\prime} }{f}_{\sigma \sigma ^{\prime} }^{{\rm{xc}}}[n/2,n/2]={f}_{{\rm{x}}}[n],$$
(34)

which will always hold if Eq. (32) is satisfied. To reintroduce a balanced expression for the renormalized kernel in each spin component, we relax Eq. (32) and use

$${f}_{{\rm{xc}},\sigma \sigma ^{\prime} }^{{\rm{rALDA}}}[{n}_{\uparrow },{n}_{\downarrow }]=2{f}_{{\rm{xc}}}^{{\rm{rALDA}}}[n]{\delta }_{\sigma \sigma ^{\prime} }+{v}^{r}[n]-v,$$
(35)

where \(n={n}_{\sigma }+{n}_{\sigma ^{\prime} }\). Eq. (34) is now satisfied, but Eq. (32) is not. This choice is not unique though and another choice is comprised by \({f}_{{\rm{x}},\sigma \sigma ^{\prime} }^{{\rm{rALDA}}}\ =\ 2{f}_{{\rm{x}}}^{{\rm{rALDA}}}[{n}_{\sigma }+{n}_{\sigma ^{\prime} }]{\delta }_{\sigma \sigma ^{\prime} }+{v}^{r}[{n}_{\sigma }+{n}_{\sigma ^{\prime} }]-v\), which was used in ref. 27 However, Eq. (35) appears to yield better results for atomization energies where spin-polarized isolated atoms are used as a reference.

Hedin’s equations and vertex corrections

So far, we have discussed the use of xc-kernels in the context of the ACFDT formula for the ground state correlation energy. However, it is possible, and in fact quite effective, to apply the same xc-kernels to describe the effect of vertex corrections in the electron self-energy. In 1965, Lars Hedin introduced a set of coupled equations relating the single-particle Green’s function \(G\), the electron self-energy, \({\sum}\), to the polarization, \(P\), the screened electron–electron interaction, \(W\), and the three-point vertex function \(\Gamma\),80

$$G(1,2)={G}_{{\rm{H}}}(12)+\int {\rm{d}}(34){G}_{{\rm{H}}}(1,3){\sum} (3,4)G(4,2)$$
(36)
$${\sum} (1,2)=i\int {\rm{d}}(34)G(1,3)\Gamma (3,2,4)W(4,{1}^{+})$$
(37)
$$W(1,2)=v(1,2)+\int {\rm{d}}(34)W(1,3)P(3,4)v(4,2)$$
(38)
$$P(1,2)=-i\int {\rm{d}}(3,4)G(1,3)G(4,{1}^{+})\Gamma (3,4,2)$$
(39)
$$\begin{array}{rcl}\Gamma (1,2,3)&=&\delta (1,2)\delta (1,3)\\ &+&\int {\rm{d}}(4567)\frac{\delta \Sigma (1,2)}{\delta G(4,5)}G(4,6)G(7,5)\Gamma (6,7,3),\end{array}$$
(40)

where we employed the notation \((1)=({{\bf{r}}}_{1},{t}_{1},{\sigma }_{1})\) and \({G}_{{\rm{H}}}\) is the Hartree Green’s function. The well-known and widely used GW approximation is obtained by iterating Hedin’s equations once starting from \(\sum =0\), i.e., the Hartree approximation. This produces the trivial vertex function \(\Gamma =\delta (1,2)\delta (1,3)\), which corresponds to the time-dependent Hartree approximation for the polarization \(P\), which is the approximation referred to as RPA in the present review. There are basically two issues with this approach. First of all, it starts from \({G}_{{\rm{H}}}\), which is known to be a poor approximation. Second, it neglects vertex corrections completely. In practice, the latter issue is rarely dealt with because of the complex nature of \(\Gamma\), while the first is overcome by following a “best \(G\), best \(W\)” philosophy.31 Within the popular G\({}_{0}\)W\({}_{0}\) method, one evaluates the self-energy from a non-interacting \({G}_{0}\) obtained from a DFT calculation while \(W\) is obtained within RPA using the polarization \({P}_{0}=-i{G}_{0}{G}_{0}\). Today, the G\({}_{0}\)W\({}_{0}\) method remains the state-of-the-art for calculation of QP band structures of inorganic solids33,34,35 and nano-structures, including two-dimensional materials.81,82,83

Another question related to Hedin’s equation is the role of self-consistency. In principle, the five equations should be solved self-consistently. However, while self-consistency improves the description of energy levels in molecules84,85 and is essential for systems out of equilibrium,86,87,88,89 it does not in general improve the band structure and spectral functions of solids when vertex corrections are neglected.90,91 The role of self-consistency will not be further discussed in this review where we instead concentrate on the problems of vertex corrections.

Rather than starting the iterative solution of Hedin’s equations with \(\sum =0\) (which leads to the GW approximation at first iteration), it is of course possible to start with a local approximation, \({\Sigma }^{0}(1,2)=\delta (1,2){v}_{{\rm{xc}}}(1)\). As shown by Del Sole et al.,38 this leads to a self-energy of the form

$$\sum (1,2)=iG(1,2)\widetilde{W}(1,2),$$
(41)

where

$$\widetilde{W}=v{[1-{P}_{0}(v+{f}_{{\rm{xc}}})]}^{-1}$$
(42)

and

$${f}_{{\rm{xc}}}(1,2)=\delta {v}_{{\rm{xc}}}(1)/\delta n(2)$$
(43)

is the adiabatic xc-kernel. By inspection, it becomes clear that \(\widetilde{W}(1,2)\) is the screened effective potential generated at point 2 by a charge at point 1. This potential consists of the bare Coulomb potential plus the induced Hartree and xc-potential. Consequently, it represents the potential felt by an electron. In contrast, the potential felt by a classical test charge is the bare potential screened only by the induced Hartree potential:

$$W=v+v{[1-{P}_{0}(v+{f}_{{\rm{xc}}})]}^{-1}{P}_{0}v$$
(44)

In Eq. (41), the replacement of \(\widetilde{W}\) by \(W\) thus corresponds to including the vertex in the polarizability, \(P\), but neglecting it in the self-energy. In the following, we refer to these two alternative schemes as G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\) and G\({}_{0}\)W\({}_{0}\)P\({}_{0}\), respectively. As usual, the subscripts indicate that the quantities are evaluated non-self-consistently starting from DFT. In addition to the fact that the vertex correction accounts for the change in the xc-potential and therefore should be more accurate, an attractive feature of the G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\) scheme is that DFT becomes the consistent starting point for non-self-consistent calculations when the relation (43) is satisfied. This is in stark contrast to G\({}_{0}\)W\({}_{0}\) for which the Hartree approximation is the consistent starting point.

In section “Vertex corrected quasiparticle energies,” we show that when the rALDA xc-kernel is used to include vertex corrections through Eq. (42) the improved description of short-range correlations lead to a significant upshift of QP energies by 0.3–0.5 eV in agreement with experiments.30 Since both occupied and unoccupied states are shifted up, the band gap is not affected as much but a small increase is generally observed again in agreement with experiments.

Implementation

Evaluating non-local kernels for inhomogeneous densities

Kernels like the rALDA, which were derived from the HEG (a uniform system), have the form \({f}_{{\rm{xc}}}^{m}[n](q,\omega )\) in reciprocal space or \({f}_{{\rm{xc}}}^{m}[n](| {\bf{r}}-{\bf{r}}^{\prime} | ,\omega )\) in real space. As mentioned in section “Generalized truncation scheme”, this nonlocality \(| {\bf{r}}-{\bf{r}}^{\prime} |\) leads to a question regarding the treatment of the density argument when calculating the correlation energy of inhomogeneous systems [\(n=n({\bf{r}})\)]. To illustrate this point more clearly, we consider the plane wave representation of the kernel,

$${f}_{{\rm{xc}}}^{{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )=\frac{1}{V}{\int\nolimits_{V}}{\rm{d}}{\bf{r}}{\int\nolimits_{V}}{\rm{d}}{\bf{r}}^{\prime} {e}^{-i({\bf{q}}+{\bf{G}})\cdot {\bf{r}}}{f}_{{\rm{xc}}}({\bf{r}},{\bf{r}}^{\prime} ,\omega ){e}^{i({\bf{q}}+{\bf{G}}^{\prime} )\cdot {\bf{r}}^{\prime} }.$$

Here \(V\) is the volume of the crystal, \({\bf{G}}\) and \({\bf{G}}^{\prime}\) are reciprocal lattice vectors, and \({\bf{q}}\) lies within the first Brillouin zone. In the case that the system under investigation is homogeneous [\(n({\bf{r}})={n}_{0}\)], then the kernel is diagonal,

$${f}_{{\rm{xc}}}^{{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )={\delta }_{{\bf{G}}{\bf{G}}^{\prime} }{f}_{{\rm{xc}}}^{m}[{n}_{0}](| {\bf{q}}+{\bf{G}}| ,\omega )$$
(45)

On the other hand, if the kernel is fully local (independent of \(q\), e.g., the ALDA) it is natural to use the local density to evaluate the kernel, obtaining

$${f}_{{\rm{xc}}}^{{\bf{G}}{\bf{G}}^{\prime} }(\omega )=\frac{1}{\Omega }{\int\nolimits_{\Omega }}{\rm{d}}{\bf{r}}{e}^{-i({\bf{G}}-{\bf{G}}^{\prime} )\cdot {\bf{r}}}{f}_{{\rm{xc}}}^{m}[n({\bf{r}})](\omega )$$
(46)

where \(\Omega\) is the unit cell volume. However, for the general case of a inhomogeneous system and non-local kernel there is no unique way of constructing \({f}_{{\rm{xc}}}[n]({\bf{r}},{\bf{r}}^{\prime} )\) from the knowledge of \({f}_{{\rm{xc}}}[n]({\bf{r}})\) in a homogeneous system. The problem is that the density is a one-point function and it is not clear how to treat the dependence of \({\bf{r}}\) and \({\bf{r}}^{\prime}\). One important constraint is that the resulting kernel must be symmetric in \({\bf{r}}\) and \({\bf{r}}^{\prime}\),92 and in the following we assume the form

$${f}_{{\rm{xc}}}({\bf{r}},{\bf{r}}^{\prime} )\to {f}_{{\rm{xc}}}^{m}({\mathcal{S}}[n],| {\bf{r}}-{\bf{r}}^{\prime} | ),$$
(47)

where \(\mathcal{S}\) is a functional of the density symmetric in \({\bf{r}}\) and \({\bf{r}}^{\prime}\) and we have restricted ourselves to frequency-independent kernels.

Before continuing, we note that if a particular problem is characterized by certain symmetries it may be advantageous to use a basis that conforms with the symmetry instead of using plane waves. For example, in ref. 93 the correlation energy of spherical clusters was studied using an expansion of the response function in terms of radial functions and spherical harmonics. Such decomposition may significantly reduce the computational cost.

Density symmetrization

The density symmetrization scheme used in the rALDA/rAPBE calculations in refs 27,29,94 employed a two-point average,

$${\cal{S}}[n]=[n({\bf{r}})+n({\bf{r}}^{\prime} )]/2,$$
(48)

but more elaborate functionals are possible.95,96 A kernel satisfying Eq. (47) with a general two-point density is only invariant under simultaneous lattice translation in \({\bf{r}}\) and \({\bf{r}}^{\prime}\). Its plane wave representation can then be written in the form

$${f}_{{\rm{xc}}}^{{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}})=\frac{1}{\Omega }{\int\nolimits_{\Omega}}{\rm{d}}{\bf{r}}{\int\nolimits_{\Omega}}{\rm{d}}{\bf{r}}^{\prime} {e}^{-i{\bf{G}}\cdot {\bf{r}}}f({\bf{q}};{\bf{r}},{\bf{r}}^{\prime} ){e}^{i{\bf{G}}^{\prime} \cdot {\bf{r}}^{\prime} }$$
(49)

where

$$f({\bf{q}};{\bf{r}},{\bf{r}}^{\prime} )=\frac{1}{N}\sum _{i,j}{e}^{i{\bf{q}}\cdot {{\bf{R}}}_{ij}}{e}^{-i{\bf{q}}\cdot ({\bf{r}}-{\bf{r}}^{\prime} )}{f}_{{\rm{x}}}^{{\rm{xc}}}({\bf{r}},{\bf{r}}^{\prime} +{{\bf{R}}}_{ij}).$$
(50)

and \({{\bf{R}}}_{ij}={{\bf{R}}}_{i}-{{\bf{R}}}_{j}\). \(f({\bf{q}};{\bf{r}},{\bf{r}}^{\prime} )\) is periodic in \({\bf{r}}\) and \({\bf{r}}^{\prime}\) and Eq. (49) must be converged by unit cell sampling, which should typically match the \(k\)-point sampling in periodic systems.

Kernel symmetrization

A second approach symmetrizes the kernel itself.70 Starting from a non-symmetric kernel,

$${f}_{{\rm{xc}}}^{{\rm{NS}}}({\bf{r}},{\bf{r}}^{\prime} ,\omega )={f}_{{\rm{xc}}}^{m}[n({\bf{r}})](| {\bf{r}}-{\bf{r}}^{\prime} | ,\omega )$$
(51)

and inserting into Eq. (45) gives

$$\left.{f}_{{\rm{xc}}}^{{\rm{NS}},{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )=\frac{1}{\Omega }{\int\nolimits_{\Omega}}{\rm{d}}{\bf{r}}{e}^{-i({\bf{G}}-{\bf{G}}^{\prime} )\cdot {\bf{r}}}{f}_{{\rm{xc}}}^{m}[n({\bf{r}})]\right)(| {\bf{q}}+{\bf{G}}| ,\omega ).$$
(52)

It is now possible to obtain a symmetric kernel by taking the average \({f}_{{\rm{xc}}}^{{\rm{NS}},{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )\) and its Hermitian conjugate,

$${f}_{{\rm{xc}}}^{{\rm{S}},{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )=\frac{1}{2}\left({f}_{{\rm{xc}}}^{{\rm{NS}},{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )+{[{f}_{{\rm{xc}}}^{{\rm{NS}},{\bf{G}}^{\prime} {\bf{G}}}({\bf{q}},\omega )]}^{* }\right)$$
(53)

which can be seen equivalently as inserting the two-point average \(1/2[{f}_{{\rm{xc}}}^{{\rm{NS}}}({\bf{r}},{\bf{r}}^{\prime} ,\omega )+{f}_{{\rm{xc}}}^{{\rm{NS}}}({\bf{r}}^{\prime} ,{\bf{r}},\omega )]\) into Eq. (45).70 Compared to density symmetrization, Eq. (52) has the advantage that the integral is performed over one unit cell only and that only the density has to be represented on a real space grid, while the kernel is defined by its plane wave representation.

Wavevector symmetrization

The third approach we consider is that employed in ref., 74 which retains the computational advantages of the kernel symmetrization scheme. Here the wavevector \(| {\bf{q}}+{\bf{G}}|\) entering Eq. (52) is replaced by the symmetrized quantity \(\sqrt{| {\bf{q}}+{\bf{G}}| | {\bf{q}}+{\bf{G}}^{\prime} | }\), such that

$${f}_{{\rm{xc}}}^{{\bf{G}}{\bf{G}}^{\prime} }({\bf{q}},\omega )=\frac{1}{\Omega }{\int \nolimits_{\Omega}}{\rm{d}}{\bf{r}}{e}^{-i({\bf{G}}-{\bf{G}}^{\prime} )\cdot {\bf{r}}}{f}_{{\rm{xc}}}^{m}[n({\bf{r}})]\left(\sqrt{| {\bf{q}}+{\bf{G}}| | {\bf{q}}+{\bf{G}}^{\prime} | },\omega\right).$$
(54)

A kernel constructed using Eq. (54) will automatically satisfy the symmetry requirement of Eq. (47). Furthermore, for the specific case of kernels based on the jellium-with-gap model, the head and wings of the matrix in \({\bf{G}}\) and \({\bf{G}}^{\prime}\) have the correct \(1/{q}^{2}\) and \(1/q\) divergences, respectively.74

The main drawback of Eq. (54) is that, compared to a two-point average of the density or the kernel, the procedure of symmetrizing the wavevector is not physically transparent. Of course, two-point schemes also suffer from limitations (e.g., the kernel has no knowledge of the medium between \({\bf{r}}\) and \({\bf{r}}^{\prime}\)). The fact that we have to invoke any averaging system at all is an undesirable consequence of using kernels derived from the HEG to describe inhomogeneous systems. In what follows, we present calculations using both the density and wavevector symmetrization schemes.

Computational details and convergence

The kernels described in previous sections have been implemented in the DFT code GPAW,97,98 which uses the projector augmented wave (PAW) method.99 The calculation of correlation energies in the framework of the ACFDT are performed in four steps. (1) A standard LDA or PBE calculation is carried out in a plane wave basis. (2) The full plane wave KS Hamiltonian is diagonalized to obtain all unoccupied electronic states and eigenvalues. (3) A plane wave cutoff energy is chosen and the KS response function100 is calculated by setting the number of unoccupied bands included in the sum equal to the number of plane waves defined by the cutoff. (4) The correlation energy is evaluated according to Eqs. (7) and (8). The calculated correlation energies are finally added to non-self-consistent Hartree–Fock energies evaluated on the same orbitals as the correlation energy. The coupling constant integration is evaluated using 8 Gauss–Legendre points and the frequency integration is performed with 16 Gauss–Legendre points. For the frequency integration, the Gauss–Legendre points chosen in the interval \(x\in [-1,1]\) are transformed to \(\left[0,\infty \right[\) by the function \(\omega \propto -\mathrm{log}(1/2+x/2)\) in such a way that the integral of \(f(\omega )={\omega }^{1/B-1}\exp (-\alpha {\omega }^{1/B})\) is reproduced exactly.94 \(\alpha\) is then determined by the position of the highest frequency point, which we situate at 800 eV and \(B\) determines the density of frequency points at low energies. For the calculations below, we have typically used \(B=2\) for insulators and \(B=2.5\) for metals.

The main convergence parameter for these calculations is thus the plane wave cutoff energy (\({E}_{{\rm{cut}}}\)) used for the response function and kernel. In the case of RPA calculations, it has been shown that for sufficiently high cutoff energies the correlation energy scales as10,94

$${E}_{{\rm{c}}}({E}_{{\rm{cut}}})={E}^{{\rm{c}}}+\frac{A}{{E}_{{\rm{cut}}}^{3/2}}$$
(55)

and it is thus possible to perform an accurate extrapolation to the converged results from a few calculations at low cutoff energies. When the ACFDT method is used with a kernel, the extrapolation (55) is less accurate, but the calculations often converge much faster than RPA such that extrapolation is either not needed at all or only introduces small errors due to the proximity of the result at the highest cutoff to the converged result. As an example, we show the correlation energy of bulk Na in Fig. 4 calculated with RPA, ALDA, and rALDA. It is expected that the correlation energy should resemble that of a HEG with the average valence density of Na due to the delocalized valence electrons in Na.7 The rALDA calculations are rapidly converged with respect to unit cell sampling (two nearest unit cells are sufficient) and the results are shown in Fig. 4 as a function of plane wave cutoff energy along with the RPA and ALDA results. Similarly to the HEG, we find that RPA significantly underestimates the correlation energy while ALDA\({}_{{\rm{x}}}\) overestimates it. We also note the very slow convergence of the ALDA calculation with respect to plane wave cutoff due to the \(q\)-independent kernel.

Fig. 4
figure 4

Correlation energy of the valence electron in Na evaluated with RPA, ALDA, and rALDA. The dashed lines show the values obtained with the functionals for the homogeneous electron gas using the average valence density of Na. Reproduced with permission from ref., 27 copyright American Physical Society, 2012

In a plane wave representation, the non-local kernels considered in the present work takes the form of Eq. (49). While the response function is calculated within the full PAW framework, it is not trivial to obtain the PAW corrections for a non-local functional. However, since the ALDA kernel vanishes for large densities, the non-local kernels considered in the present work tend to be small in the vicinity of the nuclei where it is usually difficult to represent the density accurately. As a consequence, the kernels are rather smooth—even at the points where the density is non-analytical—and the kernel can be evaluated from the all-electron density (obtained from the PAW method) represented on a uniform real-space grid using Eqs. (49) and (50). This is illustrated in Fig. 5 where the correlation energy of an N\({}_{2}\) molecule is shown as a function of grid spacing. The energy difference (contribution to the atomization energy) converges rapidly and is accurate to within 10 meV at 0.17 Å. For the calculations in the present work, the grid spacing was determined by the plane wave cutoff as \(h=\pi /\sqrt{4{E}_{{\rm{cut}}}}\) and a plane wave cutoff of 600 eV for the initial DFT calculations typically produces a grid spacing, which is \(h \sim 0.16\) Å.

Fig. 5
figure 5

Convergence of the valence electron contribution to the rALDA correlation energy with respect to grid spacing for the atomization energy of N\({}_{2}\). All numbers are relative to their converged values. Reproduced with permission from ref., 28 copyright American Physical Society, 2013

Results

Absolute correlation energies

We have already shown that the RPA underestimates the correlation energy of the HEG by 0.6–0.3 eV/electron, whereas the ALDA overestimates the correlation energy by 0.3 eV/electron compared to the RPA. This is significantly improved by the rALDA functional, which gives an error of <0.05 eV (See Fig. 3). In Table 1, we show that this trend remains true for simple atoms and molecules. For the H atom, the RPA gives a correlation energy of −13 kcal/mol (−0.56 eV), whereas ALDA gives 6 kcal/mol (0.26 eV). rALDA on the other hand gives −2 kcal/mol, which is a factor of three better than ALDA and a factor 6 better than RPA. A similar picture emerges from the correlation energy of H\({}_{2}\) and the He atom.

Table 1 Correlation energies of H, H\({}_{2}\), and He evaluated with different functionals

Although the RPA is free from self-interaction (cancellation of Hartree and exchange for single electron systems), it still contains a large amount of self-correlation, which originates from the fact that the Hartree kernel is not balanced by an exchange kernel in the ACFDT formalism. The self-correlation can be canceled exactly by including SOSEX13,17 or an exact exchange kernel.21,22 However, these approaches are far more computationally demanding than the ACFDT formalism. In contrast the rALDA kernel has a similar computational cost as RPA and reduces the self-correlation error to <0.1 eV for an H atom. The remaining error in rALDA is largely due to the choice of the LDA functional as the starting point and choosing the rAPBE kernel instead reduces the error to <1 meV.

In Fig. 6, we show the correlation energy per valence electron calculated for a number of solids using various xc-kernels and the RPA.77 As for the molecules and HEG, the RPA correlation energy is consistently larger (by a few tenths of an eV/electron) than that calculated using the xc-kernels. However, the differences in correlation energy between the kernels themselves are much smaller. For instance, including the correlation contribution of the ALDA in the rALDA kernel (rALDAc) increases the correlation energy by around just 1% (\(\sim\)0.01 eV/electron) compared to the standard rALDA.

Fig. 6
figure 6

Absolute correlation energy per valence electron calculated with different xc-kernels.77 Also shown is the correlation energy for Si obtained from diffusion Monte Carlo (DMC) calculations in ref. 102 Reprinted from ref., 77 with the permission of AIP Publishing

Figure 6 also shows the result of diffusion Monte Carlo (DMC) calculations of the correlation energy of Si,102 which we can tentatively compare to our own results. The DMC correlation energy lies among the values calculated from the kernels. Reference 70 also found a correlation energy close to the DMC value using the CDOP kernel and a different averaging scheme. This result is of course reassuring, but we note that care must always be exercised when making such comparisons. First, one must expect the calculated value to have some dependence on the treatment of the core–valence interaction (e.g., PAW, pseudopotentials, all-electron). More generally, the concept of the correlation energy “per valence electron” becomes less well defined if the calculated correlation energy includes the contribution of semicore states. For instance, comparing Figs. 6 and 4 reveals an apparent contradiction, that the valence energy per electron of Na in Fig. 6 is apparently approximately half its value in Fig. 4. This discrepancy is resolved, however, by noting that, in the calculations of Fig. 6,77 the entire 2\({s}^{2}\)2\({p}^{6}\) shell of Na was included as valence in addition to the 3\(s\) electron, unlike in Fig. 4. Therefore, the correlation energy of Na reported in Fig. 6 represents an average over the free-electron-like \(3s\) electron and the more localized \(2{s}^{2}2{p}^{6}\) shell, which cannot be straightforwardly compared to the free electron gas as in Fig. 4.

It is remarkable that the amount of self-correlation introduced by RPA is similar for widely different systems and it indicates that there will be a large energy cancellation when considering energy differences. As we will see below, this is true to some extent, but the cancellation is far from perfect and RPA gives rise to systematic errors in cohesive energies of solids and atomization energies of molecules.

Effect of kernel averaging scheme

As discussed in section “Evaluating non-local kernels for inhomogeneous densities,” there is a choice in how one constructs the xc-kernel for an inhomogeneous system. For instance, the correlation energies displayed in Table 1 were calculated using a two-point average of the density. If we repeat the rALDA calculations using the symmetrized wavevector scheme (Eq. (54)), we obtain correlation energies of 6, \(-24\), and \(-23\) kcal/mol for H, H\({}_{2}\) and He, respectively, i.e., a difference of +8, +4, and +4 kcal/mol compared to the values of Table 1. The small differences in H and H\({}_{2}\) cancel when calculating the atomization energy.77 We find it encouraging that the symmetrized wavevector approach agrees very well with the more intuitive two-point density average when calculating the atomization energy.

Structural parameters

Having established that the renormalized kernels yield greatly improved absolute correlation energies, we now consider physical observables, starting with lattice constants and bulk moduli of a test set of 10 crystalline solids. In these calculations, the KS states and energies obtained self-consistently within the LDA were used to calculate the noninteracting response function and exact exchange contribution to the total energy. A number of different xc-kernels (including the rALDA) were used to calculate the response function and correlation energy through Eqs. (7) and (8). The wavevector symmetrization scheme (Eq. (54)) was used to construct the kernel. The lattice constants and bulk moduli were extracted by calculating the total energy as a function of lattice spacing and fitting the results to a Birch–Murnaghan equation of state. More computational details are given in ref. 77

Figure 7 shows the percentage deviation between the calculated structural parameters and the experimental data listed in ref. 6 Calculations performed within ground state DFT within the LDA or GGA are also shown for comparison, which show the well-known tendency for the LDA/GGA to over/underbind, respectively. Using exact exchange and the ACFDT correlation energy (with any kernel, or the RPA) systematically improves the agreement with experiment, going from a mean absolute error (MAE) of 1.3%/7% for PBE to \(\le\)0.7%/4% for lattice constants/bulk moduli, respectively.

Fig. 7
figure 7

Lattice constants (left) and bulk moduli (right) calculated using different approximations for the xc-kernel for a test set of 10 crystalline solids, compared to the experimental values listed in ref. 6 The experimental lattice constants were corrected for zero-point motion.6 Note the CP and JGMs kernels coincide for metallic systems.77 Reprinted from ref., 77 with the permission of AIP Publishing

The difference between the various kernels, and even the RPA, is rather small. In particular, the rALDA, rALDAc, CDOPs, and CP kernels yield very similar results. The very close agreement between rALDA and rALDAc supports the use of the simpler rALDA kernel, which uses only the exchange part of the ALDA (Eq. (25)). One attractive property of the rALDA is that the calculated values display the fastest convergence with respect to the number of plane waves used to construct the response function, allowing a saving in computational time.

In terms of the other kernels, the strongest outlier is the JGMs kernel, particularly for the ionic solid LiCl. The agreement with experiment for the JGMs lattice constants can be improved even further by replacing the experimental optical gap \({E}_{{\rm{g}}}\) that appears in the kernel definition with an effective gap inspired by excitonic calculations involving a long-range-corrected attractive kernel.75,77 One can also see that the CDOP kernel produces respectable structural parameters, despite it actually having a divergent pair-distribution function, while the variation between the CP and CPd kernels illustrate the potential importance of dynamical effects.

However, the overall differences between all of the kernels are rather small, and based on these calculations, it is hard to argue that there are particular benefits in going beyond a simple, static kernel that tends to a density-dependent constant at small \(q\) and decays as \(1/{q}^{2}\) at large \(q\). The rALDA satisfies these properties and also carries the particular advantage of scaling simply with the coupling constant \(\lambda\). Finally, a significant strength of the rALDA is that, unlike the other kernels derived from the HEG, it has a spin-dependent generalization (section “Spin”), which is essential when calculating molecular atomization energies.

Atomization energies of molecules

In Fig. 8, we compare the performance of LDA, PBE, RPA@LDA, RPA@PBE, rALDA, and rAPBE for the atomization energies of 14 small molecules using the experimental atomic positions. All numbers are shown relative to experimental values corrected for non-adiabatic effects.103 RPA is seen to systematically underestimate the binding energies with RPA@LDA being slightly worse than RPA@PBE. In contrast, LDA and PBE overestimate binding energies and the performance of PBE is similar to RPA, whereas LDA is much worse. Both rALDA and rAPBE provides a significant improvement over RPA. rAPBE performs slightly better than rALDA, which is most likely due to the poor description of the ground state within LDA compared to PBE.

Fig. 8
figure 8

Molecular atomization energies evaluated with different methods shown relative to the experimental values. Results are shown with respect to reference values from ref. 103 The numbers are tabulated in Supplementary Material of ref. 29 Reproduced with permission from ref., 29 copyright American Physical Society, 2014

In Fig. 9, we show the mean absolute percentage error (MAPE) of RPA, rALDA, and rAPBE compared with that obtained with PBE0 as well as SOSEX and rP2T, which constitute two other beyond RPA methods.104,105 rALDA and rAPBE are a factor of three more accurate than RPA@LDA and RPA@PBE, respectively. Moreover, the rAPBE MAPE is <1.5% and outperforms both PBE0 and r2PT on this small test set.

Fig. 9
figure 9

Mean absolute percentage deviation of molecular atomization energies. The PBE0, SOSEX, and rP2T values are taken from ref. 20 Reproduced with permission from ref., 29 copyright American Physical Society, 2014

Cohesive energies of solids

While hybrid functionals may provide a computationally cheap way of obtaining accurate ground state energies for atoms and molecules, they typically fail dramatically for solids. Moreover, quantum chemistry methods are prohibitively demanding for solid-state systems and DFT and DFT-based methods currently seem to be the only possible choice when dealing with solids. In ref., 6 it was shown that RPA performs somewhat worse than PBE for the cohesive energies of solids, although it does provide significantly better results than LDA. This is in contrast to the case of molecules where RPA yields slightly better results than PBE. The reason is that the accuracy of RPA for atomization energies crucially depends on error cancellation of the ubiquitous self-correlation in RPA. The cancellation of errors is likely to work better when comparing similar systems, but for the cohesive energy of solids one has to consider ground state energies of atoms with ground state energies of solids, so the error cancellation can become more inaccurate. Since the rALDA and rAPBE functionals to a large extent eliminate the self-correlation error of RPA, it is expected that these approaches should perform significantly better than RPA.

In Fig. 10, we show the cohesive energies of solids calculated with LDA, PBE, RPA, rALDA, and rAPBE. Again the rAPBE functional performs significantly better than either PBE or RPA. MAPE is shown in Fig. 11 and the rAPBE deviation from experiment is <2, whereas PBE, RPA@LDA, and RPA@PBE give errors of 4%, 9%, and 7%, respectively. PBE0 gives a mean error of 7.5%, which is four times worse than rAPBE.

Fig. 10
figure 10

Deviation from experimental values of the cohesive energy of solids evaluated with different methods. The numbers are tabulated in Supplemental Material of ref. 29 Reproduced with permission from ref., 29 copyright American Physical Society, 2014

Fig. 11
figure 11

Mean absolute percentage deviation of cohesive energies of solids evaluated with six different methods. Reproduced with permission from ref., 29 copyright American Physical Society, 2014

Formation energies of metal oxides

In the previous two sections, we considered the problems of calculating the atomization energies of molecules and solids. This problem gauges the ability of a method to describe the absolute energy cost of breaking a chemical bond. In most practical situations, however, it is often more relevant to consider the material’s formation energy, i.e., its energy relative to the standard states of its constituent elements rather than the isolated atoms. The calculation of formation energies thus gauges the ability of a method to describe the energy of one type of chemical bond relative to another. Predicting the heat of formation of metal oxides has proven to be particularly challenging for a wide range of commonly applied xc-functionals. The RPA has previously been shown to significantly improve the accuracy of calculated formation energies of group I and II metal oxides as compared to semilocal functionals.106 In the following we briefly assess the performance of the rAPBE method and compare it to that of the PBE, RPA, and the Bayesian error estimation functional with vdW (BEEF-vdW) functional. The latter two contain non-local correlation to account for vdW interactions.107

The formation energy per oxygen atoms was obtained from the computed total energies as

$$\Delta E{\rm{O}}=\frac{1}{y}E[{{\rm{M}}}_{x}{{\rm{O}}}_{y}]-\frac{x}{y}E[{\rm{M}}]-\frac{1}{2}E[{{\rm{O}}}_{2}],$$
(56)

where \(E[{{\rm{M}}}_{x}{{\rm{O}}}_{y}]\), \(E[{\rm{M}}]\), and \(E[{{\rm{O}}}_{2}]\) are the total energies of the oxide, the bulk metal, and the O\({}_{2}\) molecule in the gas phase, respectively. Zero-point energy contributions were not included in the present study as previous work has shown that they affect the formation energies of oxides by <0.01 eV.106

The formation energies computed with PBE, BEEF-vdW, EXX, RPA, and rAPBE are summarized in Fig. 12. For the latter three methods, single-particle wave functions and energies were obtained from a self-consistent PBE calculation. All structures were optimized with PBE. The BEEF-vdW was included here to compare the performance of RPA and rAPBE methods to a semiempirical method that explicitly includes dispersive interactions. The mean signed error (MSE), MAE, and MAPE with respect to experiment are shown Table 2. Comparing the MSE and MAE shows that formation energies from PBE, BEEF-vdW, EXX, and RPA have a clear systematic tendency to overestimate formation energies; that is, oxides are predicted to be less stable than found in experiments (see ref. 106 for references for the experimental data). In contrast, while rAPBE shows the same tendency of destabilizing the metal oxide, it is less pronounced, and CaO\({}_{2}\), KO\({}_{2}\), CsO\({}_{2}\), and RuO\({}_{2}\) are in fact predicted to be more stable than experiment. The rAPBE method is in better agreement with experiments than all the other methods with a MAE of only 0.21 eV compared to 0.38 eV for RPA and 0.40 for BEEF-vdW. We can partly attribute the failure of the RPA to the lack of error cancellation between the correlation energy of the oxide and the bulk metal and oxygen molecule, which are all separately underestimated by the RPA (see ref. 108). The errors of the DFT xc-functionals and the RPA are to some extent systematic and can be ascribed to a bad description of the O\({}_{2}\) molecule. In fact, treating the energy of the O\({}_{2}\) reference as a fitting parameter, the MAE for all the methods become comparable and lie in the range 0.15–0.2 eV/O.108

Fig. 12
figure 12

Calculated oxide formation energy per oxygen atom using PBE, BEEF-vdW, RPA, and rAPBE plotted against the experimental data. The data set contains 19 group I and II metal oxides as well as the transition metal oxides TiO\({}_{2}\) and RuO\({}_{2}\). Reproduced with permission from ref., 108 copyright American Physical Society, 2015

Table 2 Mean signed error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) of calculated formation energies relative to experiments for 19 group I and II oxides and the transition metal oxides TiO\({}_{2}\) and RuO\({}_{2}\)

Surface and adsorption energies

For applications of DFT to problems in surface science, in particular heterogeneous catalysis and electrocatalysis, the ability to predict stability and reactivity of metal surfaces is of crucial importance. It has been established that the RPA yields very good results for surface energies and chemisorption energies of atoms and small molecules on transition metal surfaces and greatly improves the accuracy of the xc-functionals.109,110,111,112,113 As shown below, the good performance of RPA for surface and adsorption energies is preserved and probably even improved by the renormalized kernels. The difference between RPA and rALDA for surface reaction energies is of the order 5–10%, which is comparable to the difference found for atomization energies of solids and molecules. This suggests that the better description of short range correlations by the kernel, which was found to improve atomization energies in sections “Atomization energies of molecules” and “Cohesive energies of solids,” carries over to metal–molecule bonding. However, owing to the significantly smaller magnitude of such bond energies compared to covalent bonds in solids/molecules and the lack of experimental data with sub-100 meV accuracy, it is not possible to confirm this hypothesis directly.

Figure 13 shows the adsorption energy of CO on Pt(111) for a coverage of 1/4 plotted against the surface energy. Results obtained with RPA, rAPBE, and a range of xc-functionals are shown together with the experimental results provided in ref. 109 It is evident that the RPA and rAPBE methods are able to break the incorrect correlation between adsorption energy and surface energy exhibited by the xc-functionals109 and thereby yield an accurate description of both quantities simultaneously. Similar conclusions have recently been demonstrated for other adsorbates and surfaces.114

Fig. 13
figure 13

Surface energy versus adsorption energy of CO/Pt(111) calculated with various GGA functionals (green markers) and van der Waals functionals (red markers). Circles and triangles indicate atop and hollow sites, respectively. All calculations were performed with the experimental lattice constant of Pt and the CO molecule relaxed with PBE. The hollow circle was obtained with a PBE optimized lattice constant. The coverage of CO is 1/4 and the Pt surface was modeled by a slab containing four atomic layers. Reproduced with permission from ref., 29 copyright American Physical Society, 2014

Table 3 reports reaction energies in the full coverage limit for the set of benchmark surface reactions introduced in ref. 115 over the early 3\(d\) transition metal surfaces. The absolute and relative deviation from the rALDA result is shown in the last two rows. It is interesting to note that the xc-functional yielding the best agreement with the rALDA (and RPA) is the RPBE, which is a revised version of the PBE fitted to experimental data on surface reactions like the ones considered here.116 The rALDA and RPA results are in good agreement, with the largest deviation occurring for OH adsorption (difference of 0.11 eV). This agrees well with results of Fig. 13 for CO adsorption on Pt(111) and for graphene adsorbed on Ni(111) (see Fig. 16). Although the absolute deviation between rALDA and RPA for the surface reactions is small (MAE of 40 meV), the relative deviation is in fact similar to that obtained for the atomization energies of molecules and solids. However, it is interesting to note that RPA shows a small but systematic tendency to overbind the adsorbates relative to rALDA. This is opposite to the trend observed for the atomization energies of molecules and solids where RPA underestimates the bond strength by around 0.3–0.5 eV/atom relative to rALDA (see Figs 8 and 10). For metals, the cohesive energy is also underestimated by RPA but by somewhat smaller degree (MSE of \(-0.15\) eV relative to rALDA). The average deviation between RPA and rALDA for the bond energy per atom is shown in Table 4 for the set of materials in Figs 8 and 10 grouped according to material type.

Table 3 Adsorption energies (in eV) for a few reactions at full coverage calculated with rALDA, RPA, and seven different DFT xc-functionals
Fig. 14
figure 14

Dissociation curves of the H\({}_{2}\) molecule calculated with different functionals. The dashed line shows the energy of two isolated Hydrogen atoms (−1 Hartree). Each curve have been obtained by spline interpolation of 12 data points. Reproduced with permission from ref., 28 copyright American Physical Society, 2013

Table 4 Difference in atomization and cohesive energies between RPA, rALDA, and experiments for three different types of materials

The fact that molecule–metal bonding shows an opposite trend compared to atomization energies can be explained as follows: for reactions involving the breaking of covalent bonds in the initial adsorbate molecule, e.g., for reactions of the type 1/2A2–A/metal (reactions 1, 2, and 6–8 in Table 3), RPA will overestimate the reaction energy because the A–A bond strength is underestimated more than the A–metal bond (the latter being weaker than the former). For pure adsorption reactions of the form A2–A2/metal (reactions R3–R5), RPA will overestimate the adsorption energy because the reduction of the internal A–A bond upon adsorption is underestimated more than the A2/metal bond. In both cases, the reason for the (slight) overestimation of the reaction energy can thus be traced to a larger underestimation by the RPA in describing pure covalent bonds compared to bonds with partial metallic character.

Static correlation

A highly attractive feature of the RPA is the correct description of bond dissociation in molecular dimers, which cannot be captured by restricted semilocal functionals.3,117 The bond dissociation curves in RPA are, however, only accurate if the reference energy of the isolated atoms are subtracted. In the case of H\({}_{2}\), for example, the dissociation curve is 1.1 eV below the exact result due to the self-correlation energy of the H atom. As we have seen, the rALDA largely eliminates the self-correlation and one can obtain accurate dissociation curves of molecular dimers without subtracting reference energies for the isolated atoms. This is shown in Fig. 14, where we also compare with LDA, PBE, and Hartree–Fock, which all fail to yield the correct dissociation energy. We note, however, that RPA as well as rALDA exhibit a spurious maximum at intermediate bond lengths, which is not present in the exact result. The exact monotonic increase in energy signals a failure of both approximations. We note that it has previously been shown that the correct dissociation curve can be obtained from the ACFDT formalism if the interacting response function \(\chi\) is evaluated from either the Bethe–Salpeter equation118 or a systematic correction to EXX-RPA.23

Fig. 15
figure 15

Binding energy of bilayer graphene calculated with RPA and rALDA as well as LDA, PBE, and a van der Waals functional (vdW). Reproduced with permission from ref., 28 copyright American Physical Society, 2013

Finally it is worth mentioning that both RPA and rALDA fail dramatically for the dissociation of the H\({}_{2}^{+}\) dimer. In contrast, the SOSEX method yields the exact result for this problem but cannot dissociate H\({}_{2}\) correctly. To our knowledge, the only method that is capable of correctly dissociating both dimers is the ACFDT with EXX-RPA.21

Dispersive interactions

One of the main qualities of the RPA is its ability to account for vdW interactions in weakly interacting systems. Specifically, the RPA has been shown to provide an excellent description of the equilibrium geometry in hBN8 and graphite9 as well as graphite adsorbed on metal surfaces.11,12,94 Conserving the accurate description of dispersive interactions is thus a major success criterion for any bRPA method for ground state energies.

Bilayer graphene

In Fig. 15, we show the binding energy as a function of interlayer distance for a bilayer of graphene calculated with LDA, PBE, and the vdW functional of Dion et al.119 along with the results of RPA and rALDA. The RPA and rALDA is seen to yield nearly identical equilibrium distances of 3.4 Å, but rALDA gives a slightly smaller binding energy (22 meV/atom) compared to RPA (25 meV/atom). The semilocal functionals are not expected to capture dispersive interactions and PBE predicts a very shallow minimum at 4.4 Å. However, LDA yields an equilibrium distance of 3.3 Å, which is close to the RPA/rALDA value but only half the binding energy. It is remarkable that LDA provides a seemingly qualitative correct description of the binding energy in this system. Similar results for LDA have previously been obtained for graphite9 and graphene on metal surfaces,94 but the results are likely to be fortuitous, and the fact that LDA and various GGAs produce qualitatively different results renders the result dubious.

Fig. 16
figure 16

Binding energy of graphene on a Ni(111) surface calculated with RPA and rAPBE as well as LDA, PBE, RPBE, and a van der Waals functional (vdW-DF). Reproduced with permission from ref., 29 copyright American Physical Society, 2014

The vdW functional predicts an equilibrium distance of 3.7 Å and a binding energy of 22 meV/atom. In contrast to LDA, this functional binds the two layers for the right reasons, but the binding distance is quantitatively wrong. It is also noteworthy that the tails of the potential energy surfaces of RPA and rALDA coincide as expected but deviate from the prediction of the vdW functional. It should be noted that there is no experimental values for the binding energy and binding distance of bilayer graphene, but it is expected that the binding distance should be close to the value of 3.34 Å of graphite.

Graphene on metals

The description of graphene adsorbed on metal surfaces has proven a highly challenging task for first-principles methods due to equal contributions of weak covalent and vdW bonding in these systems.11,12 In Fig. 16, we compare the binding energy of graphene on a Ni(111) surface calculated with LDA, PBE, RPBE, vdW, RPA, and rAPBE.29 The RPA shows two distinct minima at 2.2 Å (chemisorbed) and 3.3 Å (physisorbed), which originates from weak covalent interactions and dispersive forces, respectively. The chemisorption minimum is a few meV lower than the physisorption minimum and agrees well with the experimentally determined value of 2.1 Å for this system. The rAPBE functional closely mimics the RPA result but gives a slightly larger energy difference between the two local minima. The vdW functional completely misses the weak covalent interactions responsible for the chemisorption minimum and only yields a physisorbed state at 3.8 Å.

C6 coefficients

Long-range interactions are typically dominated by Coulomb interactions and accuracy of dispersive interactions are thus determined by the response functions of the individual systems. For well-separated atoms, the binding energy asymptotically becomes \({E}_{{\rm{B}}}(r)={C}_{6}/{r}^{6}\), where the \({C}_{6}\) coefficients only depend on the polarizabilities of the isolated atoms. In particular, the \({C}_{6}^{ij}\) coefficient relating atoms \(i\) and \(j\) can be calculated from the Casimir–Polder formula

$${C}_{6}^{ij}=\frac{3}{\pi }{\int\nolimits_{0}}^{\infty }{\alpha }_{i}(i\omega ){\alpha }_{j}(i\omega )d\omega,$$
(57)

where

$${\alpha }_{i}(i\omega )=-\int {\rm{d}}{\bf{r}}{\rm{d}}{\bf{r}}^{\prime} z{\chi }_{i}({\bf{r}},{\bf{r}}^{\prime} ,i\omega )z^{\prime}$$
(58)

is the polarizability of atom \(i\) and \({\chi }_{i}\) is the interacting response function, which can be calculated from the Dyson equation with a given approximation for the xc-kernel. Alternatively, the dynamic polarizability may be calculated from the time-dependent density obtained from a real-time implementation of TDDFT.120

The \({C}_{6}\) coefficients for eight different atoms (\(i=j\)) are displayed in Table 5 calculated with LDA (\(\chi ={\chi }^{{\rm{KS}}}\)), RPA, and rALDA.28 We observe that RPA performs significantly better than LDA, but rALDA is a factor of three better than RPA on average. The performance is, however, very dependent on the type of atom and rALDA performs better for the noble gas atoms, except for He, which is more accurately described in RPA. For Li and Na, RPA fails completely, whereas LDA provides rather accurate predictions (better than rALDA for Li and slightly worse than rALDA for Na).

Table 5 C\({}_{6}\) coefficients between identical atoms (\(i=j\) in Eq. (57)) calculated with LDA, RPA, and rALDA28

Structural phase transitions

Bulk solids usually exist in various polymorphic forms. Under changes in pressure or temperature, one structure may transition into another. The structural phase transition of solids has large theoretical and practical importance. With the external influence, the space-group symmetry and associated internal structural parameters change from one crystal structure to another. Temperature- or pressure-induced structural phase transitions can also change the electronic structures of the corresponding materials, such as from insulator to metal and vice versa, resulting in changes of the band gap or conductance.121 Structural phase transitions also sometimes lead to different magnetic states.122 As a consequence, structural phase transitions offer an opportunity to tune a material toward particular applications in electronics, optics, and other relevant fields.123,124,125

Structural phase transitions have largely remained a challenge for electronic structure methods. When computing phase transitions, a robust theory must overcome the dilemma of simultaneously predicting equilibrium structural properties and the phase transition parameters. Since experiments often fail to precisely measure the coexistence temperatures or pressures of two different structural phases of a solid, there is a high demand for a robust theoretical method. The failure of semilocal density functional approximations for structural phase transitions was earlier attributed to the underestimation of the band gap with these methods.121,126,127,128 This assumption was later questioned by the results from the HSE (Heyd–Scuseria–Ernzerhof) approximation.129,130 HSE is nonlocal in the exchange and predicts more realistic fundamental gaps. HSE is better than semilocal functionals for the transition pressures of Si and SiO\({}_{2}\) but seriously overestimates the transition pressure in Zr.

In combination with the rAPBE and rALDA kernels, the RPAr approximation was investigated for the structural phase transitions of a small but representative group of materials.131 The examples were chosen to incorporate several changes in band structure, including from semiconductor to semiconductor, semiconductor to metal, and metal to metal transitions. The assessment includes the phase transitions of the diamond phase of Si to the metallic beta-tin form, the zinc-blende (ZB) to rocksalt transition of SiC, the ZB to Cmcm phase transition of GaAs, the quartz to stishovite transition of SiO\({}_{2}\), the transition from fcc to hcp structure of Pb, and finally the phase transition of the hexagonal to cubic structures of BN.

The transition pressure can be found as the negative slope of the common tangent line between the two phases. At the transition pressure, the difference in Gibbs free energies for the two phases should be equal to zero. At a finite, but constant, temperature, the pressure can be calculated as the negative derivative of the free energy with respect to the volume

$$P(V,T)=-{\left(\frac{\partial F}{\partial V}\right)}_{T}$$
(59)

At zero temperature, the equivalent condition is that the enthalpy difference of the two phases is zero, and the pressure can be found directly as the derivative of the electronic energy.

To include the thermal effects, the equilibrium parameters were obtained from fitting the third-order Birch–Murnaghan equation of state (EOS) including the thermal corrections from the vibrational degrees of freedom. The results of the fitting are \({F}_{0T}\), the minimum of the Helmholtz energy at temperature \(T\) and (\({V}_{0T}\), \({B}_{0T}\), \({B}_{0T}^{^{\prime} }\)), the equilibrium parameters such as the equilibrium volume, bulk modulus, and the derivative of the bulk modulus at that same temperature. The transition pressures were then obtained using the isothermal EOS fitting parameters obtained from Eq. (9) in ref. 131 Compared to zero Kelvin, the addition of thermal corrections introduces a rigid shift in the performance of all the methods for most of the materials in the assessment, and the agreement typically improves with experiment when the thermal corrections are included.

The results of different functionals for predicting the phase transitions generally follows the same trend as the structural parameters, Fig. (1) in ref. 131 In general, all bRPA methods deliver an overestimation of the equilibrium and transition volumes by about the same amount as the LDA underestimates them. bRPA approximations yielded an overall improvement compared to the bare RPA and semilocal functionals for the transition pressures, as shown in Table 6, although the behavior is less systematic than for RPA.

Table 6 Finite temperature transition pressures, in GPa, predicted at 300 K using different levels of density functional approximations131

For Si and Ge, the RPAr1 approximation delivers a reasonable result; however, RPA shows closer agreement with the experimental transition pressure. For SiC, neither RPA nor bRPA show agreement with the experimental transition pressure, indicating that there is a significant difference between equilibrium and non-equilibrium transition pressures.133 For SiO\({}_{2}\) and GaAs, the transition pressure predicted with an xc-kernel is more accurate compared to experiment than the bare RPA. Adding bRPA correlation from rAPBE at any level of RPAr reduces the transition pressure of RPA and comes quite close to the experiment. Reference 134 attributes this deviation of RPA to its poor performance for some molecular-like solids where there is less cancellation of error between dissimilar phases. It is remarkable that RPA fails to predict the correct phase ordering for BN, while the rAPBE kernel in conjunction with RPAr brings the transition pressure close to the experimental value. The thermal corrections play a more prominent role in the phase transitions of carbon and BN and indicate that further investigations are necessary. RPA yields too low transition pressure for the phase transition of carbon without thermal corrections and reverses the phase stability of BN without a finite-temperature correction. All bRPA approximations largely overestimate the transition pressure in C with the inclusion of thermal corrections, while the prediction for BN is significantly improved compared to RPA. The inclusion of the HOT with the rAPBE kernel leads to a better accuracy against the experimental transition pressure in BN. The unexpected inaccuracies in materials such as Pb, C and BN can be explained by near degeneracies.

To illustrate the benefits of including the xc-kernel, Fig. 17 shows the EOS data and common tangent results for SiO\({}_{2}\) predicted with RPA and the HOT approximation in combination with rAPBE. In this case, RPA underestimates the energy difference between the phases resulting in an underestimation of the transition pressure. The root cause of this can be understood if the transition pressure is thought of as \(\Delta E/\Delta V\) evaluated at the transition volumes for each phase. The energy underestimation is more severe than the transition volume errors, and so RPA underestimates the pressure (as with most semilocal functionals). With the addition of the xc-kernel, the energy difference between phases is increased by an appropriate amount to bring the rAPBE transition pressure within the experimental range.

Fig. 17
figure 17

Energy–volume curves for the quartz and stishovite phases of SiO\({}_{2}\) with RPA and rAPBE (within the HOT approximation) per functional unit.131 The negative slope of the common tangent line corresponds to the transition pressure. The kernel corrections for SiO\({}_{2}\) increase the equilibrium energy difference between phases and correct the large underestimation of the transition pressure by RPA as a result. The kernel-corrected curves have been rigidly shifted up in energy by 0.05 eV compared to RPA for visual clarity. Reproduced with permission from ref., 131 copyright American Physical Society, 2018

Cesium halide stability

The difficulty in predicting the energy difference between similar phases of a material is a more general problem than phase transitions alone. The stability of different phases in alkali halides is also a strong probe of various electronic structure methods and the correlation effects they incorporate. Among the alkali halides, those formed from cesium show an interesting behavior. CsF is experimentally stable in the B1 (rocksalt or NaCl) structure, while the Cl, Br, and I materials exist experimentally in the B2 (CsCl) structure. For all of the other alkali halides, the stable structure is B1. The stability of the B2 phase for certain cesium halides is a direct consequence of weak vdW bonding.135,136,137,138,139,140 All ACFDT-based methods naturally account for long-range vdW forces,141 but an accurate treatment of their structure requires both short- and long-range interactions to be accounted for. Semilocal density functional approximations miss the long-range vdW forces. Reference 139 indicated the relevance of long-range interactions through the PBE+D2 method compared to the bare PBE-GGA for the phase ordering of these cesium halides. Since D2 is a semiempirical method,142 the energy differences reported in ref. 139 are not necessarily usable as benchmarks.

To go beyond the semilocal level, the energy differences between the B1 and B2 phases were explored using ACFDT-based methods. The performance of the previously discussed ACFDT-based approximations was assessed in detail in ref. 143 for the structural parameters and cohesive energies of cesium halides. The rALDA kernel was used primarily for this study. Compared to semilocal density functionals, RPA yields superior structural parameters for all of the stable cesium halides. bRPA approximations in combination with the rALDA kernel are in general even more accurate for predicting the lattice constants and bulk moduli. The predicted cohesive energies for all of these ACFDT methods are also more accurate compared to the PBE-GGA due to the incorporation of vdW interactions. RPA predicts the proper phase stability of all Cs halides, whereas PBE only predicts the correct order for the fluoride, Fig. 18.

Fig. 18
figure 18

Bar graph summarizing the difference in cohesive energies, \(\Delta {E}_{{\rm{coh}}}={E}_{{\rm{coh}}}^{{\rm{B1}}}-{E}_{{\rm{coh}}}^{{\rm{B2}}}\), obtained with various DFT methods. PBE+D2 results are taken from ref. 139 Positive \(\Delta {E}_{{\rm{coh}}}\) corresponds to the B1 phase being preferred as the ground state, whereas negative values indicate the preferred stability of the B2 phase. PBE predicts all ground state cesium halides to be in the B1 phase, whereas all other methods favor the B2 structure except in CsF. S+D3144 and S+rVV10145 correspond to the SCAN146 semilocal results plus the dispersion method specified. Reproduced with permission from ref., 143 copyright American Physical Society, 2018

By definition, the cohesive energy includes the energy of the bulk and the constituent atoms. Describing accurately the energy of the bulk and free atoms simultaneously is a challenge for most electronic structure methods, and a method biased toward one paradigm will lead to inconsistent predictions of the cohesive energy. Depending on the kernel and level of RPAr approximation, the resulting methods yield different descriptions for the bulk and free atoms. Overall, the cohesive energies of the stable Cs halides are more accurate with RPA for the F and Cl compounds than for the heavier halides. For Br and I, the rALDA kernel within RPAr improves the predicted cohesive energies compared to RPA. A possible explanation is that RPA is applied in a non-self-consistent manner with PBE input orbitals, which is not necessarily the best starting point for the more ionic X–F and Cl bonds. The addition of a kernel tends to improve the short-range correlation of the ACFDT and therefore compensates the error of the PBE orbitals that is more prominent in the cohesive energies for RPA for the larger Cs halides.

In order to correctly predict the splitting between the B1 and B2 phases, a delicate balance of short- and long-range correlation is required, Fig. 19. The large overestimation of the stability difference between these two phases with the PBE+D2 method indicates the incompleteness of this approximation. RPA in contrast correctly incorporates long-range correlation but is incomplete for the short-range part and so underestimates the energy difference. The energy difference is increased by any RPAr approximation using the rALDA kernel. rAPBE could be naturally expected to be the ideal kernel for evaluating the bRPA corrections on top of PBE orbitals, but interestingly this kernel struggles to predict the correct phase ordering in these materials. The other kernels tested, including rALDA, CP07,71 and CDOP,69 all predict a consistent phase stability ordering, regardless of the RPAr approximation, further indicating that the error lies in the rAPBE kernel itself.143 This seems to be an isolated case, however, as the rest of the results in this review clearly demonstrate the utility and success of the rADFT kernels.

Fig. 19
figure 19

Bar graph representing the difference in cohesive energies, as in Fig. 18, obtained with beyond RPA methods using the rALDA kernel. The dispersion-corrected SCAN and RPA results are included for comparison. A positive \(\Delta {E}_{{\rm{coh}}}\) indicates that the B1 phase is the preferred ground state, whereas negative values indicate the preferred stability of the B2 phase. SCAN plus dispersion gives a much more reasonable prediction compared to PBE+D2, if the ACFDT results are taken as the benchmark. Reproduced with permission from ref. 143 copyright American Physical Society, 2018

Vertex-corrected QP energies

This section presents some results for QP energies obtained using the G\({}_{0}\)W\({}_{0}\), GW\({}_{0}\), G\({}_{0}\)W\({}_{0}\)P\({}_{0}\), and G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\) self-energy methods outlined in section “Hedin’s equations and vertex corrections.” For the latter two, the rALDA\({}_{{\rm{x}}}\) xc-kernel was used and all calculations used an LDA starting point. The GW\({}_{0}\) refers to eigenvalue self-consistency in the Green function. All structures were relaxed using the PBE xc-functional and QP calculations were based on norm-conserving PAW potentials and spin–orbit coupling was included for the band structures. Further details on the calculations can be found in ref. 30

Table 7 reports the band gaps obtained with the different self-energy methods together with their MAE and MSE relative to the experimental reference values for eight bulk crystals and three transition metal dichalcogenides in monolayer form. As expected, G\({}_{0}\)W\({}_{0}\)@LDA underestimates the experimental band gaps. In agreement with previous findings, iterating to self-consistency in the Green’s function (the GW\({}_{0}\) method) improves the situation somewhat but leads to a small systematic overestimation of the gaps. This overestimation becomes even larger in fully self-consistent GW (not shown) where the MAE and MSE increase to 0.7 eV (see ref. 30). Including the rALDA vertex correction in a non-self-consistent G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\) calculation reduces the systematic underestimation of the gap somewhat (the MSE is reduced from \(-0.19\) to \(-0.12\) eV). Summarizing, we find that G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\) and GW\({}_{0}\) perform the best for the band gaps closely followed by G\({}_{0}\)W\({}_{0}\). Including the vertex only in the polarizability (G\({}_{0}\)W\({}_{0}\)P\({}_{0}\)) closes the gap because of the increased screening (as previously reported in refs 147,148) and results in a significant underestimation of the gaps.

Table 7 Band gaps obtained using different self-energy approximations (see section “Hedin’s equations and vertex corrections”) for eight bulk crystals and three monolayers

Figure 20 show the absolute band positions for the valence band maximum (VBM) and conduction band minimum (CBM) relative to vacuum. For the bulk materials, the band positions were determined by aligning the Hartree potential of a bulk calculation with the potential in the center of a slab. The slab thickness was 10–24 atomic layers depending on the material and the surface orientation and reconstruction were taken from available experimental studies. The inclusion of the vertex has a striking effect of blue shifting the band edges by around 0.5 eV. Remarkably, this upshift yields a better overall agreement with experimental values (dashed black lines). Interestingly, this shift of band energies is not observed when the vertex is only included in the polarizability. In addition, no systematic shift of the band edges is observed for the self-consistent GW approximations (without vertex corrections). We are thus led to conclude that the blue shift of band energies originates from the inclusion of vertex corrections in the self-energy.

Fig. 20
figure 20

Absolute QP band positions relative to vacuum for a range of semiconductors and insulators as calculated with various methods. All calculations are performed non-self-consistently from the LDA wave functions and eigenvalues. Experimental values are shown where available (black). The results are adapted from ref. 30 Reprinted with permission from ref. 30 Copyright 2019 by the American Physical Society

The physical origin of these effects can be traced to the improved description of short-range correlations provided by the rALDA vertex. Indeed, the induced potential from Eq. (44) is the Hartree potential generated by the induced density (the screening cloud of the QP). This Hartree potential is too deep and results in too deep-lying QP energies. This is the same reason underlying the systematic underestimation of the correlation energy by RPA. Adding the induced xc-potential by Eq. (42) reduces the size of the screening potential and thus shifts the QP energies up. As discussed in ref., 30 the band gap size is governed by long-range correlations, which are well described by the RPA, while the absolute band energies also depend on the short-range correlations whose proper description require the vertex function.

The observed upshift in QP energies by around 0.5 eV due to the vertex correction can also be understood by noting that the ionization potential and electron affinity, i.e., the VBM and CBM relative to vacuum, obtained from G\({}_{0}\)W\({}_{0}\) (G\({}_{0}\)W\({}_{0}{\Gamma }_{0}\)) can be related to total energy differences between \(N\) and \(N\pm 1\) ground states evaluated from the ACFDT formula employing the RPA (rALDA) with a “frozen orbitals” assumption, i.e., a generalized Koopman’s theorem.149 Indeed, as we have shown in section “Absolute correlation energies,” the RPA underestimates the absolute correlation energy by around 0.3–0.6 eV/electron and this error is largely repaired by rALDA due to the improved description of the short-range correlations (see Fig. 1). The incorrect description of absolute correlation energies by RPA largely cancels for energy differences when the states in question contain the same number of electrons. However, for QP energies where the initial and final states differ by \(\pm 1\) electron, such errors are directly revealed.

Finally, it should be mentioned that the suppression of the large \(q\) components in the self-energy resulting from the rALDA kernel not only improves the description of local correlations but also leads to faster convergence with respect to the number of plane waves and unoccupied states, as compared to standard GW calculations.30 The situation is very similar to that reported in Fig. 4 for the ground state correlation energy in rALDA versus RPA and ALDA.

Conclusions and outlook

We have reviewed the theory of xc-kernels derived from the HEG and illustrated how they can be used to obtain ground state correlation energies and QP band structures beyond the RPA and GW methods, respectively. While several xc-kernels have been introduced, we have mainly focused on the renormalized adiabatic kernels rALDA and rAPBE, which are obtained from the (semi)local LDA and PBE xc-functionals by a simple renormalization procedure. The renormalization procedure consists of a truncation of the large \(q\)-components of Fourier-transformed xc-kernel, which renders it non-local in real space—an essential property to avoid a divergent on-top correlation hole.

The main observation is that the xc-kernels greatly improve the description of the short-range correlations relative to RPA (which corresponds to setting \({f}_{{\rm{xc}}}=0\)). In particular, the coupling constant averaged correlation hole, obtained from the density response function via the fluctuation dissipation theorem, becomes almost exact for the HEG and reduces the error on the correlation energy from about 0.5 eV/electron in the RPA to <0.05 eV/electron. This improvement was shown to carry over from the extreme limit of delocalized electrons in the HEG to limit of localized states of simple atoms and molecules. For example, the spurious self-correlation energy of the hydrogen atoms is reduced from 0.56 eV (RPA) to 0.04 eV (rALDA). The much better reproduction of absolute correlation energies reduces the reliance on error cancellation effects when computing energy differences. For RPA, error cancellation is very significant, of the order of 0.5 eV/electron. For this reason, RPA performs well for isoelectronic problems, where the electronic structure of the initial and final states are similar, e.g., for the calculation of structural parameters or the breaking/formation of weak (dispersive) bonds. When the xc-kernels are invoked, the need for error cancellation is reduced, and as a consequence, covalent bond energies, for which short-range correlations play an important role, are much more accurately described. This was explicitly demonstrated for atomization energies of molecules, cohesive energies of solids, formation energies of metal oxides, surface energies, and chemisorption energies of atoms and molecules on metal surfaces. In all these cases, the xc-kernels cure the systematic underbinding by the RPA and significantly improve the agreement with experimental data. Importantly, because the xc-kernels are of short-range nature they do not affect the shape of the correlation hole at longer distances, and consequently the excellent performance of the RPA for vdW interactions is preserved or even improved. For example, the MAE on the \({C}_{6}\) coefficients of noble gas and alkali elements are reduced from 0.3 eV (RPA) to 0.1 eV (rALDA). Similar conclusions apply to bonding with intermediate correlation ranges such as the mixed covalent-dispersive forces governing organic–metal interfaces as exemplified here by the prototypical graphene/metal interfaces. Furthermore, the positive effects that the xc-kernels bring can be adequately captured using RPA renormalized perturbation theory to first order, demonstrating that the impact of higher-order correlation effects are less important for correcting the shortcomings of RPA for structural and energetic properties.

In the context of QP band structure calculations within the formal framework of Hedin’s equations, we have shown that the renormalized xc-kernels can be used to include vertex corrections in the electron self-energy. When the four-point kernel of MBPT is approximated by the two-point kernel of time-dependent DFT, the effect of the vertex corrections attains a simple and physically transparent form. Namely, the effect of the vertex is to include the xc-potential into the dynamically screened Coulomb potential. We have shown that this is crucial in order to obtain a correct description of absolute band energies relative to the vacuum level. The effect of the xc-potential is to reduce the magnitude of the attractive QP screening cloud (in GW, only the Hartree potential of the screening cloud is accounted for) and this leads to a significant upshift of 0.3–0.5 eV for all energy levels leaving the band gap unchanged to within 0.2 eV. These effects should be important to include in order to correctly describe band alignment at surfaces and interfaces.

While the adiabatic xc-kernels discussed in this review provide an improved description of the short range correlations leading to the many positive derived effects described above, they still present shortcomings. Most importantly, they do not provide any improvements for strongly correlated systems such as Mott insulators. For such systems, it might be necessary to include frequency dependence or develop kernels with a more sophisticated density dependence than the HEG-based kernels considered here. Another shortcoming is the failure to reproduce the correct wavevector dependence in the limit of long wavelengths. This is not an important issue for total energy calculations, but for optical absorption spectra, it is crucial to have the right limiting behavior in order to capture excitonic effects. The deficiency stems from the fact that the renormalization scheme is based on the correlation hole of the HEG and a proper treatment of the long wavelength limit in insulators is likely to require a scheme that is not solely based on the density. One possible route toward this is to note that the truncation factor \(\theta (2{k}_{{\rm{F}}}-q)\) is the Fourier transform of the density matrix of a HEG with Fermi wavevector \(2{k}_{{\rm{F}}}\). In contrast to the bare density (defined by \({k}_{{\rm{F}}}\)), the density matrix provides a qualitative distinction between insulators and metals,150 but a direct truncation scheme based on the density matrix is not straightforward. Finally, the issue of self-consistency deserves a comment. In principle, it is possible to define an optimized local effective potential (OEP) from the ACFDT with renormalized kernels, which may then form the basis of self-consistent solutions in the framework of the KS equations. However, the non-self-consistent calculations are already rather computationally demanding and the OEP method does not seem to be a viable route to follow. In addition, the adiabatic kernels considered in the present work cannot be derived from any known ground state energy functional, which implies that it is somewhat inconsistent to evaluate the rALDA energy based on LDA orbitals, for example. On the other hand, the situation is certainly improved compared to RPA where the Hartree orbitals comprises the only natural choice, but it would be highly desirable to have a class of xc-energy functionals that yield the renormalized kernels as their second functional derivative with respect to the density. A possible step in this direction is to define an LDA functional where the input density is replaced by a density convoluted with the Fourier transform of the truncation factor \(\theta (2{k}_{{\rm{F}}}-q)\). This yields a ground state xc-energy functional, which leads directly to rALDA kernel if the density dependence of \({k}_{{\rm{F}}}\) is neglected.29 Although such a scheme is approximate, the construction itself provides an intriguing new route to the development of novel xc-energy functionals that incorporate non-local effects through weighted density averaging.