1 Introduction

The existence of a short-distance scale \(Q\gg \varLambda _\mathrm{QCD}\) and infrared (IR) finiteness are important requirements for the application of perturbation expansions to strong interaction (QCD) processes, but they are not always enough in practice. The perturbative series is divergent, most likely asymptotic. One of the sources of divergent behaviour, called IR renormalon [1,2,3,4,5], arises from the sensitivity of the process to the inevitable long-distance scale \(\varLambda _\mathrm{QCD}\). The degree of IR sensitivity limits the ultimate accuracy of the perturbative approximation. For the pole mass of a heavy quark, this observation has been exceptionally important for particle physics phenomenology, leading to a better understanding of quark mass renormalization at scales of order and below the mass of the quark, and to much improved precision in heavy quark and quarkonium physics.

The quark two-point function

$$\begin{aligned}&\int d^4 x \,e^{i p x}\,\langle \varOmega |T(q_{a}(x)\bar{q}_b(0)) |\varOmega \;\rangle \, {\mathop {\rightarrow }\limits ^{p^2\rightarrow m^2}}\nonumber \\&\quad \delta _{ab} \,Z \, \frac{i (\not p +m)}{p^2-m^2}+ \text{ less } \text{ singular } \end{aligned}$$
(1)

has a poleFootnote 1 to any order in the perturbative expansion. The location of the pole in the complex \(p^2\) plane defines the pole mass of the quark, m. The pole is shifted off the real axis by a small amount due to the weak decay of the quark, but for the discussion in this article the imaginary part is not relevant and hence will be ignored. The pole mass of a quark is IR-finite [6, 7]. It can be related to other renormalized quark mass definitions order by order in perturbation theory. It is nevertheless not physical, as quarks do not exist as free, asymptotic particle states and the scattering matrix of QCD does not exhibit a pole at \(m^2\).

It is intuitively obvious that the strong IR physics of QCD, which is not captured by perturbation theory, should contribute an amount \(\varLambda _\mathrm{QCD}\) to hadron masses. For a meson state M composed of \(\bar{q}_i q_j\),

$$\begin{aligned} M=m_i+m_j + \text{ const } \times \varLambda _\mathrm{QCD} + \cdots , \end{aligned}$$
(2)

which renders the notion of the pole mass useless for light quarks with masses \(m_i \lesssim \varLambda _\mathrm{QCD}\). However, for mesons containing heavy quarks with \(m_Q\gg \varLambda _\mathrm{QCD}\), the pole mass provides a first approximation to the meson mass up to power corrections of relative order \(\varLambda _\mathrm{QCD}/m_Q\). Starting from, say, the \(\overline{\mathrm{MS}}\) mass of the heavy quark, the pole mass represents the perturbatively calculable leading-power approximation to the meson mass, just as the perturbative calculation in terms of quarks and gluons of the total \(e^+ e^- \rightarrow \) hadrons cross section at high energy does to the physical hadroproduction cross section.

There is a deep connection between power corrections and IR renormalon divergence of the QCD perturbative expansion. The existence of a linear power correction is related to the strong IR renormalon divergence of the pole mass series that was discovered in [8, 9] and is the subject of this article. In consequence, while the pole mass of a quark appeared as the natural choice for processes involving nearly on-shell heavy quarks, the concept has since largely been abandoned in precision calculations in favour of alternative (leading) renormalon-free mass definitions.

2 Pole mass series

2.1 Basic definitions

The pole mass m is related to the \(\overline{\mathrm{MS}}\) mass by

$$\begin{aligned} m= & {} m(\mu _m) \bigg (1+\sum _{n=1}^\infty c_{n}(\mu ,\mu _m,m(\mu _m)) \,\alpha _s^{n}(\mu ) \bigg ).\quad \end{aligned}$$
(3)

Here \(\alpha _s(\mu )\) is the \(\overline{\mathrm{MS}}\) coupling at scale \(\mu \) in QCD with \(n_l\) light quarks, and \(m(\mu _m)\) stands for the heavy quark \(\overline{\mathrm{MS}}\) mass evaluated at the scale \(\mu _m\). In the following I will often set \(\mu _m=\overline{m}\), where \(\overline{m}\) refers to the \(\overline{\mathrm{MS}}\) mass, evaluated self-consistently at the scale equal to the mass itself, i.e.

$$\begin{aligned} \overline{m}=m(\overline{m})\,. \end{aligned}$$
(4)

We shall see below that the series (3) diverges for any value of \(\alpha _s\not =0\). Although a proof does not exist for QCD, it is reasonable to assume that it is asymptotic, and approximates the heavy-meson mass up to exponentially small terms in \(\alpha _s\), equivalent to power corrections in \(\varLambda _\mathrm{QCD}/\overline{m}\). Asymptotic expansions can sometimes be summed using the Borel transform. Given a power series

$$\begin{aligned} f(\alpha _s)=\sum _{n=1}^{\infty } c_n \alpha _s^n\,, \end{aligned}$$
(5)

the corresponding Borel transform is defined by

$$\begin{aligned} B[f](t)=\sum _{n=0}^{\infty } c_{n+1}\,\frac{t^n}{n!}\,. \end{aligned}$$
(6)

By convention, the tree-level term “1” in (3) is excluded from the definition. A factorially divergent series of the form

$$\begin{aligned} r_n = K a^n \varGamma (n+1+b) \end{aligned}$$
(7)

has the Borel transform

$$\begin{aligned} B[f](t) = \frac{K \varGamma (1+b)}{(1-a t)^{1+b}} \end{aligned}$$
(8)

with a singularity at \(t=1/a\). The Borel integral

$$\begin{aligned} I[f]\equiv \int _0^\infty dt\,e^{-t/\alpha _s}\, B[f](t) \end{aligned}$$
(9)

has the same series expansion as \(f(\alpha _s)\) and provides the exact result under suitable conditions. However, for our case of interest, there will be a singularity on the integration contour, rendering the Borel integral as given ill-defined. Deformations of the contour around the pole or branch cut, or the principal-value prescription, result in the ambiguity

$$\begin{aligned} \delta f \equiv |\text{ Im }\,I[f]| = \frac{\pi |K|}{a}\,e^{-1/(a\alpha _s)}\, (a\alpha _s)^b\qquad \text{(if } a>0\text{) } \nonumber \\ \end{aligned}$$
(10)

of the Borel integral, which has the form of a power correction proportional to \(\varLambda _\mathrm{QCD}^{-2\beta _0/a}\). The QCD \(\beta \)-function is defined as \(\beta (\alpha _s) \equiv \mu ^2\partial \alpha _s(\mu )/\partial \mu ^2 = \beta _0 \alpha _s^2+\beta _1\alpha _s^3+\cdots \) with \(\beta _0 = -(11-2 n_f/3)/(4\pi )\). This ambiguity provides a quantitative measure of the limit to the accuracy of a purely perturbative calculation.

The linear power correction to the pole mass therefore corresponds to \(a=-2\beta _0\). More generally, the pole mass can be regarded as the first term of an asymptotic expansion of the meson mass in powers of \(\alpha _s\) and \(\varLambda _\mathrm{QCD}/\overline{m}\), which in modern language has a trans-series structure (again, no proof).Footnote 2

2.2 Linear IR sensitivity and the large-\(n_f\) approximation

To gain intuition, we start from the leading IR renormalon divergence of the one-loop correction to the pole mass with fermion-loop insertions into the gluon line, often referred to as the large-\(n_f\) approximation. The relevant expression is

$$\begin{aligned} \varDelta m\equiv & {} m-m(\mu ) = (-i)g_s^2 C_F \mu ^{2\epsilon }\nonumber \\&\times \int \frac{d^d k}{(2\pi )^d}\,\frac{\gamma ^\mu (\not p+ \not k+m)\,\gamma ^\nu }{k^2\,((p-k)^2-m^2)}\big |_{p^2=m^2} \nonumber \\&\times \left( g_{\mu \nu }-\frac{k_\mu k_\nu }{k^2}\right) \,\sum _{n=0}^\infty \left[ \beta _0\alpha _s \ln \left( \frac{-k^2 e^{-5/3}}{\mu ^2}\right) \right] ^{ n}\nonumber \\&+ \text{ counterterms }. \end{aligned}$$
(11)

The all-order \(\overline{\mathrm{MS}}\) counterterms can be found in [8]. They do not diverge factorially, and can therefore be ignored when discussing the large-n behaviour. Strictly speaking, the fermion-loop insertions provide only the \(n_f\)-dependent part of \(\beta _0\) in (11). In full QCD, consistency of the trans-series interpretation of short-distance expansions requires the full expression for \(\beta _0\), as will be seen below. The diagrammatic recovery of the full \(\beta _0\) is discussed in [10]. The substitution of the full \(\beta _0\) in fermion bubble-chain diagrams is often referred to as “naive non-abelianization” [12, 13].

For \(p^2=m^2\) the integral scales as \(d^4 k/k^3\) for small k. It is thus IR finite, but the contributions from k smaller than \(\varLambda _\mathrm{QCD}\), where perturbation theory is not valid, is of order \(\varLambda _\mathrm{QCD}\). That this should imply that the pole mass cannot be defined to better accuracy than \({\mathcal {O}}(\varLambda _\mathrm{QCD})\) was noted in [14]. The connection to the IR renormalon divergence of the perturbative expansion was established shortly after [8, 9]. Indeed, the increasing power of logarithms in (11) enhance the IR region and yield (after Wick rotation)

$$\begin{aligned} \int _0^\lambda dk\,\ln ^n\left( \frac{k^2}{\mu ^2}\right) {\mathop {=}\limits ^{n\gg 1}} (-2)^n n!\, \end{aligned}$$
(12)

with typical \(k\sim \mu \,e^{-n}\). One can also take the Borel transform of (11), sum the series, which yields an effective gluon propagator. The exact expression for the Borel transform of \(m-m(\mu )\) can be found in [8]. Approximating the integrand to its leading term in the small-k behaviour is sufficient to obtain the dominant IR renormalon singularity at \(t=-1/(2 \beta _0)\) (closest to the origin of the Borel plane), resulting in

$$\begin{aligned} B[\varDelta m] = \frac{C_F e^{5/6}}{\pi }\, \mu \,\frac{1}{1+2\beta _0 t}\,. \end{aligned}$$
(13)

The corresponding asymptotic behaviour of the series expansion in \(\alpha _s=\alpha _s(\mu )\) is

$$\begin{aligned} m_\mathrm{pole}-m_{\overline{\mathrm{MS}}}(\mu ) = \frac{C_F e^{5/6}}{\pi }\, \mu \,\sum _n (-2\beta _0)^n\,n!\,\alpha _s^{n+1}.\nonumber \\ \end{aligned}$$
(14)

As expected, with \(a=-2\beta _0>0\) the ambiguity of the Borel integral and hence of the pole massFootnote 3 is proportional to \(\varLambda _\mathrm{QCD}\), independent of the arbitrary renormalization scale \(\mu \).

2.3 Exact characterization of the divergence

The relation of IR and ultraviolet (UV) renormalon divergence with the small- and large-momentum behaviour of Feynman diagrams, respectively, allows for a precise characterization of the corresponding singularities in the Borel transform in terms of the factorization properties of observables and correlation functions in these limits. In asymptotically free, renormalizable field theories the UV renormalon singularities occur at \(t=n/\beta _0<0\) and can be related to local operators of dimension \(4+2 n\) in the regularized theory expanded for large values of a dimensionful cut-off [15] (see [16] for the case of QCD). If the theory has power UV divergences, as is usually the case for effective field theories, the UV renormalon singularities extend into the positive real axis of the Borel plane. Likewise, the IR behaviour of correlation functions is often amenable to expansions in the ratio of \(\varLambda _\mathrm{QCD}\) and a hard scale, in which case the IR renormalons at \(t=-n/\beta _0>0\) are related to higher dimensional terms in the operator product expansion (OPE). For example, for the two-point function of two vector currents,

$$\begin{aligned} \varPi (Q)&= C_0(\alpha _s,Q/\mu ) + \frac{1}{Q^4}\,C_{GG}(\alpha _s,Q/\mu )\, \langle \frac{\alpha _s}{\pi } G^2\rangle (\mu )\nonumber \\&\quad + O(1/Q^6), \end{aligned}$$
(15)

the position \(t=-2/\beta _0\) of the leading IR renormalon divergence of its perturbative series \(C_0(\alpha _s,Q/\mu )\) is determined by the dimension (four) of the gluon condensate correction [2,3,4,5]. For both, UV and IR renormalons, the parameter b in (8) is determined by the anomalous dimension(s) of the relevant operators. Through renormalization-group equations (RGEs), one can determine the \(\alpha _s\) dependence of the ambiguity of the Borel integral and thus determine 1/n corrections to the leading large-order behaviour in terms of OPE coefficient functions, the anomalous dimensions of all operators of a given dimension, and the beta function coefficients [17].Footnote 4 This leads to the remarkable conclusion that the singular points of the Borel transform due to IR and UV renormalons can be completely specified, except for a set of normalization constants K in (8), whose number matches (at most) the number of operators. They appear as initial conditions of the RGE and should be viewed as non-perturbative [17, 18].

The application of these ideas to the large-order behaviour of the pole mass series exhibits some unique features a) due to the linear IR sensitivity, the IR renormalon divergence is particularly strong and dominates over the sign-alternating UV renormalon series; b) the leading IR renormalon singularity at \(t=-1/(2\beta _0)\) involves only a single operator and therefore a single unknown normalization constant; c) the operator has vanishing anomalous dimension; hence, the sub-asymptotic \(1/n^k\) corrections are determined only from the QCD beta-function, which is known to high order in perturbation theory.

The derivation of these statements in [19] builds on the observation [8] that the leading IR renormalon in the pole mass is related to an UV renormalon pole at the same position \(t=-1/(2\beta _0)\) of the self-energy \(\varSigma ^\mathrm{static}\) of the static quark field in heavy quark effective theory (HQET) with Lagrangian

$$\begin{aligned} {\mathcal {L}}_\mathrm{eff} = \bar{h}_v iv\cdot D h_v + {\mathcal {L}}_\mathrm{light}\,. \end{aligned}$$
(16)

This UV renormalon pole exists, because in contrast to full QCD, the static self-energy is linearly UV divergent. The only operator with the required mass dimension three is \(\bar{h}_v h_v\). It follows that the imaginary part of the Borel integral of \(\varSigma ^\mathrm{static}\) is given by

$$\begin{aligned} \text{ Im }\,I[\varSigma ^\mathrm{static}](\alpha _s,p,\mu ) = E(\alpha _s,\mu )\,\varSigma ^\mathrm{static}_{\bar{h} h}(\alpha _s,p,\mu ),\nonumber \\ \end{aligned}$$
(17)

where \(\varSigma ^\mathrm{static}_{\bar{h} h}\) is the static self-energy with a zero-momentum insertion of \(\bar{h}_v h_v\). The coefficient \(E(\alpha _s,\mu )\) satisfies the RGE

$$\begin{aligned} \left( \mu ^2\frac{\partial }{\partial \mu ^2}+\beta (\alpha _s) \frac{\partial }{\partial \alpha _s}-\gamma _{\bar{h}_v h_v}(\alpha _s) \right) E(\alpha _s,\mu ) = 0\,.\nonumber \\ \end{aligned}$$
(18)

However, \(\gamma _{\bar{h}_v h_v}\) vanishes to all orders in perturbation theory, since \(\bar{h}_v h_v\) is the conserved heavy quark number current of HQET. This justifies statements b) and c). One then shows that [19]

$$\begin{aligned}&\text{ Im }\,I[\varDelta m] = -E(\alpha _s,\mu ) = \mathrm{const}\nonumber \\&\quad \times \mu \,\exp \left( \int _{\alpha _s} dx\frac{1}{2\beta (x)}\right) = \mathrm{const}\times \varLambda _\mathrm{QCD}. \end{aligned}$$
(19)

The \(\alpha _s\)-dependence of the imaginary part of the Borel integral determines the large-order behaviour of the perturbative expansion of \(\varDelta m\) according to (7, 9) up to a single normalization constant, N. Defining

$$\begin{aligned}&c_{n}(\mu ,\mu _m,m(\mu _m)) \underset{n\rightarrow \infty }{\longrightarrow }N c^{(\mathrm as)}_{n}(\mu ,m(\mu _m)) \nonumber \\&\quad \equiv N \frac{\mu }{m(\mu _m)}\,\tilde{c}^{(\mathrm as)}_{n}\,, \end{aligned}$$
(20)

the result is

$$\begin{aligned} \tilde{c}_{n+1}^{(\mathrm as)}= & {} (-2\beta _0)^n\, \frac{\varGamma (n+1+b)}{\varGamma (1+b)} \Bigg [1+\frac{s_1}{n+b}\nonumber \\&+\frac{s_2}{(n+b)\,(n+b-1)} \nonumber \\&+\,\frac{s_3}{(n+b)\,(n+b-1)\,(n+b-2)} + \cdots \Bigg ],\nonumber \\ \end{aligned}$$
(21)

where [19, 20] \(b=-\beta _1/(2\beta _0^2)\) and

$$\begin{aligned} s_1= & {} \left( -\frac{1}{2\beta _0}\right) \left( -\frac{\beta _1^2}{2 \beta _0^3}+\frac{\beta _2}{2\beta _0^2}\right) , \end{aligned}$$
(22)
$$\begin{aligned} s_2= & {} \left( -\frac{1}{2\beta _0}\right) ^2\left( \frac{\beta _1^4}{8 \beta _0^6}+\frac{\beta _1^3}{4\beta _0^4}-\frac{\beta _1^2\beta _2}{4 \beta _0^5}-\frac{\beta _1\beta _2}{2\beta _0^3}\right. \nonumber \\&\left. +\frac{\beta _2^2}{8 \beta _0^4}+\frac{\beta _3}{4\beta _0^2}\right) , \end{aligned}$$
(23)
$$\begin{aligned} s_3= & {} \left( -\frac{1}{2\beta _0}\right) ^3\bigg ( -\frac{\beta _1^6}{48 \beta _0^9} -\frac{\beta _1^5}{8\beta _0^7} -\frac{\beta _1^4}{6 \beta _0^5} +\frac{\beta _1^4\beta _2}{16 \beta _0^8}\nonumber \\&+\frac{3 \beta _1^3 \beta _2}{8 \beta _0^6} +\frac{\beta _1^2 \beta _2}{2 \beta _0^4} -\frac{\beta _1^2 \beta _2^2}{16 \beta _0^7} -\frac{\beta _1^2 \beta _3}{8 \beta _0^5} -\frac{\beta _1 \beta _2^2}{4 \beta _0^5}\nonumber \\&-\frac{\beta _1 \beta _3}{3 \beta _0^3} +\frac{\beta _2^3}{48\beta _0^6} -\frac{\beta _2^2}{6\beta _0^3}+\frac{\beta _2 \beta _3}{8 \beta _0^4} +\frac{\beta _4}{6 \beta _0^2} \bigg )\,. \end{aligned}$$
(24)

The pole mass series is particularly simple, because the large-order behaviour is completely determined in terms of the \(\beta \)-function coefficients. Since the five-loop beta-function coefficient \(\beta _4\) is now known [21,22,23], the sub-asymptotic behaviour including \(1/n^3\) corrections is known. Numerically, for the interesting cases discussed below, the corrections to the leading large-n behaviour do not exceed 3% for \(c_4\).

2.4 The top, bottom and charm mass series

The normalization constant N in (20) cannot be determined exactly by purely perturbative methods. However, given that the pole-\(\overline{\mathrm{MS}}\) mass relation (3) is known to the four-loop order [24, 25] and the asymptotic behaviour is known including \(1/n^3\) corrections, one may attempt to match the two at \(n=4\). In other words, while

$$\begin{aligned} N= & {} \lim _{n\rightarrow \infty } \frac{c_{n}(\mu ,\mu _m,m(\mu _m))}{c^{(\mathrm as)}_n(\mu ,m(\mu _m))}\,, \end{aligned}$$
(25)

we evaluate the ratio for \(n=4\) and check the stability of the result by comparison with \(n=3\) [20]. In the following, all \(n_l\) quarks other than the heavy quark will be assumed to be massless. The effect of internal quark mass effects will be discussed below. The coupling \(\alpha _s\) in (3) is the \(\overline{\mathrm{MS}}\) coupling in the \(n_l\)-flavour theory. Since the IR theories are different, N is expected to depend on \(n_l\).

It is interesting to apply this method to the large-\(n_l\) limit (more precisely, \(n_l\rightarrow -\infty \)). In this limit, \(b=s_i=0\), and N can be calculated exactly from (14) to be

$$\begin{aligned} \lim _{|n_l| \rightarrow \infty } N=\frac{C_F}{\pi } \times e^{\frac{5}{6}}\,, \end{aligned}$$
(26)

which equals 0.97656 (\(C_F=4/3\) for \(N_c=3\)). This can be compared to evaluating (25) for \(n=4\) at \(\mu =\mu _m= \overline{m}\), which gives 0.971 in very good agreement with the exact result. According to (20), the dependence of the asymptotic behaviour on \(\mu \) and \(\mu _m\) is very simple since the logarithms of \(\mu \) in perturbation theory must exponentiate to powers asymptotically. The approximate determination of N is most accurate when choosing \(\mu \approx \overline{m}\) and exhibits a plateau around this value [20].

With this validation of the method, we consider the pole to \(\overline{\mathrm{MS}}\) mass series for the top, bottom and charm quark, corresponding to \(n_l=5,4,3\), respectively. For a detailed analysis of the top-quark case, see [20]. Setting \(\mu =\mu _m= \overline{m}_Q\) for \(Q=t,b,c\), one finds

$$\begin{aligned} N = 0.4606 \text{(top) },~0.5048 \text{(bottom) },~0.5366 \text{(charm) }\,,\nonumber \\ \end{aligned}$$
(27)

where the tiny difference relative to [20] is due to the inclusion of \(s_3\), which was not fully available at the time (lack of \(\beta _4\)). A conservative estimate of the uncertainty of these values from the independent variations of \(\mu \) and \(\mu _m\) is \(\pm 10\%\) (error symmetrized, see [20]), but the accuracy of the large-\(n_l\) result at \(\mu =\mu _m=\overline{m}\) suggests that it may be considerably smaller. It is worth noting that N is only half as large as the large-\(n_l\) result for physical values of \(n_l\), implying that the intrinsic ambiguity of the pole mass is smaller than inferred from the one-loop correction dressed by fermion loops.

To display the numerical properties of the series, I use the \(\overline{\mathrm{MS}}\) mass values \(\overline{m}_t=163.643\,\)GeV, \(\overline{m}_b=4.20\,\)GeV and \(\overline{m}_c=1.28\,\)GeV. The strong coupling is taken to be \(\alpha _s^{(5)}(m_Z)=0.1180\) at the scale \(m_Z=91.1876~\)GeV and evolved with five-loop accuracy to \(\overline{m}_Q\) including the flavour thresholds at \(2\mu _b = 9.6\,\)GeV and \(2\mu _c=3\,\)GeV with the help of RunDec [26, 27]. The series coefficients are then evaluated at \(\mu =\mu _m=\overline{m}_Q\) in an expansion in \(\alpha _s^{(5)}(\overline{m}_t)=0.1084\), \(\alpha _s^{(4)}(\overline{m}_b)=0.2246\), \(\alpha _s^{(3)}(\overline{m}_c)=0.3889\), for top, bottom and charm, respectively. The result is

$$\begin{aligned} m_t= & {} 163.643+7.531+1.606+0.494 + 0.194\nonumber \\&\quad + { 0.111 + 0.079} \nonumber \\&{ +0.066}+\mathbf{0.064}+{ 0.070+0.087}\nonumber \\&\quad +{ 0.119+0.178} +\cdots \,\mathrm{GeV} \end{aligned}$$
(28)
$$\begin{aligned} m_b= & {} 4.200+0.400+0.199+0.145 + \mathbf{0.135} \nonumber \\&\quad + { 0.177 + 0.284} \nonumber \\&{ +0.539+1.185}+\cdots \,\mathrm{GeV} \end{aligned}$$
(29)
$$\begin{aligned} m_c= & {} 1.280+0.211+\mathbf{0.202}+0.282 + 0.510 \nonumber \\&\quad + { 1.259 + 3.798}+\cdots \,\mathrm{GeV} \end{aligned}$$
(30)

for the series expansion of the mass conversion formula. In these expressions the first five numbers correspond to the five exactly known terms including the four-loop order, and the subsequent numbers in italics are obtained from the asymptotic formula (21). By construction, the asymptotic formula agrees with the exact one for the fifth term. The minimal term of the series is highlighted in bold face. The asymptotic formula corresponds to the “prediction”

$$\begin{aligned}&c_5 = 45.43 \text{(top }, n_l=5),~73.69 \text{(bottom }, n_l=4),\nonumber \\&\quad ~110.56 \text{(charm }, n_l=3) \end{aligned}$$
(31)

at \(\mu =\mu _m=\overline{m}_Q\) for the presently unknown five-loop conversion coefficient. The behaviour of the series is illustrated in Fig. 1, including an extrapolation of the asymptotic formula to all \(n>0\). It is apparent that it is already accurate at the three-loop order, \(n=3\).

Fig. 1
figure 1

Graphical representation of the series (28), (29), (30), exhibiting the typical dependence of the size of terms with order of a factorially divergent series. The (blue) circles for \(n<5\) represent exactly known terms, the (yellow) squares the asymptotic formula applied to all \(n>0\)

Comparing (28), (29), (30), we observe that the top mass series attains its smallest term at the eighth order in perturbation theory, far beyond the four-loop order currently known. On the other hand, the bottom series reaches its minimal term at this order, while the charm series starts to diverge from the two-loop order, which renders the charm pole mass of limited use for phenomenology. From a pragmatic point of view, the minimal term represents the ultimate accuracy beyond which the purely perturbative use of the pole quark mass ceases to be meaningful. The minimal term scales as \(\sqrt{\alpha _s(\overline{m}_Q)} \,\varLambda _\mathrm{QCD}\) and decreases with larger \(m_Q\), which reflects the fact that the minimum is shallower in this case. A renormalization-group invariant measure of the intrinsic limitations of the concept of the pole mass can be defined in terms of the ambiguity (10) of the Borel integral of the series,

$$\begin{aligned} \delta m_Q = \frac{\pi N}{|2\beta _0|}\times \varLambda _\mathrm{QCD}^{(n_f)}\,, \end{aligned}$$
(32)

exactly proportional to \(\varLambda _\mathrm{QCD}\), as it should be. Dividing by \(\pi \) gives a numerical value close to the minimal term for the top mass series, and this definition of ultimate accuracy has been adopted in [20].

Determining the non-perturbative normalization N of the leading pole mass renormalon singularity from matching to the highest known order is perhaps the simplest and most intuitive, but not the only method that has been suggested. I refer to [28,29,30,31,32,33,34,35] for other work, noting that earlier work did not have access to four-loop accuracy. As will be seen in Sec. 3.4 below, N is related to a similar leading renormalon constant of the static heavy quark potential by a factor \(-1/2\) [36,37,38]. Some of the quoted works apply this relation to infer N from an analysis of the series expansion of the colour-singlet Coulomb potential. The normalization constants are much more difficult to obtain when there is more than one constant, or an interference of sign-alternating UV and fixed-sign IR renormalon behaviour, as is the case for generic observables. In this case, one can resort to simplified parameterizations of the Borel transform as was done for \(\tau \)-decay spectral moments [39], but the level of rigour and precision that has been achieved for the pole mass of a heavy quark is unmatched by any other series in QCD.

2.5 Internal quark mass effects

The analysis assumed up to now that the lighter quarks are massless. In low orders of perturbation theory, this is often a good approximation, especially for the top quark pole mass, where \(m_c, m_b\ll m_t\). (We do not consider the effect of the up, down and strange quark mass and always neglect quark masses smaller than \(\varLambda _\mathrm{QCD}\).) However, in the regime where the series is dominated by the leading renormalon divergence, the typical loop momentum at order \(\alpha _s^{n+1}\) is of order \(m_Q e^{-n}\). Internal quark mass effects from the bottom and charm quark are expected to become important in higher orders. The minimal term of the series is attained when the typical loop momentum is of order \(\varLambda _\mathrm{QCD}\). At this scale the theory is a theory of three massless flavours, independent of whether the heavy quark was the top, bottom or charm quark. Hence, the true large-n behaviour of the series beyond the minimal terms is always determined by the \(n_l=3\) result, and likewise the ambiguity (32) should involve the \(\varLambda \)-parameter \(\varLambda _\mathrm{QCD}^{(3)}\) in the three-flavour scheme, always excluding the bottom and charm quark, independent of \(n_l\).

The decoupling of internal quarks with masses \(m_q\) larger than \(\varLambda _\mathrm{QCD}\) in the renormalon asymptotic behaviour was studied analytically and numerically in the large-\(n_l\) limit [40], and the described behaviour has been demonstrated. More precisely, the analysis showed that the asymptotic behaviour of the series in a theory with \(n_l\) quarks of which \(n_m\) are massive, approaches the series of the theory with \(n_l-n_m\) massless quarks when both are expressed in terms of the \(\overline{\mathrm{MS}}\) coupling \(\alpha _s^{(n_l-n_m)}(m_Q)\) in the \(n_l-n_m\) flavour scheme. However, as noted above the large-\(n_f\) limit overestimates the normalization of the leading renormalon by about a factor of two. Furthermore, one is rarely interested in the formal large-n behaviour of the series beyond the minimal term, but rather in the approach to it. At such intermediate orders, the typical loop momentum crosses the flavour thresholds, as the order of perturbation theory increases, and the internal quark masses are neither negligible nor decoupled.

The issue is especially relevant for the top quark, since the masses of the bottom and charm quark are too small in relation to \(m_t\) to express the entire series in terms of the four- or three-flavour coupling. In contrast, one may argue [32] that the bottom pole to \(\overline{\mathrm{MS}}\) mass conversion factor should be expressed in terms of \(\alpha _s^{(3)}(m_b)\) rather than the four-flavour coupling \(\alpha _s^{(4)}(m_b)\). For the two- and three-loop coefficients, for which the mass dependence is known [41,42,43], this substitution indeed renders the charm mass effect almost negligible. A quantitative investigation of bottom and charm mass effects on the top pole mass series was undertaken in [20, 44]. The following discussion is adapted from [20].

I first recall that the numerical series (28), (29), (30) for \(m_Q\) include internal loops of Q, but are expressed in terms of \(\alpha _s^{(n_l)}(\overline{m}_Q)\), where \(n_l\) is the number of massless quarks, including bottom and charm for the case of \(Q=\,\)top (\(n_l=5\)). To estimate the effect of the finite bottom and charm mass, we switch from the five- to the four-flavour scheme at the order, where the typical internal loop momentum is of order \(m_b\), which is \({{\mathcal {O}}}(\alpha _s^5)\), and from the four- to the three-flavour scheme at \({{\mathcal {O}}}(\alpha _s^6)\). Since the mass effect is not known for \(c_4\) at the four-loop order, and since \(c_n\) beyond the four-loop order can only be estimated assuming dominance of the first renormalon (as done above), this implies the following procedure: (a) at two and three loops, we include the known mass dependence, but \(c_4\) is approximated by the massless value. For given top \(\overline{\mathrm{MS}}\) mass, this increases the top pole mass by 11 (2-loop) + 16 (3-loop) MeV, adopting \(\overline{m}_b=4.2\) GeV and \(\overline{m}_c=1.28\) GeV, out of which 8.1 (2-loop) + 11.2 (3-loop) MeV are due to the finite bottom mass, and 2.4 (2-loop) + 4.6 (3-loop) from charm.Footnote 5 Since the \(c_n\) increase as \(n_l\) decreases, the mass effect is also expected to be positive in higher orders. Hence approximating \(c_4\) by its massless value underestimates the mass effect. (b) At the five-loop order, we use \(c_5^{(\mathrm as)} [\alpha _s^{(4)}(\overline{m}_t)]^5\) with \(c_5^{(\mathrm as)}\) determined by matching to the exactly known four-loop coefficient for \(n_l=4\), that is with normalization \(N_m=0.5048\) and beta-function coefficients for the four-flavour theory. (c) Beyond five loops, the remainder of the series is computed with the three-flavour scheme coupling \(\alpha _s^{(3)}(\overline{m}_t)\) and normalization \(N_m=0.5366\). Since the bottom and charm quarks are not yet completely decoupled at the five- to seven-loop order, and since an extra quark flavour decreases the \(c_n\), we expect that (b) and (c) overestimate the mass effect, since the approximation assumes that bottom and charm are already decoupled completely. The sum of (b) and (c) adds another 54 MeV to the top pole mass, such that the total mass effect is estimated to be 80 MeV. Explicitly, the series (28) is modified to

$$\begin{aligned} m_t= & {} 163.643+7.531+1.616+0.510 + 0.194\nonumber \\&+ { 0.140 + 0.106} \nonumber \\&+\,\mathbf{0.094} + { 0.096 +0.112+0.145}\nonumber \\&+{ 0.209+0.329} +\cdots \,\mathrm{GeV}, \end{aligned}$$
(33)

where the increasing importance of finite-quark mass effects with order is evident. In case of the top quark pole mass, the decoupling of the bottom and charm quark in internal loops increases the intrinsic uncertainty of the pole mass concept by almost 50% due to the more rapid divergence of the series in the three-massless flavour theory. Note that this ambiguity is independent of the precise value of the bottom and charm mass, as long as \(m_b, m_c \gg \varLambda _\mathrm{QCD}\). This also implies that it is the same for any heavy quark, including the bottom quark, since it depends only on the infrared properties of the theory, which is QCD with three approximately massless flavours.

Since the bottom quark is neither heavy enough to be decoupled in low orders, nor light enough to be ignored, where in both cases a massless approximation can be justified, there is an inherent uncertainty in the above estimate. However, as argued above, the errors in the approximations are expected to go in opposite directions, hence we consider \((80\pm 30)\) MeV a conservative estimate of the total internal bottom and charm quark mass effect on the top pole mass. The 30 MeV error estimate arises from an estimate of the neglected mass effect on \(c_4\) by extrapolation from the known lower orders. The approximation described here has been checked to work well in models for the series inspired by the large-\(n_l\) limit.

2.6 Finite width

With the electroweak interaction turned on, the heavy quarks become unstable. In perturbation theory, the pole of the heavy-quark propagator is shifted to

$$\begin{aligned} m_*^2 = m^2-i m\varGamma (m)\,, \end{aligned}$$
(34)

which defines the pole mass m and on-shell decay width \(\varGamma \). Unlike the quark mass, the width is not a parameter of the Standard Model (SM)—for heavy quarks it can be computed in perturbation theory in terms of m and other SM parameters.

The width is negligibly small compared to m except for the top quark, where \(\varGamma _t \approx 1.4~\)GeV. The large width \(\varGamma _t \gg \varLambda _\mathrm{QCD}\) does not eliminate the renormalon divergence of the top pole mass, as was emphasized in [45]. This does not mean that the large width is not relevant, since it does provide a cut-off on IR effects for physical observables. For example, measurements on jets containing top quarks are generically linearly sensitive to \(\varLambda _\mathrm{QCD}\) and accordingly display a strong renormalon divergence, which can be screened by the sizeable width [46]. In effect, as is intuitive, there is simply no quantity, for which the pole mass of a quark is ever the relevant parameter, once the quark’s width is larger than \(\varLambda _\mathrm{QCD}\).

Interestingly, the on-shell width of a quark, \(\varGamma (m)\), is itself an observable, which is less sensitive to IR physics than the pole mass. When the final state masses can be neglected, \(\varGamma (m) \propto G_F^2 m^5\), where \(G_F\) denotes the Fermi constant. The leading power corrections are of relative order \((\varLambda _\mathrm{QCD}/m)^2\). However, when the series is expressed in terms of the pole mass, an IR renormalon divergence indicating an ambiguity of linear order \(\varLambda _\mathrm{QCD}/m\) appears in the series of QCD corrections to the tree-level width. This ambiguity is spurious and a consequence of using a parameter with stronger IR sensitivity than the observable itself. Once the width of the quark is expressed in terms of the \(\overline{\mathrm{MS}}\) mass or (better) another leading renormalon-free mass definition such as will be discussed in Sect. 3, the leading renormalon is cancelled [47], and the series of loop corrections shows a much better behaviour. This is particularly important for the decay width of the bottom and charm quark, for which \(\varGamma \) could otherwise be obtained only with large uncertainty.

2.7 Beyond the leading renormalon

Much less is known about the renormalon singularities of the pole mass series beyond \(t=-1/(2\beta _0)\). On general grounds one expects a sign-alternating UV renormalon divergence from a singularity at \(t=1/\beta _0\), and an IR renormalon singularity at \(t=-1/\beta _0\) related to the \(\varLambda _\mathrm{QCD}^2/m\) kinetic-energy correction to the meson mass (2). The Borel transform of the series is known exactly in the large-\(n_f\) limit [8]. Table 11 in [10] displays the breakdown of the nth order term into the contributions from the first three IR renormalon and the first UV renormalon poles, and the \(\overline{\mathrm{MS}}\) subtraction terms. At least in the large-\(n_f\) approximation, the subleading poles never contribute more than one per mille of the dominant asymptotics from \(t=-1/(2\beta _0)\) for n beyond the four-loop order. For practical purposes, dealing with the leading singularity appears to be enough.

Curiously, the Borel transform in the large-\(n_f\) limit does not exhibit the expected next IR renormalon singularity at \(t=-1/\beta _0\). The authors of [8] speculated that Lorentz invariance might forbid a quadratic power-divergent mixing of the kinetic energy operator \(\bar{h}_v (iD_\perp )^2 h_v\) into \(\bar{h}_v h_v\), in which case there would be no matrix element to compensate the ambiguity of the Borel transform from \(t=-1/\beta _0\), and hence it should be absent. Invoking the virial theorem of HQET, the power-divergent mixing of the kinetic energy operator was related to the one of \(\bar{h}_v i g_s G^{\mu \nu } h_v\) into \(\bar{h}_v h_v\) [48]. This work confirms that Lorentz invariance forbids one-loop mixing, which explains the absence of the \(t=-1/\beta _0\) singularity in the large-\(n_f\) limit, but also showed that there is no reason for this to hold beyond this limit. Recent investigations [49] of a remainder series with the leading renormalon subtracted show sign alternation more fitting to UV renormalon behaviour. The question whether the subleading renormalon singularity at \(t=-1/\beta _0\) is absent or simply suppressed by a loop factor, is therefore still undecided.

3 Renormalon-free “on-shell” masses

Many important properties of heavy quarks are less IR sensitive than the pole quark mass itself—for example, the inclusive decay width discussed above, or the production cross section of a heavy quark-antiquark pair. Their perturbative expansions do not display an IR renormalon singularity at \(t=-1/(2\beta _0)\), leading to rapid divergence, provided they are not expressed in terms of the pole mass. In other words, although the pole mass is IR finite, it is for many purposes not a useful renormalized mass parameter. Instead, one should use a renormalization convention that is not only IR finite but also insensitive to the IR at least at linear order \(\varLambda _\mathrm{QCD}\).

The \(\overline{\mathrm{MS}}\) definition suggests itself. However, in physical systems where heavy quarks are nearly on-shell and have primarily soft fluctuations, the \(\overline{\mathrm{MS}}\) mass is not the appropriate choice. Being essentially a bare object, it does not include the short-distance fluctuations, which should have been integrated out to describe soft heavy-quark systems. In practice, this means that the \(\overline{\mathrm{MS}}\) mass value is too far away (by \(\mathcal {O}(\overline{m}_Q\alpha _s)\)) from the pole of the heavy-quark propagator. While the spurious pole mass renormalon is eliminated and the asymptotic behaviour improved, there still appear large corrections in low orders. It does not help to evolve the \(\overline{\mathrm{MS}}\) mass \(m(\mu )\) to scales \(\mu < \overline{m}\), since the \(\overline{\mathrm{MS}}\) quark-mass anomalous dimension applies only to the logarithmic evolution above \(\overline{m}\).

The resolution to the problem consists in quark mass concepts that are numerically closer to the pole mass, yet are constructed such that their perturbative relation to the \(\overline{\mathrm{MS}}\) mass is free from the leading IR renormalon. This has several benefits: 1) The concept is unambiguous, at least up to \(\mathcal {O}(\alpha _s\varLambda _\mathrm{QCD}^2/m)\), which is sufficient for practical purposes. 2) Such masses can be determined accurately from measurements or lattice calculations, and 3) they can be precisely related to the \(\overline{\mathrm{MS}}\) mass. 4) The impact of light internal quark mass effects is reduced, since the leading IR loop momentum contributions have been removed. The \(\overline{\mathrm{MS}}\) mass is then the convenient reference parameter (similar to \(\alpha _s(m_Z)\) for the strong coupling) to which different leading renormalon-free, “on-shell” mass definitions can be related.

3.1 General considerations

We start from the observation that the asymptotic behaviour (20) of the pole to \(\overline{\mathrm{MS}}\) mass conversion (3) has a very simple, exact linear dependence on the coupling renormalization scale \(\mu \), which follows on very general grounds [17], as well as on \(\mu _m\), which appears only through \(m(\mu _m)\). The asymptotic coefficients \(\tilde{c}_{n+1}^{(\mathrm as)}\) themselves in (21) are \(m(\mu _m)\), \(\mu \) and \(\mu _m\) independent. Since the Borel-integral ambiguity of the asymptotic series is always \(\varLambda _\mathrm{QCD}\),Footnote 6 we can replace \(m(\mu _m)\) by another scale \(\mu _f\). We therefore define

$$\begin{aligned} \delta m_X(\mu _f)= & {} \mu _f \sum _{n=1}^\infty s_{n}^X(\mu /\mu _f) \,\alpha _s^{n}(\mu ) \nonumber \\= & {} \mu _f \sum _{n=1}^\infty s_{n}^X\,\alpha _s^{n}(\mu _f)\,. \end{aligned}$$
(35)

The series coefficients \(s_n^X(\mu /\mu _f)\) are polynomials of order \(n-1\) in \(\ln (\mu /\mu _f)\) and must be chosen to satisfy

$$\begin{aligned} s^X_{n}(\mu /\mu _f)&\underset{n\rightarrow \infty }{\longrightarrow }N \frac{\mu }{\mu _f}\,\tilde{c}^{(\mathrm as)}_{n}\,, \end{aligned}$$
(36)

where N and \(\tilde{c}^{(\mathrm as)}_{n}\) are exactly the same as for the coefficients in the pole to \(\overline{\mathrm{MS}}\) mass relation (20) and (21), respectively.Footnote 7 Once such \(s_n^X\) have been found, we can define a leading renormalon-free, “short-distance” mass \(m_X(\mu _f)\) by subtracting \(\delta m_X(\mu _f)\) from the pole mass m:

$$\begin{aligned} m_X(\mu _f)\equiv & {} m- \delta m_X(\mu _f) \nonumber \\= & {} m(\mu _m) + \sum _{n=1}^\infty \left[ m(\mu _m) \,c_{n}(\mu ,\mu _m,m(\mu _m))\right. \nonumber \\&\quad \left. - \mu _f \,s_{n}^X(\mu /\mu _f)\right] \alpha _s^{n}(\mu ) \,.\qquad . \end{aligned}$$
(37)

By construction the leading IR renormalon divergence of the series cancels in the square bracket. This in turn guarantees that the series that relates \(m_X(\mu _f)\) to the \(\overline{\mathrm{MS}}\) mass \(m(\mu _m)\) is well behaved (no leading IR renormalon divergence).

The new scale \(\mu _f\) should be chosen such that \(\varLambda _\mathrm{QCD}\ll \mu _f\ll m\). The first equality is required for perturbativity. The second guarantees that the difference between the pole mass and \(m_X(\mu _f)\) is only of order \(\mu _f\,\alpha _s\) and can be made sufficiently small to avoid the problem with the \(\overline{\mathrm{MS}}\) mass (where \(\mu _f \sim m\)) discussed above. A common feature of all renormalon-free quark mass definitions suitable for the description of nearly on-shell heavy-quark physics is therefore the existence of a new “subtraction scale” \(\mu _f\) and a linear dependence on this scale. This reflects that the running of the quark mass changes from logarithmic to linear below the scale m in accordance with the fact that the self-energy of a point charge is linearly divergent in the static or non-relativistic regime and turns logarithmic only when the anti-particle fluctuations become relevant in the relativistic theory.

The leading renormalon-free masses satisfy a simple renormalization group equation in the subtraction scale \(\mu _f\), which may be used to relate \(m_X(\mu _f)\) at different scales \(\mu _{f2}\), \(\mu _{f1}\), when logarithms of \(\mu _{f2}/{\mu _{f1}}\) might have to be summed. Defining the anomalous dimension \(\gamma _X(\alpha _s)\) through

$$\begin{aligned} \gamma _X(\alpha _s) = - \frac{d m_X(\mu _f)}{d\mu _f}\,, \end{aligned}$$
(38)

the general form (35) of the subtraction term yields

$$\begin{aligned} \gamma _X(\alpha _s(\mu _f))= & {} \sum _{n=1}^\infty s_{n}^X\,\left[ \alpha _s^{n}(\mu _f) + 2 n \alpha _s^{n-1}(\mu _f)\, \beta (\alpha _s(\mu _f))\right] \nonumber \\= & {} s_1^X \alpha _s(\mu _f) +[s_2^X+2 s_1^X\beta _0]\, \alpha _s^{2}(\mu _f)+\cdots \,.\nonumber \\ \end{aligned}$$
(39)

In the following, I discuss several suitable mass definitions with no claim to completeness. With respect to (2), note that any definition of \(\delta m_X(\mu _f)\) automatically yields an unambiguous, renormalon-free definition \({{\bar{\varLambda }}}_X(\mu _f)\) of the \({{\bar{\varLambda }}}\) parameter that appears in the heavy-quark expansion of the heavy meson mass and many other HQET expressions by the rearrangement

$$\begin{aligned} M_Q=[\underbrace{m_Q-\delta m_X(\mu _f)}_{m_X(\mu _f)}]+ [\underbrace{{\bar{\varLambda }}+\delta m_X(\mu _f)}_{{\bar{\varLambda }}_X(\mu _f)}] + \cdots \,.\nonumber \\ \end{aligned}$$
(40)

This can be turned around: any renormalon-free definition of the HQET parameter \({\bar{\varLambda }}\) can be turned into a renormalon-free, “short-distance”, on-shell mass definition.

3.2 RS mass

The renormalon subtracted (RS) mass definition [28] is the first of two schemes, which implement the condition (36) in a very direct way. Namely, for RS, one simply defines the \(s_n^\mathrm{RS}\) to equal the asymptotic coefficients, that is,

$$\begin{aligned} s_{n}^\mathrm{RS}(\mu /\mu _f) = N \frac{\mu }{\mu _f}\,\tilde{c}^{(\mathrm as)}_{n}\,. \end{aligned}$$
(41)

While this expression could be used for any \(\mu /\mu _f\), the implementation proposed in [28] first assumes \(\mu =\mu _f\), in which case

$$\begin{aligned} \delta m_\mathrm{RS}(\mu _f) = \mu _f N \sum _{n=1}^\infty \tilde{c}^{(\mathrm as)}_{n}\, \alpha _s^{n}(\mu _f)\,, \end{aligned}$$
(42)

which by construction subtracts the leading renormalon divergence of the series (3), and then replaces \(\alpha _s(\mu _f)\) by its series expansion in \(\alpha _s(\mu )\), where \(\mu \) is the scale at which the pole to \(\overline{\mathrm{MS}}\) series is evaluated. For the bottom and charm mass, the effectiveness of this subtraction is analyzed in detail in [32], which also discusses variants of this definition.

A drawback of the RS mass definition is that it needs a precise determination of the normalization N, which depends on the method employed and further on the number of light flavours, see (27). To fully define the RS mass, one needs to provide the order to which \(\tilde{c}^{(\mathrm as)}_{n}\) is included according to (21), and specify the value of N.

3.3 MSR mass

Another simple realization of the general subtraction condition (36) that avoids the drawback of the RS mass definition is to set the \(s_n\) equal to \(c_n\) and simply replace \(m(\mu _r)\) in (20) by the subtraction scale \(\mu _f\) [31, 34]. More precisely, for \(\mu =\mu _f\), which is assumed here, we define

$$\begin{aligned} s_{n}^\mathrm{MSR}{}_{|\mu =\mu _f} = c_n(\overline{m},\overline{m},\overline{m})\,, \end{aligned}$$
(43)

where the pole to \(\overline{\mathrm{MS}}\) mass conversion coefficients are evaluated at \(\mu =\mu _r=m(\mu _r)=\overline{m}\), in which case they are pure numbers. This gives the “practical version” [34] of the MSR mass definition

$$\begin{aligned} \delta m_\mathrm{MSR}(\mu _f) = \mu _f \sum _{n=1}^\infty c_{n}(\overline{m},\overline{m},\overline{m}) \,\alpha _s^{n}(\mu _f)\,. \end{aligned}$$
(44)

The requirement (36) is satisfied since according to (20)

$$\begin{aligned} s_{n}^\mathrm{MSR}{}_{|\mu =\mu _f} = c_n(\overline{m},\overline{m},\overline{m})&\underset{n\rightarrow \infty }{\longrightarrow }N \,\tilde{c}^{(\mathrm as)}_{n}\,. \end{aligned}$$
(45)

The MSR mass subtraction is straightforward to implement, once the pole to \(\overline{\mathrm{MS}}\) mass conversion coefficients are given. The MSR mass interpolates between \(\overline{m}\) for \(\mu _f=\overline{m}\) and the pole mass for \(\mu _f=0\), although the latter limit cannot be taken as the coupling \(\alpha _s(\mu _f)\) flows into the strong-coupling regime. The efficiency of the MSR mass subtraction is analyzed in detail in [34].

Both the RS and MSR mass satisfy a simple renormalization group equation in the subtraction scale, as discussed above. As for \(\mu _f=\overline{m}\), the MSR mass equals the \(\overline{\mathrm{MS}}\) mass \(\overline{m}\), the relation between \(m_\mathrm{MSR}(\mu _f)\) and \(\overline{m}\) can be obtained conveniently by solving the RGE equation, see [34]. Alternatively, as for the RS scheme, one can replace \(\alpha _s(\mu _f)\) in (44) by its series expansion in \(\alpha _s(\mu )\), where \(\mu \) is the scale at which the pole to \(\overline{\mathrm{MS}}\) series is evaluated, as long as \(\ln (\mu /\mu _f)\) is small enough not to require resummation.

3.4 PS mass

The potential-subtracted (PS) mass [36] is the first of two renormalon-free, short-distance, on-shell masses, which are motivated and defined in terms of another physical quantity than the pole mass. A non-relativistic system of heavy quark and anti-quark in a colour-singlet configuration experiences an attractive potential force, whose leading term is the Coulomb potential. In momentum space,

$$\begin{aligned} \tilde{V}(\varvec{{q}}) = -\frac{4\pi C_F}{\varvec{{q}}^{\,2}}\, v_c(\alpha _s(\mu ),q/\mu )\,, \end{aligned}$$
(46)

where \(q=|\varvec{{q}}|\) and \(v_c=\alpha _s+\cdots \) incorporates the loop corrections to the tree-level potential.

The PS scheme is based on the observation that there is a cancellation of the leading divergent series behaviour in the combination \(2 m+[V(r)]_\mathrm{Coulomb}\). This can be seen explicitly at the one-loop order and in the large-\(\beta _0\) approximation [36,37,38], and by a diagrammatic argument at two loops [36] and beyond. The cancellation expresses the fact that while the separation of the total energy of the quarkonium-like system into the quark pole masses and binding energy is ambiguous (as was the case for \(m+\bar{\varLambda }\) for a heavy-light system), the total energy is physical and unambiguous. The PS mass at subtraction scale \(\mu _f\) is defined by

$$\begin{aligned} m_\mathrm{PS}(\mu _f)= & {} m+\frac{1}{2} \int \limits _{|\varvec{{q}}\,|<\mu _f} \frac{d^3 \varvec{{q}}}{(2\pi )^3}\,\tilde{V}(\varvec{{q}})\,, \end{aligned}$$
(47)

which removes the leading IR contributions to the self-energy from \(q=|\varvec{{q}}|<\mu _f\).

The series expansion of \(v_c(\alpha _s(\mu ),q/\mu )\) appearing in the Coulomb potential (46) is conventionally written in the form

$$\begin{aligned}&v_c(\alpha _s(\mu ),q/\mu ){}_{|\mu =q} = \alpha _s(q)+ \sum _{n=1}^\infty a_n\left( \frac{\alpha _s(q)}{4\pi }\right) ^{n+1}\nonumber \\&\quad +\left( \frac{\alpha _s(q)}{4\pi }\right) ^{3} 8\pi ^2 C_A^3 \ln \frac{\nu ^2}{\varvec{{q}}^2} + \cdots \end{aligned}$$
(48)

The coefficients \(a_{1,2,3}\) are known, where \(a_3\) refers to the three-loop colour-singlet Coulomb potential [50,51,52]. I note that the potential here is not defined in terms of a Wilson loop, but as a matching coefficient [53] to potential non-relativistic QCD (PNRQCD) [54, 55], defined with minimal subtraction. The last term in (48) is the first of an infinite series of terms, which contains an explicit dependence on the factorization or PNRQCD matching scale \(\nu \) (to be distinguished from \(\mu \)), which arises from an IR divergence related to the ultrasoft scale.

Up to the third order to which the potential is currently known, performing the integration over \(\varvec{{q}}\) yields [36, 56]Footnote 8

$$\begin{aligned}&\delta m_\mathrm{PS}(\mu _f)\nonumber \\&\quad = -\frac{1}{2}\int _{q \le \mu _f} \frac{d^3 {\varvec{{q}}}}{(2\pi )^3}\, \tilde{V}(\varvec{{q}}) \nonumber \\&\quad = \frac{\mu _f C_F\alpha _s(\mu )}{\pi } \Bigg [1 + \frac{\alpha _s(\mu )}{4\pi } \Big (2 \beta _0\,l_1 +a_1\Big ) \nonumber \\&\qquad + \,\bigg (\frac{\alpha _s(\mu )}{4\pi }\bigg )^{2} \bigg ( 4 \beta _0^2\,l_2+2\,\Big (2 a_1\beta _0+\beta _1\Big )l_1 + a_2 \bigg ) \nonumber \\&\qquad + \,\bigg (\frac{\alpha _s(\mu )}{4\pi }\bigg )^{3} \bigg ( 8 \beta _0^3 l_3 + 4 \Big (3 a_1\beta _0^2+\frac{5}{2}\beta _0\beta _1\Big ) l_2\nonumber \\&\qquad +2 \Big (3 a_2\beta _0+2 a_1\beta _1+\beta _2\Big ) l_1 \nonumber \\&\qquad + \,a_3 + 16 \pi ^2 C_A^3 \left[ \ln \frac{\nu }{\mu _f}+1\right] \bigg ) \Bigg ], \end{aligned}$$
(49)

where

$$\begin{aligned} l_1= & {} \ln (\mu /\mu _f)+1\,, \nonumber \\ l_2= & {} \ln ^2(\mu /\mu _f)+2\ln (\mu /\mu _f)+2\,, \nonumber \\ l_3= & {} \ln ^3(\mu /\mu _f)+3 \ln ^2(\mu /\mu _f)+6\ln (\mu /\mu _f)+6\,.\nonumber \\ \end{aligned}$$
(50)

To fully specify the PS mass definition, a value of \(\nu \) must be chosen and the standard value is \(\nu =\mu _f\), which sets the logarithm \(\ln (\nu /\mu _f)\) to zero in the last line. This still leaves a constant term \(16\pi ^2 C_A^3 =4263.67...\), which is large compared to \(a_3=1461.32...\) (quoted for \(n_l=5\)). Since the former is related to an ultrasoft rather than potential effect, another well-motivated choice is \(\nu =\mu _f e^{-1}\), which nullifies the entire square bracket in the last line, resp. the extra term in (48) at this accuracy. I will refer to this choice as the PS\(^*\) definition. The difference between the PS and PS\(^*\) mass is largely irrelevant when the expanded formula (49) is used, since the nth order term is dominated by the \(l_i\) terms, originating from the running coupling in lower order terms, for relevant values of \(\mu _f\). On the other hand, the difference affects directly the size of the last presently known term in the anomalous dimension and RGE below.

It is instructive to interpret \(v_c(q)=v_c(\alpha _s(\mu ),q/\mu ){}_{|\mu =q}=\alpha _s(q) +\cdots \) as an effective coupling, and write the mass subtraction term as

$$\begin{aligned} \delta m_\mathrm{PS}(\mu _f) = \frac{C_F}{\pi } \int _0^{\mu _f} dq\,v_c(q)\,. \end{aligned}$$
(51)

The \(\mu _f\) evolution of the PS mass is governed by the anomalous dimension

$$\begin{aligned} \gamma _\mathrm{PS}(\alpha _s(\mu _f)) = - \frac{d m_\mathrm{PS}(\mu _f)}{d\mu _f} = \frac{C_F}{\pi } v_c(\mu _f)\,. \end{aligned}$$
(52)

The solution is evidently (see (51))

$$\begin{aligned} \Big [\delta m_\mathrm{PS}(\mu _f)\Big ]^{\mu _{f2}}_{\mu _{f1}} = \frac{C_F}{\pi } \int ^{\mu _{f2}}_{\mu _{f1}} dq\,v_c(q)\,, \end{aligned}$$
(53)

which allows us to compute \(m_\mathrm{PS}(\mu _f)\) at widely different scales \(\mu _{f1},\mu _{f2}\) without large logarithms by integrating the expansion of \(v_c(q)\) in terms of the running \(\overline{\mathrm{MS}}\) coupling \(\alpha _s(q)\). The leading-logarithmic solution can be expressed in terms of the exponential-integral function.

The PS scheme is especially well suited for quarkonium-like systems including open \(Q {\bar{Q}}\) systems near threshold, since it subtracts the leading IR contributions explicitly already in low orders of perturbation theory. Prime examples are quarkonium masses at next-to-next-to-next-to-leading order (NNNLO) [56], the determination of the bottom-quark mass from high moments of the pair production cross section at second [57] and third order [58] in PNRQCD, and in particular precision calculations of top-quark pair production near threshold to NNLO and NNNLO [55, 59] in the PS scheme. For \(Q\bar{Q}\) systems near threshold, the scale \(\mu _f\) should be chosen parametrically of order \(m v\sim m \alpha _s\) such that \(\delta m_\mathrm{PS} \sim m v^2\), in order not to violate the power counting of the non-relativistic expansion. With this choice, the relation (47) is already accurate to order \(m \alpha _s^5\).

3.5 Kinetic mass

The kinetic mass scheme is another physical scheme, in this case related to the physics of semi-leptonic decays of heavy-light mesons [60, 61]. The pseudoscalar B meson mass has the heavy-quark expansion (cf. (2))

$$\begin{aligned} m_B = m_b+\bar{\varLambda }+\frac{\mu _\pi ^2-\mu _G^2}{2 m_b}+\cdots , \end{aligned}$$
(54)

where \(\mu _\pi ^2\) and \(\mu _G^2\) are the B-meson matrix elements of the kinetic energy and chromo-magnetic operators, respectively. The kinetic mass can be understood as a perturbative evaluation of this formula, in which the matrix elements include loop momentum integration regions below the scale \(\mu _f\):

$$\begin{aligned} m_\mathrm{kin}(\mu _f)= & {} m \underbrace{- [\bar{\varLambda }(\mu _f)]_\mathrm{pert} - \left[ \frac{\mu _\pi ^2(\mu _f)}{2 m_\mathrm{kin}(\mu _f)}\right] _\mathrm{pert}}_{-\delta m_\mathrm{kin}(\mu _f)} +\cdots \nonumber \\ \end{aligned}$$
(55)

The matrix elements on the right-hand side subtract the long-distance sensitive contributions to the pole mass order by order in \(\mu _f/m_Q\) and \(\alpha _s\). Comparing to (40), we note that the kinetic mass definition not only subtracts the leading IR renormalon divergence through \( [\bar{\varLambda }(\mu _f)]_\mathrm{pert}\), but also the IR sensitivity at subleading power \(\varLambda _\mathrm{QCD}^2/m\).

Up to now the discussion has been general. The kinetic scheme is defined by providing a concrete prescription for calculating the perturbative matrix elements in terms of the perturbative evaluation of short-distance observables. It relies on the fact that the \({\bar{\varLambda }}\) parameter and kinetic-energy matrix element appear in the heavy-quark expansion of the dilepton-differential spectrum of inclusive semi-leptonic \(B\rightarrow X_c\ell \bar{\nu }\) decays [62, 63], which can be constructed from the imaginary part of the two-point function of the \(b\rightarrow c\) transition current. The convention for the kinetic mass used in the literature employs an indirect definition of the matrix elements through heavy flavour sum rules [60, 61] in the small-velocity limit, which is rather complicated when compared with the other three mass definitions above. The central quantity is the forward amplitude

$$\begin{aligned} T(q)= & {} \frac{i}{2m_Q} \int \mathrm{d}^4{x \,} e^{-iqx} \langle Q|T J(x)J^\dagger (0)|Q\rangle \end{aligned}$$
(56)

and its discontinuity

$$\begin{aligned} W(q)= & {} 2 \,\text{ Im }\,[ T(q) ]\,. \end{aligned}$$
(57)

Since the perturbative matrix elements (55) to be computed are spin-independent and universal, instead of the physical V–A current, one can define the kinetic scheme by adopting the scalar current \(J=Q Q^\prime \) provided Q and \(Q^\prime \) are heavy. To simplify further, one can set \(Q^\prime = Q\) and compute the forward scattering of the heavy quark off the current J with momentum \(q=(q^0,\varvec{{q}})\) in the limit when \(\varvec{{v}}=\varvec{{q}}/m_Q\ll 1\). To isolate the IR region of the final state momenta, the total energy of the quark and gluon final state X excluding the heavy quark Q is restricted to

$$\begin{aligned} \omega \equiv q^0-\Big [\sqrt{m_Q^2+\varvec{{q}}^2}-m_Q\Big ] <\mu _f\,. \end{aligned}$$
(58)

Employing the variables \((\omega ,\varvec{{v}})\) instead of \((q^0,\varvec{{q}})\), the mass subtractions are defined in terms of the double limit of the first moments of the spectral function:

$$\begin{aligned} {[}\bar{\varLambda }(\mu _f)]_\mathrm{pert}= & {} \lim _{\varvec{{v}}\rightarrow 0}\lim _{m_Q\rightarrow \infty } \frac{2}{\varvec{{v}}\,^2} \frac{\displaystyle \int _0^{\mu _f} d\omega \,\omega \, W(\omega ,\varvec{{v}})}{\displaystyle \int _0^{\mu _f} d\omega \,W(\omega ,\varvec{{v}})}\,, \nonumber \\\end{aligned}$$
(59)
$$\begin{aligned} {[}\mu _\pi ^2(\mu )]_\mathrm{pert}= & {} \lim _{\varvec{{v}}\rightarrow 0}\lim _{m_b\rightarrow \infty } \frac{3}{\varvec{{v}}\,^2} \frac{\displaystyle \int _0^{\mu _f} d\omega \,\omega ^2 \, W(\omega ,\varvec{{v}})}{\displaystyle \int _0^{\mu _f} d\omega \,W(\omega ,\varvec{{v}})}\,. \nonumber \\ \end{aligned}$$
(60)
Table 1 Comparison of top quark mass definitions for \(\mu _f=20\,\)GeV for given \(\overline{\mathrm{MS}}\) mass \(\overline{m}_t\). \(\alpha _s(m_Z)=(0.1180\pm 0.0010)\)
Table 2 Comparison of bottom quark mass definitions for \(\mu _f=2\,\)GeV for given \(\overline{\mathrm{MS}}\) mass \(\overline{m}_b\). \(\alpha _s(m_Z)=(0.1180\pm 0.0010)\)
Table 3 Comparison of charm quark mass definitions for \(\mu _f=1\,\)GeV for given \(\overline{\mathrm{MS}}\) mass \(\overline{m}_c\). \(\alpha _s(m_Z)=(0.1180\pm 0.0010)\)

The demonstration that the right-hand sides of these equations can be identified with the subtracted \(\bar{\varLambda }\) and kinetic-energy parameters, which appear in the heavy-quark expansion of the meson mass, is given in [61]. It is sufficient to compute \(W(\omega ,\varvec{{v}})\) in an expansion in \(\omega , |\varvec{{q}}| \ll m_Q\) to order

$$\begin{aligned} W(\omega ,\varvec{{v}}) = W_\mathrm{virt}(\varvec{{v}}) \delta (\omega ) + \frac{\varvec{{v}}^2}{\omega } W_\mathrm{real} \varTheta (\omega ) +\mathcal {O}(\omega ^0,\varvec{{v}}^4)\,.\nonumber \\ \end{aligned}$$
(61)

The two-loop computation [64] has been known for some time, but the three-loop result has been obtained only recently [65, 66]. Different from the other three mass schemes, the \(\mathcal {O}(\alpha _s^4)\) term of \(\delta m_\mathrm{kin}(\mu _f)\) is presently not available.

The kinetic scheme is especially well suited for observables derived from semi-leptonic decays of heavy quarks since it subtracts the leading IR contributions explicitly already in low orders of perturbation theory. Initially, it was suggested to eliminate the leading renormalon divergence from the semi-leptonic width by replacing the pole mass by the \(\overline{\mathrm{MS}}\) mass [67], but this does not improve the behaviour in low orders. The comparison to the series convergence when renormalon-free on-shell masses are used (Table 2 of [68]) clearly shows the latter’s advantage. The kinetic scheme was used for bottom quark mass and \(|V_{cb}|\) determinations at second order [69]. Recently, the semi-leptonic decay width was calculated to NNNLO and the effectiveness of the kinetic mass scheme was demonstrated for the inclusive rate for the first time at this order [70].

3.6 Comparison

Once any of the leading renormalon-free, on-shell, short-distance masses has been determined from some observable, one is eventually interested in converting them to the \(\overline{\mathrm{MS}}\) reference mass \(\overline{m}\). Over the past 20 years the accuracy of the mass definitions and observables has improved by one order (typically from two-loop to three-loop, and three-loop to four-loop for the pole to \(\overline{\mathrm{MS}}\) mass series). It is, therefore, timely to update and extend the comparison [68] of different definitions (see also [25]).

The purpose of the following comparison is to display the good behaviour of the relation between the various subtracted masses and the \(\overline{\mathrm{MS}}\) mass contrary to the pole mass series (28), (29), (30). As above, the \(\overline{\mathrm{MS}}\) masses are fixed to \(\overline{m}_t=163.643\,\)GeV, \(\overline{m}_b=4.20\,\)GeV and \(\overline{m}_c=1.28\,\)GeV. The strong coupling is taken to be \(\alpha _s^{(5)}(m_Z)=0.1180\pm 0.0010\) at the scale \(m_Z=91.1876~\)GeV, and the series coefficients are evaluated at \(\mu =\mu _m=\overline{m}\) in an expansion in \(\alpha _s^{(5)}(\overline{m}_t)=0.1084\), \(\alpha _s^{(4)}(\overline{m}_b)=0.2246\), \(\alpha _s^{(3)}(\overline{m}_c)=0.3889\), for top, bottom and charm, respectively. The subtraction scale \(\mu _f\) is chosen to be \(\mu _f=(20,2,1)\,\)GeV for top, bottom and charm, equal for all mass definitions for the sake of comparison, even if different “canonical” values are often adopted for the different schemes. Internal mass effects are neglected, since they are not always known with comparable precision. In case of mass schemes originally defined in terms of \(\alpha _s(\mu _f)\), the series have been converted into expansions in \(\alpha _s(\overline{m}_Q)\) with the required four-loop accuracy.

I use private code for the RS, MSR and PS mass. The RS and PS scheme (for \(\nu =\mu _f\)) is also implemented in CRunDec3.1 [27] and the PS mass also in \(\mathtt {QQbar\_threshold}\) [71]. In case of the RS scheme I adapted the normalization constants \(N_m = 0.563\) (\(n_l=3\)), \(N_m = 0.547\) (\(n_l=4\)), \(N_m = 0.527\) (\(n_l=3\)) given in [32] (and hard-coded in CRunDec3.1) to the values from (27). Finally, the kinetic mass \(m_\mathrm{kin}\) is implemented for massless internal quarks with CRunDec3.1’s mMS2mKIN[\(\overline{m}_Q\), 0, 0, "[as]"*\(\alpha _s^{(n_l)}(\overline{m}_Q)\), \(\overline{m}_Q\), \(\mu _f\), \(n_l\), \(n_l\), 3, ""] call, presently available only to three-loop accuracy.

The results are summarized in Tables 1, 2, and 3. It is evident that in terms of the size of mass corrections up to the shown four-loop order, all mass schemes are rather similar with exception of the kinetic scheme. In all cases, one observes a spectacular improvement of convergence relative to the pole mass series given in the first line. One expects the cancellation of the leading renormalon to become more and more effective in higher orders measured relative to the order of the minimal term, which can be seen explicitly by comparing the top, bottom and charm tables. The effect is particularly dramatic for the charm mass, for which the pole mass series starts diverging beyond the two-loop order, while the renormalon-subtracted masses are still well behaved at the fourth order, with corrections in the few MeV range. The top pole mass series is still in the regime of decreasing coefficients at the four-loop order, albeit slowly, hence the relative improvement of the subtraction should increase in the next orders. The four-loop coefficient is typically \((40-50)\,\)MeV, but the size of the next unknown term can be assumed to be small enough relative to the experimental precision that can be attained in the future, even from the scan of the pair production cross section in \(e^+e^-\) collisions [72].

“Sum” in the last column of the tables refers to the sum of the terms up to the four-loop order shown, and the error attached quantifies the variation of the sum under a variation of \(\alpha _s(m_Z)\) by \(\pm 0.001\). It is apparent that once leading renormalon-free, on-shell masses are employed, the limitation of the accuracy of their relations to the \( \overline{\mathrm{MS}}\) mass is (currently) no longer determined by the convergence of the expansion, but by the precision of \(\alpha _s(m_Z)\). For the case of the top quark, this uncertainty is mainly caused by the large one-loop correction of a few GeV to be compared to the ultimate precision to which the subtracted masses can be obtained theoretically and (in principle) experimentally. Unlike the bottom and charm masses, to make use of this precision requires better knowledge of the strong coupling.