1 Setting the stage

If one performs a search on the topic of relativistic fluids on any of the major physics article databases one is overwhelmed by the number of “hits”. This reflects the importance that the fluid model has long had for physics and engineering. For relativistic physics, in particular, the fluid model is essential. After all, many-particle astrophysical and cosmological systems are the best sources of detectable effects associated with General Relativity. Two obvious examples, the expansion of the Universe and oscillations (or, indeed, mergers) of neutron stars, indicate the vast range of scales on which relativistic fluids are relevant. A particularly topical context for general relativistic fluids is their use in the modelling of gravitational-wave sources. This includes the compact binary inspiral problem, either involving two neutron stars or a neutron star and a black hole, the collapse of stellar cores during supernovae, or various neutron star instabilities. One should also not forget the use of (special) relativistic fluids in modelling collisions of heavy nuclei, astrophysical jets, and gamma-ray burst emission.

This review provides an introduction to the modeling of fluids in General Relativity. As the (main) target audience is graduate students with a need for an understanding of relativistic fluid dynamics we have made an effort to keep the presentation pedagogical, carefully introducing the central concepts. The discussion will (hopefully) also be useful to researchers who work in areas outside of General Relativity and gravitation per se (e.g., a nuclear physicist who develops neutron star equations of state), but who require a working knowledge of relativistic fluid dynamics.

Throughout (most of) the discussion we will assume that General Relativity is the proper description of gravity. From a conservative point of view, this restriction is not too severe. Einstein’s theory is extremely well tested and it is natural to focus our attention on it. At the same time, it is important to realize that the problem of fluids in other theories of gravity has interesting aspects. And perhaps more importantly, we know that General Relativity cannot be the ultimate theory of gravity—it breaks down at the quantum scale and may also have trouble on the large scales of cosmology (taking the presence of the mysterious dark energy as evidence that something is missing in our understanding). As we hope that the review will be used by students and researchers who are not necessarily experts in General Relativity and the techniques of differential geometry, we have included an introduction to the mathematical tools required to build relativistic models. Our summary is not a proper introduction to General Relativity, but we have made an effort to define all the tools we need for the discussion that follows. Hopefully, our description is sufficiently self-contained to provide a less experienced reader with a working understanding of (at least some of) the mathematics involved. In particular, the reader will find an extended discussion of the covariant and Lie derivatives. This is natural since many important properties of fluids, both relativistic and non-relativistic, can be established and understood by the use of parallel transport and Lie-dragging, and it is vital to appreciate the distinction between the two. As we do not want to make the initial learning curve too steep, we have tried to avoid the language of differential geometry. This makes the discussion less “elegant” in places, but we feel that this is a price worth paying if the aim is to make the material more generally accessible.

Ideally, the reader should have some familiarity with standard fluid dynamics, e.g., at the level of the discussion in Landau and Lifshitz (1959), basic thermodynamics (Reichl 1984), and the mathematics of action principles and how they are used to generate equations of motion (Lanczos 1949). Having stated this, it is clear that we are facing a challenge. We are trying to introduce a topic on which numerous books have been written (e.g., Tolman 1987; Landau and Lifshitz 1959; Lichnerowicz 1967; Anile 1989; Wilson and Mathews 2003; Rezzolla and Zanotti 2013), and which requires an understanding of a significant fraction of modern theoretical physics. This does not, however, mean that there is no place for this kind of survey. We continue to see exciting developments for multi-constituent systems, such as superfluid/superconducting neutron star cores.Footnote 1 Much of the recent theory work has been guided by the geometric approach to fluid dynamics championed by Carter (1983, 1989a, 1992), which provides a powerful framework that makes extensions to multi-fluid situations intuitive. A typical example of a phenomenon that arises naturally is the so-called entrainment effect, which plays a crucial role in a superfluid neutron star core. Given the flexible nature of the formalism, its natural connection with General Relativity and the potential for future applications, we have opted to base much of our description on the work of Carter and colleagues.

It is important to appreciate that, even though the subject of relativistic fluids is far from new, issues still remain to be resolved. The most obvious shortcoming of the available theory concerns dissipative effects. As we will see, different dissipation channels are (at least in principle) easy to incorporate in Newtonian theory but the extension to General Relativity remains “problematic”. This is an issue—with a number of notable recent efforts—of key importance for future gravitational-wave source modelling (e.g., in numerical relativity) as well as the description of laboratory systems (like heavy-ion collisions). In order to develop the required framework, we need to make progress on both the underpinning theory and implementations (e.g., computationally “affordable” simulations)—a real, but at the same time inspiring, challenge.

1.1 A brief history of fluids

The two fluids air and water are essential to human survival. This obvious fact implies a basic need to divine their innermost secrets. Homo Sapiens have always needed to anticipate air and water behaviour under a myriad of circumstances, such as those that concern water supply, weather, and travel. The essential importance of fluids for survival—and how they can be exploited to enhance survival—implies that the study of fluids likely reaches as far back into antiquity as the human race itself. Unfortunately, our historical records of this ever-ongoing study are not so great that we can reach very far accurately.

A wonderful account (now in affordable Dover print) is “A History and Philosophy of Fluid Mechanics” by Tokaty (1994). He points out that while early cultures may not have had universities, government sponsored laboratories, or privately funded centers pursuing fluids research (nor a Living Reviews journal on which to communicate results!), there was certainly some collective understanding. After all, there is a clear connection between the viability of early civilizations and their access to water. For example, we have the societies associated with the Yellow and Yangtze rivers in China, the Ganges in India, the Volga in Russia, the Thames in England, and the Seine in France, to name just a few. We must also not forget the Babylonians and their amazing technological (irrigation) achievements in the land between the Tigris and Euphrates, and the Egyptians, whose intimacy with the flooding of the Nile is well documented. In North America, we have the so-called Mississippians, who left behind their mound-building accomplishments. For example, the Cahokians (in Collinsville, Illinois) constructed Monk’s Mound,Footnote 2 the largest pre-Columbian earthen structure in existence that is “...over 100 feet tall, 1000 feet long, and 800 feet wide (larger at its base than the Great Pyramid of Giza)”.

In terms of ocean and sea travel, we know that the maritime ability of the Mediterranean people was the key to ensuring cultural and economic growth and societal stability. The finely-tuned skills of the Polynesians in the South Pacific allowed them to travel great distances, perhaps reaching as far as South America, and certainly making it to the “most remote spot on the Earth”, Easter Island. Apparently, they were adept at reading the smallest of signs—water colour, views of weather on the horizon, subtleties of wind patterns, floating objects, birds, etc.—as indications of nearby land masses. Finally, the harsh climate of the North Atlantic was overcome by the highly accomplished Nordic sailors, whose skills allowed them to reach North America. Perhaps it would be appropriate to think of these early explorers as adept geophysical fluid dynamicists/oceanographers?

Many great scientists are associated with the study of fluids. Lost are the names of the individuals who, almost 400,000 years ago, carved “aerodynamically correct” (Gad-el Hak 1998) wooden spears. Also lost are those who developed boomerangs and fin-stabilized arrows. Among those not lost is Archimedes, the Greek mathematician (287–212 BC), who provided a mathematical expression for the buoyant force on bodies. Earlier, Thales of Miletus (624–546 BC) asked the simple question: What is air and water? His question is profound as it represents a departure from the main, myth-based modes of inquiry at that time. Tokaty ranks Hero of Alexandria as one of the great, early contributors. Hero (c. 10–70) was a Greek scientist and engineer, who left behind writings and drawings that, from today’s perspective, indicate a good grasp of basic fluid mechanics. To make a complete account of individual contributions to our present understanding of fluid dynamics is, of course, impossible. Yet, it is useful to list some of the contributors to the field. We provide a highly subjective “timeline” in Fig. 1. The list is to a large extent focussed on the topics covered in this review, and includes chemists, engineers, mathematicians, philosophers, and physicists. It recognizes those that have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena like superfluidity. The list provides context—both historical and scientific—and also serves as an informal table of contents for this survey.

Fig. 1

A “timeline” focussed on the topics covered in this review, including chemists, engineers, mathematicians, philosophers, and physicists who have contributed to the development of non-relativistic fluids, their relativistic counterparts, multi-fluid versions of both, and exotic phenomena like superfluidity

Tokaty (1994) discusses the human propensity for destruction when it comes to water resources. Depletion and pollution are the main offenders. He refers to a “Battle of the Fluids” as a struggle between their destruction and protection. His context for this discussion was the Cold War. He rightly points out the failure to protect our water and air resources by the two dominant powers—the USA and USSR. In an ironic twist, modern study of the relativistic properties of fluids has its own “Battle of the Fluids”. A self-gravitating mass can become absolutely unstable and collapse to a black hole, the ultimate destruction of any form of matter.

1.2 Why are fluid models useful?

The Merriam-Webster online dictionaryFootnote 3 defines a fluid as “...a substance (as a liquid or gas) tending to flow or conform to the outline of its container” when taken as a noun and “...having particles that easily move and change their relative position without a separation of the mass and that easily yield to pressure: capable of flowing” when taken as an adjective. The best model we have for this “substance” is the Standard Model of particle physics. The substance of the Standard Model consists of a remarkably small set of elementary particles: leptons, quarks, and the so-called “force” carriers (gauge-vector bosons). Each elementary particle is quantum mechanical, but the Einstein equations require explicit trajectories. Effectively, there is a disconnect between the quantum scale and our classical description of gravity. Moreover, cosmology and neutron stars are (essentially) many particle systems and—even forgetting about quantum mechanics—it is not possible to track each and every “particle” that makes them up, regardless of whether these are elementary (leptons, quarks, etc.) or collections of elementary particles (e.g., individual stars in galaxies and the galaxies themselves in cosmology). The fluid model is such that the inherent quantum mechanical behaviour and the existence of many particles are averaged over in such a way that it can be implemented consistently in the Einstein equations.

Fig. 2

An object with a characteristic size D is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require \(M \gg N \gg 1\) and \(D \gg L\)

Central to the model is the notion of a “fluid element”, also known as a “fluid particle” or “material particle” (Lautrup 2005). This is an imagined, local “box” that is infinitesimal with respect to the system en masse and yet large enough to contain a large number of particles, N  (e.g., an Avogadro’s number of particles). The idea is illustrated in Fig. 2. We consider an object with characteristic size D that is modeled as a fluid that contains M fluid elements. From inside the object we magnify a generic fluid element of characteristic size L. In order for the fluid model to work we require \(M \gg N \gg 1\) and \(D \gg L\). Strictly speaking, the model has L infinitesimal, \(M \rightarrow \infty \), but with the total number of particles remaining finite. An operational point of view is that discussed by Lautrup in his fine text “Physics of Continuous Matter” (2005). He rightly points out the implicit connection to the intended precision. At some level, any real system will be discrete and no longer represented by a continuum. As long as the scale where the discreteness of matter and fluctuations are important is much smaller than the desired precision, the continuum approximation is valid. The key point is that the fluid model allows us to consider complex dynamical phenomena in terms of a (relatively) small number of variables. We do not have to keep track of individual particles. The connection between the different scales (macroscopic and microscopic) plays a role, but many of the tricky issues are assumed to be “known” (read: encoded in the matter equation of state, the determination of which may be someone else’s “problem”).

The aim of this review is to describe how the fluid model can be used (and understood) in the context of Einstein’s curved spacetime theory for gravity. As will become clear, this necessarily involves attention to detail. For example, we need to consider how the coordinate invariance of General Relativity (with no preferred observers) impacts on (by necessity) observer-dependent notions from thermodynamics and the underlying microphysics. We also need to explore to what extent the dynamics of spacetime enters the problem. This is particularly relevant in the context of numerical simulations of energetic gravitational-wave sources (like merging neutron stars or massive stars collapsing under their own weight). The first step we have to take is natural—we need to consider how a given fluid element moves through spacetime and how this fluid motion enters the Einstein field equations. To some extent, this is a text-book problem with a well-known solution (= the perfect fluid model). However, as we will learn along the way, more realistic matter descriptions (including for example superfluidity, as expected in the core of a mature neutron star, or the elasticity of the star’s crust) require a more sophisticated approach. Nevertheless, the first step we have to take is natural.

The explicit trajectories that enter the Einstein equations are those of the fluid elements, not the much smaller (generally fundamental) particles that are “confined” (on average) to the elements. Hence, when we talk about the fluid velocity, we mean the velocity of fluid elements. In this sense, the use of the phrase “fluid particle” is very apt. For instance, each fluid element traces out a timelike trajectory in spacetime \(x^a(\tau )\), such that the unit tangent vector

$$\begin{aligned} u^a = {dx^a \over d\tau } , \quad \text{ with } \quad u_a u^a = -1 \end{aligned}$$
(1.1)

where \(\tau \) is time measured on a co-moving clock (proper time), provides the four velocity of the particle. The idea is illustrated in Fig. 3.

Fig. 3

An illustration of the fibration of spacetime associated with a set of fluid “observers”, each with their own four velocity \(u^a\) and notion of time (the proper time measured on a co-moving clock). In the fluid model, individual worldlines are assigned to specific fluid elements (which involve averages over the large number of constituent particles)

The fundamental variable that enters the fluid equations is the particle flux density, in the following given by \(n^a = n u^a\), where \(n \approx N/L^3\) is the particle number density of the fluid element whose worldline is given by \(u^a\). An object like a neutron star is then modelled as a collection of particle flux density worldlines that continuously fill a portion of spacetime. In fact, we will see later that the relativistic Euler equation is little more than an “integrability” condition that guarantees that this filling (or fibration) of spacetime can be performed.

Equivalently, we may consider the family of three-dimensional hypersurfaces that are pierced by the worldlines at given instants of time, as illustrated later in Fig. 10. The integrability condition in this case guarantees that the family of hypersurfaces continuously fill a portion of spacetime. In this view, a fluid is a so-called three-brane (see Carter 1992 for a general discussion of branes). In fact, the strategy adopted in Sect. 6 to derive the relativistic fluid equations is based on thinking of a fluid as living in a three-dimensional “matter” space (i.e., the left-hand-side of Fig. 10). At first sight, this approach may seem confusing. However, as we will demonstrate, it allows us to develop a versatile framework for complicated systems which (in turn) enables progress on a number of relevant problems in astrophysics and cosmology.

Once we understand how to build a fluid model using the matter space, it is straight-forward to extend the technique to single fluids with several constituents, as in Sect. 8.1, and multiple fluid systems, as in Sect. 9. An example of the former would be a fluid with one species of particles at a non-zero temperature, i.e., non-zero entropy, that does not allow for heat conduction relative to the particles. (Of course, entropy still flows through spacetime.) The latter example can be obtained by relaxing the constraint of no heat conduction. In this case the particles and the entropy are both considered to be fluidsFootnote 4 that are dynamically independent, meaning that the entropy will have a four-velocity that is generally different from that of the particles. There is thus an associated collection of fluid elements for the particles and another for the entropy. At each point of spacetime that the system occupies there will be two fluid elements, in other words, there are two matter spaces (cf. Sect. 9). Perhaps the most important consequence of this is that there can be a relative flow of the entropy with respect to the particles. In general, relative flows lead to the so-called entrainment effect, i.e., the momentum of one fluid in a multiple fluid system is in principle a linear combination of all the fluid velocities (Andersson and Comer 2006). The canonical examples of two fluid models with entrainment are superfluid \(\mathrm {He}^4\) (Putterman 1974) at non-zero temperature and a mixture of superfluid \(\mathrm {He}^4\) and \(\mathrm {He}^3\) (Andreev and Bashkin 1975). We will develop a detailed understanding of all these concepts in due course, but as it is important to proceed with care we will first focus on the physics that provide input for the fluid model.

1.3 Notation and conventions

Throughout the article we assume the “MTW” (Misner et al. 1973) conventions. We also generally assume geometrized units \(c=G=1\), unless specifically noted otherwise, and set the Boltzmann constant \(k_B = 1\). A coordinate basis will always be used, with spacetime indices denoted by lowercase Latin letters \(\{a,b,\ldots \}\) etc. that range over \(\{0,1,2,3\}\) (time being the zeroth coordinate), and purely spatial indices denoted by lowercase Latin letters \(\{i,j,\ldots \}\) etc. that range over \(\{1,2,3\}\). Unless otherwise noted, we assume that the Einstein summation convention applies. Finally, we adopt the convention that \(u^{\mathrm {x}}_a=g_{ab} u^b_\mathrm {x}\) where \({\mathrm {x}}\) is a fluid constituent label. These are never summed over when repeated. Also note that, while it is possible to build a chemically covariant formalism (with the \({\mathrm {x}}\) treated on a par with spacetime indices) we will not do so here. Our approach has the “advantage” that the constituent labels can be placed up or down, without this having any particular meaning, which helps keep many of the expressions tidy. We will also regularly have to deal with expressions where more than two of these labels are repeated and this complicates a fully covariant approach.

2 Thermodynamics and equations of state

As a fluid consists of many fluid elements—and each fluid element consists of many particles—the state of matter in a given fluid element is (inevitably) determined thermodynamically (Reichl 1984). This means that only a few parameters are tracked as the fluid element evolves. In a typical situation, not all the thermodynamic variables are independent—they are connected through the so-called equation of state. Moreover, the number of independent variables may be reduced if the system has an overall additivity property. As this is a very instructive example, we will illustrate this point in detail.

2.1 Fundamental, or Euler, relation

Consider the standard form of the combined First and Second LawsFootnote 5 for a simple, single-species system:

$$\begin{aligned} d E = T \, d S - p \, d V + \mu \, d N. \end{aligned}$$
(2.1)

This follows because there is an equation of state, meaning that \(E = E(S,V,N)\) where

$$\begin{aligned} T = \left. \frac{\partial E}{\partial S} \right| _{V,N}, \quad \quad p = - \left. \frac{\partial E}{\partial V} \right| _{S,N}, \quad \quad \mu = \left. \frac{\partial E}{\partial N} \right| _{S,V} . \end{aligned}$$
(2.2)

The total energy E, entropy S, volume V, and particle number N are said to be extensive if when S, V, and N are doubled, say, then E will also double. Conversely, the temperature T, pressure p, and chemical potential \(\mu \) are called intensive if they do not change their values when V, N, and S are doubled. This is the additivity property and we will now show why it implies an Euler relation (also known as the “fundamental relation”; Reichl 1984) among the thermodynamic variables. This relation is essential for any effort to connect the microphysics and thermodynamics to the fluid dynamics.

Let a bar represent the rescaled thermodynamic variables when S, V, and N are all scaled by the same factor \(\lambda \), i.e.,

$$\begin{aligned} \overline{S} = \lambda S, \qquad \overline{V} = \lambda V, \qquad \overline{N} = \lambda N. \end{aligned}$$
(2.3)

Taking E to be extensiveFootnote 6 then means

$$\begin{aligned} \overline{E}(\overline{S},\overline{V},\overline{N}) = \lambda E(S,V,N). \end{aligned}$$
(2.4)

Of course, we have for the intensive variables

$$\begin{aligned} \overline{T} = T, \qquad \overline{p} = p, \qquad \overline{\mu } = \mu . \end{aligned}$$
(2.5)

Now,

$$\begin{aligned} d \overline{E} &= \lambda \, d E + E \, d \lambda = \overline{T} \, d \overline{S} - \overline{p} \, d \overline{V} + \overline{\mu } \, d \overline{N} \nonumber \\ &= \lambda \left( T \, d S - p \, d V + \mu \, d N \right) + \left( T S - p V + \mu N \right) d \lambda , \end{aligned}$$
(2.6)

and (since the change in the energy should be proportional to \(\lambda \)) we find the Euler relation

$$\begin{aligned} E = T S - p V + \mu N. \end{aligned}$$
(2.7)

If we let \(\varepsilon = E / V\) denote the total energy density, \(s = S / V\) the total entropy density, and \(n = N / V\) the total particle number density, then

$$\begin{aligned} p + \varepsilon = T s + \mu n. \end{aligned}$$
(2.8)

The nicest feature of an extensive system is that the number of parameters required for a complete specification of the thermodynamic state can be reduced by one, in such a way that only intensive variables remain. To see this, let \(\lambda = 1/V\), in which case

$$\begin{aligned} \overline{S} = s, \qquad \overline{V} = 1, \qquad \overline{N} = n. \end{aligned}$$
(2.9)

The re-scaled energy becomes just the total energy density, i.e., \(\overline{E} = E / V = \varepsilon \), and moreover \(\varepsilon = \varepsilon (s,n)\) since

$$\begin{aligned} \varepsilon = \overline{E}(\overline{S},\overline{V},\overline{N}) = \overline{E}(S/V,1,N/V) = \overline{E}(s,n). \end{aligned}$$
(2.10)

The first law thus becomes

$$\begin{aligned} d \overline{E} = \overline{T} \, d \overline{S} - \overline{p} \, d \overline{V} + \overline{\mu } \, d \overline{N} = T \, d s + \mu \, d n, \end{aligned}$$
(2.11)

or

$$\begin{aligned} d \varepsilon = T \, d s + \mu \, d n. \end{aligned}$$
(2.12)

This implies

$$\begin{aligned} T = \left. \frac{\partial \varepsilon }{\partial s} \right| _n, \qquad \mu = \left. \frac{\partial \varepsilon }{\partial n} \right| _s. \end{aligned}$$
(2.13)

That is, \(\mu \) and T are the chemical potentialsFootnote 7 associated with the particles and entropy, respectively. The Euler relation (2.8) then yields the pressure as

$$\begin{aligned} p = - \varepsilon + s \left. \frac{\partial \varepsilon }{\partial s} \right| _n + n \left. \frac{\partial \varepsilon }{\partial n} \right| _s. \end{aligned}$$
(2.14)

In essence, we can think of a given relation \(\varepsilon (s,n)\) as the equation of state, to be determined in the flat, tangent space at each point of spacetime, or, physically, small enough patches across which the changes in the gravitational field are negligible, but also large enough to contain a large number of particles. For example, for a neutron star, Glendenning (1997) argues that the relative change in the metric over the size of a nucleon with respect to the change over the entire star is about \(10^{- 19}\), and thus one must consider many inter-nucleon spacings before a substantial change in the metric occurs. In other words, it is sufficient to determine the properties of matter in special relativity, neglecting effects due to the spacetime curvature.Footnote 8 The equation of state is the key link between the microphysics that governs the local fluid behaviour and global quantities (such as the mass and radius of a star).
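As a concrete (if entirely schematic) illustration of how (2.13) and (2.14) are used in practice, the short Python sketch below takes a toy energy density \(\varepsilon (n,s)\), differentiates it to obtain T, \(\mu \) and p, and confirms that the Euler relation (2.8) is recovered. The functional form, and the constants m, K, \(\varGamma \) and a, are arbitrary choices made for the example; they do not represent a realistic equation of state.

```python
# A minimal sketch (assuming sympy is available): given a toy energy density
# eps(n, s), Eqs. (2.13) and (2.14) deliver the temperature, chemical potential
# and pressure, and the Euler relation (2.8) follows automatically.
# The functional form and the constants m, K, Gamma, a are illustrative only.
import sympy as sp

n, s = sp.symbols('n s', positive=True)
m, K, Gamma, a = sp.symbols('m K Gamma a', positive=True)

eps = m*n + K*n**Gamma + a*n*s**2      # toy eps(n, s): rest mass + cold + thermal terms

T  = sp.diff(eps, s)                   # Eq. (2.13): T  = (d eps / d s)_n
mu = sp.diff(eps, n)                   # Eq. (2.13): mu = (d eps / d n)_s
p  = -eps + s*T + n*mu                 # Eq. (2.14)

print(sp.simplify(p))                          # -> (Gamma - 1)*K*n**Gamma + 2*a*n*s**2
print(sp.simplify(p + eps - (T*s + mu*n)))     # Euler relation (2.8): -> 0
```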

In what follows we will use a thermodynamic formulation that satisfies the fundamental scaling relation, meaning that the local thermodynamic state (modulo entrainment, see later) is a function of the variables N/V, S/V, and so on. This is in contrast to the discussion in, for example, “MTW” (Misner et al. 1973). In their approach one fixes from the outset the total number of particles N, meaning that one simply sets \(d N = 0\) in the first law of thermodynamics. Thus, without imposing any scaling relation, one can write

$$\begin{aligned} d \varepsilon = d \left( E/V \right) = T \, d s + \frac{1}{n} \left( p + \varepsilon - T s \right) d n. \end{aligned}$$
(2.15)

This is consistent with our starting point, because we assume that the extensive variables associated with a fluid element do not change as the fluid element moves through spacetime. However, we feel that the scaling is necessary in that the fully conservative (read: non-dissipative) fluid formalism presented below can be adapted to non-conservative, or dissipative, situations where \(d N = 0\) cannot be imposed.

2.2 Case study: neutron stars

With a mass of more than that of the Sun squeezed inside a radius of about 10 km, a neutron star represents many extremes of physics. The relevant matter description involves issues that cannot be explored in terrestrial laboratories, yet relies on aspects similar to those probed by high-energy colliders. However, while the LHC at CERN and RHIC at Brookhaven (among others) probe low density matter at high temperatures, neutron stars are cold (on the nuclear physics temperature scale) and reach significantly higher densities. In effect, the problems are complementary, see Fig. 4 for a schematic illustration. Moreover, astrophysical modelling of neutron star dynamics (e.g., the global oscillations of the star) typically involves large enough scales that a fluid description is an absolute necessity. Yet, such models must build on appropriate microphysics input (encoded in the equation of state). This is problematic because first-principles calculations of the interactions for many-body QCD systems are not yet within reach (due to the fermion sign problem). In essence, we do not know the composition of matter. There may be a large population of hyperons present at densities relevant for neutron star cores. Perhaps the quarks are deconfined to form a quark-gluon plasma? Our models need to be flexible enough to account for different possibilities, and the problem is further complicated by the state of matter. At the relevant temperatures, many of the particle constituents (neutrons, protons, hyperons, etc.) are expected to exhibit Cooper pairing to form superfluid/superconducting condensates. This brings in aspects from low-temperature physics and a realistic neutron-star model must recognize this. In short, the problem is overwhelming and one would typically (at some point) have to resort to phenomenology, using experiments and observations to test predictions as new models become available (Watts et al. 2016).

Fig. 4

A broad-brush illustration of the phase space for dense matter physics, represented by the baryon chemical potential (\(\mu _{\mathrm b}\)) (horizontal axis) and the temperature (vertical axis). Experiments carried out using high-energy colliders, like the LHC and RHIC, aim to explore the nature of the quark-gluon plasma and the conditions of the early Universe—hot matter at relatively low densities. In contrast, an understanding of relativistic stars depends on the dense, low-temperature regime, which is unlikely to be within reach of laboratory efforts. First-principles calculations in the \(\mu _{\mathrm b}\rightarrow \infty \) limit of QCD suggest that the core of a mature neutron star may contain a colour superconductor, but the exact nature of the quark pairing at the relevant densities is not (particularly) well understood (Alford et al. 2008)

The details may be blurry but (at least) the rules that guide the exercise are fairly clear. We need to build models that allow for a complex matter composition and account for different states of matter (from solids to superfluids). This involves going beyond the single-fluid setting and considering systems with distinct components exhibiting relative flows. In short, we need to model multi-constituent multi-fluid systems. As both concepts will be central to the discussion, let us introduce the main ideas already at this point.

It is natural to start by considering the matter in the outer core of a neutron star, dominated by neutrons with a small fraction of protons and electrons. Assuming that the different constituents flow together (we will relax this assumption later), we have the thermodynamic relation (assuming matter at zero temperature, for simplicity)

$$\begin{aligned} p+\varepsilon = \sum _{{\mathrm {x}}} n_{\mathrm {x}}\mu _{\mathrm {x}}, \quad \text{ with } \quad {\mathrm {x}}= \mathrm {n},\mathrm {p},\mathrm{e}, \end{aligned}$$
(2.16)

where \(n_{\mathrm {x}}\) are the respective number densities and \(\mu _{\mathrm {x}}\) the corresponding chemical potentials. This is a straightforward extension of (2.14). At the microscopic scale (e.g., the level of the equation of state), it is usually assumed that the matter is charge neutral. The number of electrons must balance the number of protons. We have \(n_\mathrm {p}=n_\mathrm{e}\) and it follows that

$$\begin{aligned} p+\varepsilon = n_\mathrm {n}\mu _\mathrm {n}+ n_\mathrm {p}(\mu _\mathrm {p}+\mu _\mathrm{e}) \end{aligned}$$
(2.17)

Next, we need to consider the issue of chemical equilibrium. For the case under consideration this would involve the system being such that the Urca reactions are in balance. In essence, this means that we have

$$\begin{aligned} \beta \equiv \mu _\mathrm {n}- (\mu _\mathrm {p}+ \mu _\mathrm{e}) = 0 . \end{aligned}$$
(2.18)

This condition determines how many neutrons we need per proton, which means that the composition is specified. In general, we can rewrite the thermodynamical relation asFootnote 9

$$\begin{aligned} p+\varepsilon = n \mu _\mathrm {n}- n_\mathrm {p}\beta , \end{aligned}$$
(2.19)

where we have introduced the baryon number density \(n= n_\mathrm {n}+n_\mathrm {p}\). Assuming equilibrium, this leads to

$$\begin{aligned} p = n \mu _\mathrm {n}(n) - \varepsilon (n) ; \end{aligned}$$
(2.20)

that is, we have a one-parameter equation of state. It is common to think of the equation of state in this way—the pressure is provided as a function of the (baryon number) density.
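To see how the equilibrium condition pins down the composition in practice, the sketch below solves (2.18) for npe matter using a deliberately schematic parameterization: the quadratic symmetry-energy expansion \(\mu _\mathrm {n}- \mu _\mathrm {p}= 4 S(n) (1 - 2 x_\mathrm {p})\), with proton fraction \(x_\mathrm {p}= n_\mathrm {p}/n\), together with ultra-relativistic electrons, \(\mu _\mathrm{e} = \hbar c (3\pi ^2 n_\mathrm{e})^{1/3}\). The values of \(S_0\), \(\gamma \) and \(n_0\) are assumptions made purely for illustration (a realistic calculation would use a tabulated equation of state), but the resulting proton fraction of a few percent near saturation density is representative.

```python
# Schematic beta-equilibrium solve for npe matter (illustrative numbers only).
import numpy as np
from scipy.optimize import brentq

hbarc = 197.327          # MeV fm
n0 = 0.16                # nuclear saturation density [fm^-3]
S0, gamma = 32.0, 0.6    # symmetry-energy parameters (assumed)

def beta(xp, n):
    """beta = mu_n - (mu_p + mu_e) of Eq. (2.18), in MeV."""
    S = S0 * (n / n0)**gamma
    mu_np = 4.0 * S * (1.0 - 2.0 * xp)                      # mu_n - mu_p
    mu_e = hbarc * (3.0 * np.pi**2 * n * xp)**(1.0 / 3.0)   # ultra-relativistic electrons
    return mu_np - mu_e

for n in (0.5 * n0, n0, 2.0 * n0, 4.0 * n0):
    xp = brentq(beta, 1e-6, 0.5, args=(n,))   # equilibrium proton fraction
    print(f"n = {n / n0:3.1f} n0  ->  x_p = {xp:.3f}")
```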

Many formulations for numerical simulations take this “barotropic” model as the starting point. The usual logic works (in some sense) “backwards” by focussing on the mass density and separating out the mass density contribution to the chemical potential by introducing \(\rho = mn\) where m is the baryon mass. That is, we use

$$\begin{aligned} \mu _\mathrm {n}= m+ \overline{\mu }. \end{aligned}$$
(2.21)

This expression reflects the simple fact that the (rest) mass of a particle in isolation should be \(mc^2\), leaving the (to some extent) unknown aspects of the many-body interactions to be encoded in \(\overline{\mu }\). This allows us to write

$$\begin{aligned} p = \rho + (n\overline{\mu }- \varepsilon ) = n\overline{\mu }- \rho \epsilon , \end{aligned}$$
(2.22)

where \(\epsilon \) represents the (specific) internal energy. Numerical efforts often focus on \(\epsilon \). The reason for this will become clear shortly. First, it is easy to see that we also have

$$\begin{aligned} \varepsilon = \rho (1+\epsilon ) , \end{aligned}$$
(2.23)

since

$$\begin{aligned} \overline{\mu }= {d (\rho \epsilon ) \over dn}. \end{aligned}$$
(2.24)

It is also useful to note that

$$\begin{aligned} d\epsilon = {p\over \rho ^2} d\rho . \end{aligned}$$
(2.25)
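One way to see where this comes from: at zero temperature the first law (2.12) gives \(d\varepsilon = \mu _\mathrm {n}\, dn\), so combining this with (2.21) and (2.23) we have

$$\begin{aligned} (1+\epsilon ) \, d\rho + \rho \, d\epsilon = \left( 1 + {\overline{\mu }\over m} \right) d\rho \quad \Longrightarrow \quad \rho \, d\epsilon = \left( {\overline{\mu }\over m} - \epsilon \right) d\rho = {p \over \rho } \, d\rho , \end{aligned}$$

where the final step uses \(p = n\overline{\mu }- \rho \epsilon \) from (2.22).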

Let us now see what happens when we try to account for additional aspects, like the effects due to a finite temperature. Assuming that we are comfortable working with the chemical potential (as we will do throughout much of this review) the natural starting point would be (2.12). However, it could be that we would prefer to extend the discussion using the internal energy. In that case, we first of all need to convince ourselves that (2.22) and (2.23) remain valid when \(\varepsilon = \varepsilon (n,s)\). We then have \(\epsilon =\epsilon (\rho ,s)\), which leads to

$$\begin{aligned} {\partial \epsilon \over \partial s} = {T \over \rho } \end{aligned}$$
(2.26)

and we find that

$$\begin{aligned} d \epsilon = {p\over \rho ^2} d\rho + {T\over \rho } ds - {sT \over \rho ^2} d \rho = {p\over \rho ^2} d\rho + T d \hat{s} \end{aligned}$$
(2.27)

where we have introduced the specific entropy

$$\begin{aligned} \hat{s} = {s\over \rho } . \end{aligned}$$
(2.28)

If we want to progress beyond this point, we need to provide the form for the internal energy. This requires a finite temperature treatment on the microphysics level, as discussed by (for example) Constantinou et al. (2015) and Lattimer and Prakash (2016).

Before we move on, it is useful to note that many numerical simulations have been based on implementing a pragmatic result drawn from the ideal gas law

$$\begin{aligned} p = n k_B T , \end{aligned}$$
(2.29)

where \(k_B\) is Boltzmann’s constant. Noting that this model leads to \(\epsilon = C_v T\), with \(C_v\) the specific heat capacity (at fixed volume), Mayer’s relation

$$\begin{aligned} {k_B\over C_v} = m(\varGamma -1), \end{aligned}$$
(2.30)

where \(\varGamma \) is the adiabatic index, leads to

$$\begin{aligned} p = \rho \epsilon (\varGamma -1). \end{aligned}$$
(2.31)

For obvious reasons this is commonly referred to as the Gamma-law equation of state. It may not be particularly realistic—at least not for neutron stars—but it is simple (and relatively easy to implement). It also provides a straightforward measure of the temperature. Combining (2.29) and (2.31) we arrive at

$$\begin{aligned} T = {m\epsilon \over k_B} (\varGamma -1) = {m\over k_B} {p\over \rho }. \end{aligned}$$
(2.32)

This is useful, but we need to be careful with this result. In a more general setting—like a multi-constituent system for which the ideal gas law argument is dubious—we are not quantifying the actual temperature. This would require use of the relevant physics from the beginning of the argument rather than at the end. However, sometimes you have to accept a bit of pragmatism as the price of progress.
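As a rough numerical illustration, the sketch below evaluates (2.32) in cgs units; the chosen specific internal energy (one per cent of the rest-mass energy) is simply an assumed value, representative of the conditions reached in neutron-star merger simulations.

```python
# Order-of-magnitude temperature from the Gamma-law result (2.32).
m_b = 1.675e-24        # baryon (neutron) mass [g]
k_B = 1.381e-16        # Boltzmann constant [erg/K]
c = 2.998e10           # speed of light [cm/s]

Gamma = 5.0 / 3.0                 # assumed adiabatic index
epsilon = 0.01 * c**2             # specific internal energy: 1% of the rest-mass energy

T = (m_b / k_B) * epsilon * (Gamma - 1.0)     # Eq. (2.32)
print(f"T ~ {T:.2e} K  (about {T * k_B / 1.602e-6:.0f} MeV)")
```

For these numbers the estimate comes out at roughly \(7\times 10^{10}\) K, or about 6 MeV.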

Up to this point, we have separated the microphysics (determining the equation of state) from the hydrodynamics (governing stellar oscillations and the like). Let us now consider the scale associated with fluid dynamics. For ordinary matter, the relevant scale is set by interparticle collisions. Collisions tend to dissipate relative motion, leading to the system reaching (local dynamical and thermodynamical) equilibrium. Since we want to associate a single “velocity” with each fluid element, the particles must be able to equilibrate in a meaningful sense (e.g., have a velocity distribution with a well defined peak, allowing us to average over the system). The relevant length-scale is the mean-free path. This concept is closely related to the shear viscosity of matter (which arises due to particle scattering). In the case of neutrons (which dominate the outer core of a typical neutron star) we would have

$$\begin{aligned} \lambda \approx { \eta \over \rho v_F} \approx 10^{-4} \left( {\rho \over 10^{14}\ \text{ g/cm}^3} \right) ^{11/12} \left( {10^8 \ \text{ K } \over T}\right) ^2 \ \text{ cm } , \end{aligned}$$
(2.33)

where \(v_F\) is the relevant Fermi velocity and we have used the estimate for the neutron-neutron scattering shear viscosity \(\eta \) from Andersson et al. (2005). This estimate gives us an idea of the smallest scale on which it makes sense to consider the system as a fluid. Notably, the mean-free path is many orders of magnitude larger than the interparticle separation (typically, the Fermi scale). The actual scale assumed in a fluid model typically depends on the problem one wants to study and tends to be limited by computational resources. For example, in current state of the art simulations of neutron star mergers, the computational fluid elements tend to be of order a few tens to perhaps a hundred meters across. They are in no sense microscopic entities. It is important to appreciate that these models involve a significant amount of “extrapolation”.
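The strong temperature dependence of this estimate is worth noting. A trivial numerical sketch, simply evaluating the scaling (2.33) for a few representative outer-core conditions:

```python
# Order-of-magnitude neutron mean-free path from Eq. (2.33).
def neutron_mfp_cm(rho, T):
    """rho in g/cm^3, T in K; returns the mean-free path in cm."""
    return 1e-4 * (rho / 1e14)**(11.0 / 12.0) * (1e8 / T)**2

for rho in (1e14, 5e14):
    for T in (1e8, 1e9):
        print(f"rho = {rho:.0e} g/cm^3, T = {T:.0e} K  ->  lambda ~ {neutron_mfp_cm(rho, T):.1e} cm")
```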

Assuming that the averaging procedure makes sense (we will have more to say about this later), the equations of hydrodynamics can be obtained from a set of (more or less) phenomenological balance laws representing the conservation (or not...) of the key quantities. The possibility that different fluid components may be able to flow (or perhaps rather “drift”) relative to one another, leads to a multi-fluid system. In order to model such systems we assume that the system contains a number of distinguishable components, the dynamics of which are coupled. The formalism that we will develop draws on experience from chemistry, where one regularly has to consider the mechanics of mixtures, but is adapted to the kind of systems that are relevant for General Relativity. The archetypal such system is (again) represented by the neutron star core, where we expect different components (neutrons, protons, hyperons) to be in a superfluid state. However, the formalism is general enough that it can be applied in a variety of contexts, including (as we shall see later) the problem of heat conduction and the charged flows relevant for electromagnetism.

As the concept may not be familiar, it is worth considering the notion of a multi-fluid system in a bit more detail before we move on. In principle, it is easy to see how such a system may arise. Recall the discussion of the mean-free path, but consider a system with two distinct particle species. Suppose that the mean-free path associated with scattering of particles of the same kind is (for some reason) significantly shorter than the scale for inter-species collisions. Then we have two clearly defined “fluids”. In fact, any system where it is meaningful to consider one component drifting (on average) relative to another one can be considered from this point of view (a liquid with gas bubbles would be an obvious example).

Another relevant context involves systems that exhibit superfluidity. At the most basic level, superfluidity implies that no friction impedes the flow. Technically, the previous argument leading to a scale for averaging does not work anymore. However, a superfluid system has a different scale associated with it: the so-called coherence length. The coherence length arises from the fact that a superfluid is a “macroscopic” quantum state, the flow of which depends on the gradient of the phase of the wave-function (the so-called order parameter, see Sect. 13.1). On some small scale, the superfluidity breaks down due to quantum fluctuations. This defines the coherence length. It can be taken as the typical “size” of a Cooper pair in a fermionic system. On any larger scale the system exhibits collective (fluid) behaviour.

For neutron-star superfluids, the coherence length is of the order of tens of Fermi, evidently much smaller than the mean-free path in the normal fluid case. This means that superfluids can exhibit extremely small-scale dynamics. Since a superfluid is inviscid, superfluid neutrons and superconducting protons (say) do not scatter (at least not as long as thermal excitations can be ignored) and hence the outer core of a neutron star demands a multi-fluid treatment (Glampedakis et al. 2011). One can meaningfully take the fluid elements to have a size of the order of the coherence length, i.e., they are tiny. However, in reality the problem is more complicated, as yet another length-scale needs to be considered. First of all, on scales larger than the Debye screening length, the electrons will be electromagnetically locked to the protons, forming a charge-neutral conglomerate that does exhibit friction (due to electron-electron scattering). This brings us back to the mean-free path argument. At finite temperatures we also need to consider thermal excitations for both neutrons and protons (which may scatter and dissipate), making the problem rather complex. Finally, ideal superfluids are irrotational and neutron stars are not. In order to mimic bulk rotation the neutron superfluid must form a dense array of vortices (locally breaking the superfluidity). This brings yet another length scale into the picture. In order to develop a useful fluid model, we need to average over the vortices, as well. This makes the effective fluid elements much larger. The typical vortex spacing in a neutron star is of the order of

$$\begin{aligned} d_\mathrm {n}\approx 4\times 10^{-4} \left( {P \over 1\ \text{ ms }} \right) ^{1/2} \ \text{ cm } , \end{aligned}$$
(2.34)

where P is the star’s spin period. In other words, the fluid elements we consider may (at the end of the day) be quite large also in a superfluid system.

3 Physics in a curved spacetime

There is an extensive literature on Special and General Relativity and the spacetime-based viewFootnote 10 of the laws of physics, providing historical context, technical insight and topical updates. For a student at any level interested in developing a working understanding we recommend Taylor and Wheeler (1992) for an introduction, followed by Hartle’s excellent text (2003) designed for students at the undergraduate level. The recent contribution from Poisson and Will (2014) provides a detailed discussion of the link between Newtonian gravity and Einstein’s four dimensional picture. For more advanced students, we suggest two of the classics, “MTW” (Misner et al. 1973) and Weinberg (1972), or the more contemporary book by Wald (1984). Finally, let us not forget the Living Reviews journal as a premier online source of up-to-date information!

In terms of the experimental and/or observational support for Special and General Relativity, we recommend two articles by Will that were written for the 2005 World Year of Physics celebration (2005, 2006). They summarize a variety of tests that have been designed to expose breakdowns in both theories. (We also recommend Will’s popular book Was Einstein Right? (1986) and his technical exposition Theory and Experiment in Gravitational Physics (1993).) Updates including the breakthrough observations of gravitational waves can be found in recent monographs (Maggiore 2018; Andersson 2019). There have been significant recent developments, but... to date, Einstein’s theoretical edifice is still standing!

For Special Relativity, this is not surprising, given its long list of successes: explanation of the Michelson–Morley result, the prediction and subsequent discovery of anti-matter, and the standard model of particle physics, to name a few. Will (2006) offers the observation that genetic mutations via cosmic rays require Special Relativity, since otherwise muons would decay before making it to the surface of the Earth. On a more somber note, we may consider the Trinity site in New Mexico, and the tragedies of Hiroshima and Nagasaki, as reminders of \(E = m c^2\).

In support of General Relativity, there are Eötvös-type experiments testing the equivalence of inertial and gravitational mass, detection of gravitational red-shifts of photons, the passing of the solar system tests, confirmation of energy loss via gravitational radiation in the Hulse–Taylor binary pulsar—and eventually the first direct detection of these faint whispers from the Universe in 2015—and the expansion of the Universe. Incredibly, General Relativity even finds a practical application in the GPS system. In fact, we need both of Einstein’s theories. The speed of the moving clock leads to it slowing down by 7 micro-seconds every day, while the fact that a clock in a gravitational field runs slow leads to the orbiting clock appearing to speed up by 45 micro-seconds each day. All in all, if we ignore relativity, position errors accumulate at a rate of about 10 km every day (Will 2006). This would make reliable navigation impossible.

The evidence is overwhelming that General Relativity, or at least some closely related theory that passes the entire collection of tests, is the proper description of gravity. Given this, we assume the Einstein Equivalence Principle, i.e., that (Will 2006, 2005, 1993)

  • test bodies fall with the same acceleration independently of their internal structure or composition;

  • the outcome of any local non-gravitational experiment is independent of the velocity of the freely-falling reference frame in which it is performed;

  • the outcome of any local non-gravitational experiment is independent of where and when in the Universe it is performed.

If the Equivalence Principle holds, then gravitation must be described by a metric-based theory (Will 2006). This means that

  1. spacetime is endowed with a symmetric metric,

  2. the trajectories of freely falling bodies are geodesics of that metric, and

  3. in local freely falling reference frames, the non-gravitational laws of physics are those of Special Relativity.

For our present purposes this is very good news. The availability of a metricFootnote 11 means that we can develop the theory without requiring much of the differential geometry edifice that would be needed in a more general case. We will develop the description of relativistic fluids with this in mind. Readers that find our approach too “pedestrian” may want to consult the article by Gourgoulhon (2006), which serves as a useful complement to our description.

3.1 The metric and spacetime curvature

Our strategy is to provide a “working understanding” of the mathematical objects that enter the Einstein equations of General Relativity. We assume that the metric is the fundamental “field” of gravity. For a four-dimensional spacetime the metric determines the distance between two spacetime points along a given curve, which can generally be written as a one parameter function with, say, components \(x^a(\tau )\). For a material body, it is natural to take the parameter to be proper time, but we may opt to make a different choice. As we will see, once a notion of parallel transport is established, the metric also encodes information about the curvature of spacetime, which is taken to be pseudo-Riemannian, meaning that the signatureFootnote 12 of the metric is \(-+++\) (cf. Eq. (3.2) below).

In a coordinate basis, which we will assume throughout this review, the metric is denoted by \(g_{a b} = g_{b a}\). The symmetry implies that there are in general ten independent components (modulo the freedom to set arbitrarily four components that is inherited from coordinate transformations; cf. Eqs. (3.8) and (3.9) below). The spacetime version of the Pythagorean theorem takes the form

$$\begin{aligned} d s^2 = g_{a b} \, d x^a \, d x^b , \end{aligned}$$
(3.1)

and in a local set of Minkowski coordinates \(\{t,x,y,z\}\) (i.e., in a local inertial frame, or small patch of the manifold) it looks like

$$\begin{aligned} d s^2 = - \left( d t \right) ^2 + \left( d x \right) ^2 + \left( dy \right) ^2 + \left( dz \right) ^2. \end{aligned}$$
(3.2)

This illustrates the \(-+++\) signature. The inverse metric \(g^{a b}\) is such that

$$\begin{aligned} g^{a c} g_{c b} = \delta ^a{}_b, \end{aligned}$$
(3.3)

where \(\delta ^a{}_b\) is the unit tensor. The metric is also used to raise and lower spacetime indices, i.e., if we let \(V^a\) denote a contravariant vector, then its associated covariant vector (also known as a covector or one-form) \(V_a\) is obtained as

$$\begin{aligned} V_a = g_{a b} V^b \qquad \Leftrightarrow \qquad V^a = g^{a b} V_b . \end{aligned}$$
(3.4)

We can now consider three different classes of curves: timelike, null, and spacelike. A vector is said to be timelike if \(g_{a b} V^a V^b < 0\), null if \(g_{a b} V^a V^b = 0\), and spacelike if \(g_{a b} V^a V^b > 0\). We can naturally define timelike, null, and spacelike curves in terms of the congruence of tangent vectors that they generate. A particularly useful timelike curve for fluids is one that is parameterized by the so-called proper time, i.e., \(x^a(\tau )\) where

$$\begin{aligned} d \tau ^2 = - d s^2. \end{aligned}$$
(3.5)

The tangent \(u^a\) to such a curve has unit magnitude; specifically,

$$\begin{aligned} u^a \equiv \frac{dx^a}{d\tau }, \end{aligned}$$
(3.6)

and thus

$$\begin{aligned} g_{a b} u^a u^b = g_{a b} \frac{d x^a}{d \tau } \frac{d x^b}{d\tau } = \frac{d s^2}{d \tau ^2} = - 1. \end{aligned}$$
(3.7)
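As a simple sanity check on these definitions, the numpy sketch below works in a local Minkowski frame (3.2), lowers an index as in (3.4), and verifies the normalization (3.7) for a four-velocity constructed from an (arbitrarily chosen) coordinate three-velocity:

```python
# Index gymnastics in a local Minkowski frame (units with c = 1).
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])       # Eq. (3.2): -+++ signature

def lower(V):
    return g @ V                          # V_a = g_ab V^b, Eq. (3.4)

def norm(V):
    return V @ lower(V)                   # g_ab V^a V^b

v = np.array([0.3, 0.2, 0.1])             # assumed three-velocity (units of c)
W = 1.0 / np.sqrt(1.0 - v @ v)            # Lorentz factor
u = W * np.array([1.0, *v])               # four-velocity u^a

print(norm(u))                                   # -> -1, as in Eq. (3.7): timelike
print(norm(np.array([1.0, 1.0, 0.0, 0.0])))      # ->  0: a null vector
```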

Under a coordinate transformation \(x^a \rightarrow \overline{x}^a\), contravariant vectors transform as

$$\begin{aligned} \overline{V}^a = \frac{\partial \overline{x}^a}{\partial x^b} V^b \end{aligned}$$
(3.8)

and covariant vectors as

$$\begin{aligned} \overline{V}_a = \frac{\partial x^b}{\partial \overline{x}^a} V_b . \end{aligned}$$
(3.9)

Tensors with a greater rank (i.e., a greater number of indices), transform similarly by acting linearly on each index using the above two rules.
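The transformation rule (3.8) is easy to exercise explicitly. The sympy sketch below takes a constant vector field, with Cartesian components (1, 0, 0), and computes its components in spherical coordinates; the fact that these components vary from point to point (even though the vector field itself does not) foreshadows the discussion of parallel transport in Sect. 3.2. The choice of coordinates and vector is, of course, just an example.

```python
# Transforming a contravariant vector, Eq. (3.8), from Cartesian to spherical coordinates.
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

# spherical coordinates (r, theta, phi) expressed in terms of (x, y, z)
r = sp.sqrt(x**2 + y**2 + z**2)
sph = sp.Matrix([r, sp.acos(z / r), sp.atan2(y, x)])

J = sph.jacobian([x, y, z])          # the matrix d xbar^a / d x^b

V = sp.Matrix([1, 0, 0])             # constant vector field along the x-direction
Vbar = sp.simplify(J * V)            # Eq. (3.8): spherical components
print(Vbar)                          # position-dependent, even though V is constant
```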

When integrating, as we have to when we discuss conservation laws for fluids, we must make use of an appropriate measure that ensures the coordinate invariance of the integration. In the context of three-dimensional Euclidean space this measure is referred to as the Jacobian. For spacetime, we use the so-called volume form \(\epsilon _{abcd}\). It is completely antisymmetric, and for four-dimensional spacetime, it has only one independent component, which is

$$\begin{aligned} \epsilon _{0 1 2 3} = \sqrt{- g} \qquad \text{ and } \qquad \epsilon ^{0 1 2 3} = \frac{1}{\sqrt{- g}}, \end{aligned}$$
(3.10)

where g is the determinant of the metric (cf. Appendix 1 for details). The minus sign is required under the square root because of the metric signature. By contrast, for three-dimensional Euclidean space (i.e., when considering the fluid equations in the Newtonian limit) we have

$$\begin{aligned} \epsilon _{1 2 3} = \sqrt{g} \qquad \text{ and } \qquad \epsilon ^{1 2 3} = \frac{1}{\sqrt{g}}, \end{aligned}$$
(3.11)

but now g is the determinant of the three-dimensional space metric. A general identity that is extremely useful for writing the fluid vorticity in three-dimensional, Euclidean space—using lower-case Latin indices and setting \(s = 0\), \(n = 3\) and \(j = 1\) in Eq. (A.2) of Appendix 1—is

$$\begin{aligned} \epsilon ^{m i j} \epsilon _{m k l} = \delta ^i{}_k \delta ^j{}_l - \delta ^j{}_k \delta ^i{}_l. \end{aligned}$$
(3.12)

The general identities in Eqs. (A.1)–(A.3) of Appendix 1 will be frequently used in the following.
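Identities of this kind are easy to get wrong by a sign, so a brute-force check can be reassuring. The numpy sketch below constructs the three-dimensional Levi-Civita symbol and verifies (3.12) by direct contraction (in flat space with Cartesian coordinates, so the index positions do not matter):

```python
# Brute-force check of the identity (3.12) for the 3D Levi-Civita symbol.
import numpy as np
from itertools import permutations

eps = np.zeros((3, 3, 3))
for perm in permutations(range(3)):
    i, j, k = perm
    # sign of the permutation = determinant of the corresponding permutation matrix
    eps[i, j, k] = np.sign(np.linalg.det(np.eye(3)[list(perm)]))

delta = np.eye(3)
lhs = np.einsum('mij,mkl->ijkl', eps, eps)
rhs = np.einsum('ik,jl->ijkl', delta, delta) - np.einsum('jk,il->ijkl', delta, delta)
print(np.allclose(lhs, rhs))   # -> True
```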

3.2 Parallel transport and the covariant derivative

In order to have a generally covariant prescription for fluids—in terms of spacetime tensors—we must have a notion of derivative \(\nabla _a\) that is itself covariant. For example, when \(\nabla _a\) acts on a vector \(V^a\) a rank-two tensor of mixed indices must result:

$$\begin{aligned} \overline{\nabla }_b \overline{V}^a = \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial \overline{x}^a}{\partial x^d} \nabla _c V^d . \end{aligned}$$
(3.13)

The ordinary partial derivative does not work because under a general coordinate transformation

$$\begin{aligned} \frac{\partial \overline{V}^a}{\partial \overline{x}^b} = \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial \overline{x}^a}{\partial x^d} \frac{\partial V^d}{\partial x^c} + \frac{\partial x^c}{\partial \overline{x}^b} \frac{\partial ^2 \overline{x}^a}{\partial x^c \partial x^d} V^d . \end{aligned}$$
(3.14)

The second term spoils the general covariance, since it vanishes only for the restricted set of rectilinear transformations

$$\begin{aligned} \overline{x}^a = a^a{}_b x^b + b^a , \end{aligned}$$
(3.15)

where \(a^a{}_b\) and \(b^a\) are constants. Note that this includes the Lorentz transformation of Special Relativity.

For both physical and mathematical reasons, one expects a covariant derivative to be defined in terms of a limit. This is, however, a bit problematic. In three-dimensional Euclidean space limits can be defined uniquely as vectors can be moved around without their length and direction changing, for instance, via the use of Cartesian coordinates (the \(\{{\varvec{i}},{\varvec{j}},{\varvec{k}}\}\) set of basis vectors) and the usual dot product. Given these limits, those corresponding to more general curvilinear coordinates can be established. The same is not true for curved spaces and/or spacetimes because they do not have an a priori notion of parallel transport.

Consider the classic example of a vector on the surface of a sphere (illustrated in Fig. 5). Take this vector and move it along some great circle from the equator to the North pole in such a way as to always keep the vector pointing along the circle. Pick a different great circle, and without allowing the vector to rotate, by forcing it to maintain the same angle with the locally straight portion of the great circle that it happens to be on, move it back to the equator. Finally, move the vector in a similar way along the equator until it gets back to its starting point. The vector’s spatial orientation will be different from its original direction, and the difference is directly related to the particular path that the vector followed.

On the other hand, we could consider the sphere to be embedded in a three-dimensional Euclidean space, and let the two-dimensional vector on the sphere result from projection of a three-dimensional vector. Then we move the projection so that its higher-dimensional counterpart always maintains the same orientation with respect to its original direction in the embedding space. When the projection returns to its starting place it will have exactly the same orientation as it started out with (see Fig. 5). It is now clear that a derivative operation that depends on comparing a vector at one point to that of a nearby point is not unique, because it depends on the choice of parallel transport.

Pauli (1981) notes that Levi-Civita (1917) is the first to have formulated the concept of parallel “displacement”, with Weyl (1952) generalizing it to manifolds that do not have a metric. The point of view expounded in the books of Weyl and Pauli is that parallel transport is best defined as a mapping of the “totality of all vectors” that “originate” at one point of a manifold with the totality at another point. (In modern texts, this discussion tends to be based on fiber bundles.) Pauli points out that we cannot simply require equality of vector components as the mapping.

Let us examine the parallel transport of the force-free, point particle velocity in Euclidean three-dimensional space as a means for motivating the form of the mapping. As the velocity is constant, we know that the curve traced out by the particle will be a straight line. In fact, we can turn this around and say that the velocity parallel transports itself because the path traced out is a geodesic (i.e., the straightest possible curve allowed by Euclidean space). In our analysis we will borrow liberally from the excellent discussion of Lovelock and Rund (1989). Their text is comprehensive yet readable for anyone not well-versed with differential geometry. Finally, we note that this analysis will be relevant later when we consider the Newtonian limit of the relativistic equations, in an arbitrary coordinate basis.

Fig. 5

A schematic illustration of two possible versions of parallel transport. In the first case (a) a vector is transported along great circles on the sphere locally maintaining the same angle with the path. If the contour is closed, the final orientation of the vector will differ from the original one. In case (b) the sphere is considered to be embedded in a three-dimensional Euclidean space, and the vector on the sphere results from projection. In this case, the vector returns to the original orientation for a closed contour

We are all well aware that the points on the curve traced out by the particle can be described, in Cartesian coordinates, by three functions \(x^i(t)\) where t is the universal Newtonian time. Likewise, we know that the tangent vector at each point of the curve is given by the velocity components \(v^i(t) = d x^i/d t\), and that the force-free condition is equivalent to

$$\begin{aligned} a^i(t) = \frac{dv^i}{d t} = 0 \qquad \Rightarrow \qquad v^i(t) = \mathrm {const}. \end{aligned}$$
(3.16)

Hence, the velocity components \(v^i(0)\) at the point \(x^i(0)\) are equal to those at any other point along the curve, say \(v^i(T)\) at \(x^i(T)\), and so we could simply take \(v^i(0) = v^i(T)\) as the mapping. But as Pauli warns, we only need to reconsider this example using spherical coordinates to see that the velocity components \(\{\dot{r},\dot{\theta },\dot{\phi }\}\) must change as they undergo parallel transport along a straight-line path (assuming the particle does not pass through the origin). The question is what should be used in place of component equality? The answer follows once we find a curvilinear coordinate version of \(dv^i/dt = 0\).

What we need is a new “time” derivative \(\overline{D}/d t\) that yields a generally covariant statement

$$\begin{aligned} \frac{\overline{D} \overline{v}^i}{d t} = 0, \end{aligned}$$
(3.17)

where the \(\overline{v}^i(t) = d \overline{x}^i/d t\) are the velocity components in a curvilinear system of coordinates. Consider now a coordinate transformation to the new coordinate system \(\overline{x}^i\), the inverse being \(x^i = x^i(\overline{x}^j)\). Given that

$$\begin{aligned} v^i = \frac{\partial x^i}{\partial \overline{x}^j} \overline{v}^j \end{aligned}$$
(3.18)

we can write

$$\begin{aligned} \frac{d v^i}{d t} = \left( \frac{\partial x^i}{\partial \overline{x}^j} \frac{\partial \overline{v}^j}{\partial \overline{x}^k} + \frac{\partial ^2 x^i}{\partial \overline{x}^k \partial \overline{x}^j} \overline{v}^j \right) \overline{v}^k, \end{aligned}$$
(3.19)

where

$$\begin{aligned} \frac{d \overline{v}^i}{d t} = \frac{\partial \overline{v}^i}{\partial \overline{x}^j} \overline{v}^j. \end{aligned}$$
(3.20)

Again, we have an “offending” term that vanishes only for rectilinear coordinate transformations. However, we are now in a position to show the importance of this term to the definition of the covariant derivative.

First note that the metric \(\overline{g}_{i j}\) for our curvilinear coordinate system is obtained from

$$\begin{aligned} \overline{g}_{i j} = \frac{\partial x^k}{\partial \overline{x}^i} \frac{\partial x^l}{\partial \overline{x}^j} \delta _{k l}, \end{aligned}$$
(3.21)

where

$$\begin{aligned} \delta _{i j} = \left\{ \begin{array}{ll} 1 &{} \qquad \mathrm {for }\, i = j, \\ 0 &{} \qquad \mathrm {for }\, i \ne j. \end{array} \right. \end{aligned}$$
(3.22)
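
As a concrete illustration of Eq. (3.21), the short symbolic computation below (a minimal sketch using the Python sympy library; the spherical coordinates are our own choice of example) contracts the flat Cartesian metric with the Jacobian of the coordinate transformation and recovers the familiar components \(\mathrm {diag}(1, r^2, r^2\sin ^2\theta )\).

    import sympy as sp

    r, th, ph = sp.symbols('r theta phi', positive=True)

    # Cartesian coordinates expressed in terms of the curvilinear (spherical) ones
    x = r*sp.sin(th)*sp.cos(ph)
    y = r*sp.sin(th)*sp.sin(ph)
    z = r*sp.cos(th)

    cart = [x, y, z]
    curv = [r, th, ph]
    delta = sp.eye(3)   # the Cartesian metric of Eq. (3.22)

    # Eq. (3.21): g_ij = (dx^k/dX^i) (dx^l/dX^j) delta_kl
    g = sp.Matrix(3, 3, lambda i, j: sum(
        sp.diff(cart[k], curv[i])*sp.diff(cart[l], curv[j])*delta[k, l]
        for k in range(3) for l in range(3)))

    print(sp.simplify(g))   # the result is diag(1, r**2, r**2*sin(theta)**2)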

Differentiating Eq. (3.21) with respect to \(\overline{x}\), and permuting indices, we can show that

$$\begin{aligned} \frac{\partial ^2 x^h}{\partial \overline{x}^i \partial \overline{x}^j} \frac{\partial x^l}{\partial \overline{x}^k} \delta _{h l} = \frac{1}{2} \left( \overline{g}_{i k,j} + \overline{g}_{j k,i} - \overline{g}_{i j,k} \right) \equiv \overline{g}_{i l} \overline{ \left\{ \scriptstyle {\begin{array}{c} l \\ j~k \end{array}} \right\} }, \end{aligned}$$
(3.23)

where we use commas to indicate partial derivatives:

$$\begin{aligned} \overline{g}_{i j, k} \equiv \frac{\partial \overline{g}_{i j}}{\partial \overline{x}^k}. \end{aligned}$$
(3.24)

Using the inverse transformation of \(\overline{g}_{i j}\) to \(\delta _{i j}\) implied by Eq. (3.21), and the fact that

$$\begin{aligned} \delta ^i{}_j = \frac{\partial \overline{x}^k}{\partial x^j} \frac{\partial x^i}{\partial \overline{x}^k}, \end{aligned}$$
(3.25)

we get

$$\begin{aligned} \frac{\partial ^2 x^i}{\partial \overline{x}^j \partial \overline{x}^k} = \overline{\left\{ \scriptstyle {\begin{array}{c} l \\ j~k \end{array}}\right\} } \frac{\partial x^i}{\partial \overline{x}^l}. \end{aligned}$$
(3.26)

Now we substitute Eq. (3.26) into Eq. (3.19) and find

$$\begin{aligned} \frac{d v^i}{d t} = \frac{\partial x^i}{\partial \overline{x}^j} \frac{\overline{{D}} \overline{v}^j}{d t}, \end{aligned}$$
(3.27)

where

$$\begin{aligned} \frac{\overline{{D}} \overline{v}^i}{{d} t} = \overline{v}^j \left( \frac{\partial \overline{v}^i}{\partial \overline{x}^j} + \overline{\left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}} \right\} } \overline{v}^k \right) . \end{aligned}$$
(3.28)

The operator \(\overline{{D}}/{d} t\) is easily seen to be covariant with respect to general transformations of curvilinear coordinates.

We now identify the generally covariant derivative (dropping the overline) as

$$\begin{aligned} \nabla _j v^i = \frac{\partial v^i}{\partial x^j} + \left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}}\right\} v^k \equiv v^i{}_{; j}. \end{aligned}$$
(3.29)

Similarly, the covariant derivative of a covector is

$$\begin{aligned} \nabla _j v_i = \frac{\partial v_i}{\partial x^j} - \left\{ \scriptstyle {\begin{array}{c} k \\ i~j \end{array}}\right\} v_k \equiv v_{i ; j}. \end{aligned}$$
(3.30)

One extends the covariant derivative to higher rank tensors by adding to the partial derivative each term that results from acting linearly on each index with \(\left\{ \scriptstyle {\begin{array}{c} i \\ j~k \end{array}}\right\} \) using the two rules given above.

Relying on our understanding of the force-free point particle, we have built a notion of parallel transport that is consistent with our intuition based on equality of components in Cartesian coordinates. We can now expand this intuition to see how the vector components in a curvilinear coordinate system must change under an infinitesimal, parallel displacement from \(x^i(t)\) to \(x^i(t + \delta t)\). Setting Eq. (3.28) to zero, and noting that \(v^i \delta t = \delta x^i\), implies

$$\begin{aligned} \delta v^i \equiv \frac{\partial v^i}{\partial x^j} \delta x^j = - \left\{ \scriptstyle {\begin{array}{c} i \\ k~j \end{array}} \right\} v^k \delta x^j. \end{aligned}$$
(3.31)

In General Relativity we assume that under an infinitesimal parallel transport from a spacetime point \(x^a(\tau )\) on a given curve to a nearby point \(x^a(\tau + \delta \tau )\) on the same curve, the components of a vector \(V^a\) will change in an analogous way, namely

$$\begin{aligned} \delta V^a_\parallel \equiv \frac{\partial V^a}{\partial x^b} \delta x^b = - \varGamma ^a_{c b} V^c \delta x^b , \end{aligned}$$
(3.32)

where

$$\begin{aligned} \delta x^a \equiv \frac{d x^a}{{d} \tau } \delta \tau . \end{aligned}$$
(3.33)

Weyl (1952) refers to the symbol \(\varGamma ^a_{b c}\) as the “components of the affine relationship”, but we will use the modern terminology and call it the connection. In the language of Weyl and Pauli, this is the mapping that we were looking for.

For Euclidean space, we can verify that the metric satisfies

$$\begin{aligned} \nabla _i g_{j k} = 0 \end{aligned}$$
(3.34)

for a general, curvilinear coordinate system. The metric is thus said to be “compatible” with the covariant derivative. Metric compatibility is imposed as an assumption in General Relativity. This results in the so-called Christoffel symbol for the connection, defined as

$$\begin{aligned} \varGamma ^a_{b c} = \frac{1}{2} g^{a d} \left( g_{b d, c} + g_{c d, b} - g_{b c, d}\right) . \end{aligned}$$
(3.35)

The rules for the covariant derivative of a contravariant vector and a covector are the same as in Eqs. (3.29) and (3.30), except that all indices are spacetime ones.
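
For readers who like to check such statements explicitly, the following sketch (our own illustration in Python/sympy; the flat metric in spherical coordinates is just a convenient example) evaluates the Christoffel symbols of Eq. (3.35) and verifies the compatibility condition (3.34) component by component.

    import sympy as sp

    r, th, ph = sp.symbols('r theta phi', positive=True)
    X = [r, th, ph]
    n = 3

    # Flat space in spherical coordinates
    g = sp.diag(1, r**2, r**2*sp.sin(th)**2)
    ginv = g.inv()

    # Christoffel symbols, Eq. (3.35)
    def Gamma(a, b, c):
        return sp.simplify(sp.Rational(1, 2)*sum(
            ginv[a, d]*(sp.diff(g[b, d], X[c]) + sp.diff(g[c, d], X[b]) - sp.diff(g[b, c], X[d]))
            for d in range(n)))

    # Metric compatibility, Eq. (3.34):
    # nabla_i g_jk = g_jk,i - Gamma^l_{ji} g_lk - Gamma^l_{ki} g_jl = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                cov = sp.diff(g[j, k], X[i]) \
                    - sum(Gamma(l, j, i)*g[l, k] for l in range(n)) \
                    - sum(Gamma(l, k, i)*g[j, l] for l in range(n))
                assert sp.simplify(cov) == 0

    print(Gamma(0, 1, 1), Gamma(1, 0, 1))   # e.g. Gamma^r_{theta theta} = -r, Gamma^theta_{r theta} = 1/r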


3.3 The Lie derivative and spacetime symmetries

From the above discussion it should be evident that there are other ways to take derivatives in a curved spacetime. A particularly important tool for measuring changes in tensors from point to point in spacetime is the Lie derivative. It requires a vector field, but no connection, and is a more natural definition in the sense that it does not even require a metric. The Lie derivative yields a tensor of the same type and rank as the tensor on which the derivative operated (unlike the covariant derivative, which increases the rank by one). It is as important for Newtonian, non-relativistic fluids as for relativistic ones (a fact which needs to be continually emphasized as it has not yet permeated the fluid literature for chemists, engineers, and physicists). For instance, the classic papers on the gravitational-wave driven Chandrasekhar–Friedman–Schutz instability (Friedman and Schutz 1978a, b) in rotating stars are great illustrations of the use of the Lie derivative in Newtonian physics. We recommend the book by Schutz (1980) for a complete discussion and derivation of the Lie derivative and its role in Newtonian fluid dynamics (see also the series of papers by Carter and Chamel 2004, 2005a, b). Here, we will adapt the coordinate-based discussion of Schouten (1989), as it may be more readily understood by readers not well-versed in differential geometry.

In a first course on classical mechanics, when students encounter rotations, they are introduced to the idea of active and passive transformations. An active transformation would be to fix the origin and axis-orientations of a given coordinate system with respect to some external observer, and then move an object from one point to another point of the same coordinate system. A passive transformation would be to place an object so that it remains fixed with respect to some external observer, and then induce a rotation of the object with respect to a given coordinate system by rotating the coordinate system itself with respect to the external observer. We will derive the Lie derivative of a vector by first performing an active transformation and then following it with a passive transformation to determine how the final vector differs from its original form. In the language of differential geometry, we will first “push-forward” the vector, and then subject it to a “pull-back”.


In the active (push-forward) sense we imagine that there are two spacetime points connected by a smooth curve \(x^a(\lambda )\). Let the first point be at \(\lambda = 0\), and the second, nearby point at \(\lambda = \epsilon \), i.e., \(x^a(\epsilon )\); that is,

$$\begin{aligned} x^a_\epsilon \equiv x^a(\epsilon ) \approx x^a_0 + \epsilon \, \xi ^a , \end{aligned}$$
(3.36)

where \(x^a_0 \equiv x^a(0)\) and

$$\begin{aligned} \xi ^a = \left. \frac{dx^a}{{d} \lambda } \right| _{\lambda = 0} \end{aligned}$$
(3.37)

is the tangent to the curve at \(\lambda = 0\). In the passive (pull-back) sense we imagine that the coordinate system itself is changed to \(\overline{x}{}^a =\overline{x}{}^a(x^b)\), but in the very special form

$$\begin{aligned} \overline{x}{}^a = x^a - \epsilon \, \xi ^a . \end{aligned}$$
(3.38)

In this second step the Lie derivative differs from the covariant derivative. If we insert Eq. (3.36) into Eq. (3.38) we find the result \(\overline{x}{}^a_\epsilon = x^a_0\). This is called “Lie-dragging” of the coordinate frame, meaning that the coordinates at \(\lambda = 0\) are carried along so that at \(\lambda = \epsilon \) (and in the new coordinate system) the coordinate labels take the same numerical values.

Fig. 6

A schematic illustration of the Lie derivative. The coordinate system is dragged along with the flow, and one can imagine an observer “taking derivatives” as he/she moves with the flow (see the discussion in the text)

As an interesting aside it is worth noting that Arnold (1989)—only a little whimsically—refers to this construction as the “fisherman’s derivative”. He imagines a fisherman sitting in a boat on a river, “taking derivatives” as the boat moves along with the current. Let us now see how Lie-dragging reels in vectors.

For some given vector field that takes values \(V^a(\lambda )\), say, along the curve, we write

$$\begin{aligned} V^a_0 = V^a(0) \end{aligned}$$
(3.39)

for the value of \(V^a\) at \(\lambda = 0\) and

$$\begin{aligned} V^a_\epsilon = V^a(\epsilon ) \end{aligned}$$
(3.40)

for the value at \(\lambda = \epsilon \). Because the two points \(x^a_0\) and \(x^a_\epsilon \) are infinitesimally close (\(\epsilon \ll 1\)) we have

$$\begin{aligned} V^a_\epsilon \approx V^a_0 + \epsilon \, \xi ^b \left. \frac{\partial V^a}{\partial x^b} \right| _{\lambda = 0} \end{aligned}$$
(3.41)

for the value of \(V^a\) at the nearby point and in the same coordinate system. However, in the new coordinate system (at the nearby point) we find

$$\begin{aligned} \overline{V}{}^a_\epsilon = \left. \left( \frac{\partial \overline{x}{}^a}{\partial x^b} V^b\right) \right| _{\lambda = \epsilon } \approx V^a_\epsilon - \epsilon \, V^b_0 \left. \frac{\partial \xi ^a}{\partial x^b} \right| _{\lambda = 0}. \end{aligned}$$
(3.42)

The Lie derivative now is defined to be

$$\begin{aligned} \mathcal{L}_\xi V^a= & {} \lim _{\epsilon \rightarrow 0} \frac{\overline{V}{}^a_\epsilon - V^a}{\epsilon } \nonumber \\= & {} \xi ^b \frac{\partial V^a}{\partial x^b} - V^b \frac{\partial \xi ^a}{\partial x^b} \nonumber \\= & {} \xi ^b \nabla _b V^a - V^b \nabla _b \xi ^a , \end{aligned}$$
(3.43)

where we have dropped the “0” subscript and the last equality follows easily by noting \(\varGamma ^c_{a b} = \varGamma ^c_{b a}\).

The Lie derivative of a covector \(A_a\) is easily obtained by acting on the scalar \(A_a V^a\) for an arbitrary vector \(V^a\):

$$\begin{aligned} \mathcal{L}_\xi (A_a V^a)= & {} V^a \mathcal{L}_\xi A_a + A_a \mathcal{L}_\xi V^a \nonumber \\= & {} V^a \mathcal{L}_\xi A_a + A_a \left( \xi ^b \nabla _b V^a - V^b \nabla _b \xi ^a \right) . \end{aligned}$$
(3.44)

But, because \(A_a V^a\) is a scalar,

$$\begin{aligned} \mathcal{L}_\xi (A_a V^a)= & {} \xi ^b \nabla _b (A_a V^a) \nonumber \\= & {} \xi ^b \left( V^a \nabla _b A_a + A_a \nabla _b V^a \right) , \end{aligned}$$
(3.45)

and thus

$$\begin{aligned} V^a \left( \mathcal{L}_\xi A_a - \xi ^b \nabla _b A_a - A_b \nabla _a \xi ^b \right) = 0. \end{aligned}$$
(3.46)

Since \(V^a\) is arbitrary we have

$$\begin{aligned} \mathcal{L}_\xi A_a = \xi ^b \nabla _b A_a + A_b \nabla _a \xi ^b . \end{aligned}$$
(3.47)
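
Since the final expressions (3.43) and (3.47) involve nothing but partial derivatives, they are easy to evaluate explicitly. The sketch below (a Python/sympy illustration of our own, with an arbitrarily chosen dragging field \(\xi ^a\) and fields \(V^a\) and \(A_a\) in two flat dimensions) computes both Lie derivatives and checks the Leibniz rule on the scalar \(A_a V^a\) used in the derivation.

    import sympy as sp

    x, y = sp.symbols('x y')
    X = [x, y]

    xi = [y, -x]              # the dragging field (a rotation, chosen purely for illustration)
    V  = [x**2, x*y]          # an arbitrary vector field
    A  = [sp.sin(y), x*y]     # an arbitrary covector field

    # Eq. (3.43): (L_xi V)^a = xi^b d_b V^a - V^b d_b xi^a
    LV = [sum(xi[b]*sp.diff(V[a], X[b]) - V[b]*sp.diff(xi[a], X[b]) for b in range(2))
          for a in range(2)]

    # Eq. (3.47): (L_xi A)_a = xi^b d_b A_a + A_b d_a xi^b
    LA = [sum(xi[b]*sp.diff(A[a], X[b]) + A[b]*sp.diff(xi[b], X[a]) for b in range(2))
          for a in range(2)]

    # Leibniz check: L_xi (A_a V^a) must equal xi^b d_b (A_a V^a), since A_a V^a is a scalar
    scalar = sum(A[a]*V[a] for a in range(2))
    lhs = sum(LA[a]*V[a] + A[a]*LV[a] for a in range(2))
    rhs = sum(xi[b]*sp.diff(scalar, X[b]) for b in range(2))
    assert sp.simplify(lhs - rhs) == 0
    print(LV, LA)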

Equation (3.32) introduced the effect of parallel transport on vector components. By contrast, the Lie-dragging of a vector causes its components to change as

$$\begin{aligned} \delta V^a_\mathcal{L} = \mathcal{L}_\xi V^a \, \epsilon . \end{aligned}$$
(3.48)

We see that if \(\mathcal{L}_\xi V^a = 0\), then the components of the vector do not change as the vector is Lie-dragged. Suppose now that \(V^a\) represents a vector field and that there exists a corresponding congruence of curves with tangent given by \(\xi ^a\). If the components of the vector field do not change under Lie-dragging, we can show that this implies a symmetry, meaning that a coordinate system can be found such that the vector components do not depend on one of the coordinates. This is a potentially very powerful statement.

Let \(\xi ^a\) represent the tangent to the curves drawn out by, say, the \(a = \phi \) coordinate. Then we can write \(x^a(\lambda ) = \lambda \) which means

$$\begin{aligned} \xi ^a = \delta ^a{}_\phi . \end{aligned}$$
(3.49)

If the Lie derivative of \(V^a\) with respect to \(\xi ^b\) vanishes we find

$$\begin{aligned} \xi ^b \frac{\partial V^a}{\partial x^b} = V^b \frac{\partial \xi ^a}{\partial x^b} = 0 . \end{aligned}$$
(3.50)

Using this in Eq. (3.41) implies \(V^a_\epsilon = V^a_0\), that is to say, the vector field \(V^a(x^b)\) does not depend on the \(\phi \) coordinate. Generally speaking, every \(\xi ^a\) for which the Lie derivative of a vector (or higher rank tensor) vanishes represents a symmetry.

Let us take the spacetime metric \(g_{a b}\) as an example. A spacetime symmetry can be represented by a generating vector field \(\xi ^a\) such that

$$\begin{aligned} \mathcal{L}_\xi g_{a b} = \nabla _a \xi _b + \nabla _b \xi _a = 0 . \end{aligned}$$
(3.51)

This is known as Killing’s equation, and solutions to this equation are naturally referred to as Killing vectors. It is now fairly easy to demonstrate the claim that the existence of a Killing vector relates to an underlying symmetry of the spacetime metric. First we expand (3.51) to get

$$\begin{aligned} g_{bc} \partial _a \xi ^c + g_{ac} \partial _b \xi ^c + \xi ^d \partial _d g_{ab} = 0 . \end{aligned}$$
(3.52)

Then we assume that the Killing vector is associated with one of the coordinates, e.g., by letting \(\xi ^a = \delta _0^a\). The first two terms in (3.52) then vanish by definition, and we are left with

$$\begin{aligned} \xi ^d \partial _d g_{ab} = \partial _0 g_{ab} = 0 , \end{aligned}$$
(3.53)

demonstrating that the metric does not depend on the \(x^0\) coordinate.
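
The logic can also be run in the opposite direction: given a metric whose components are independent of one of the coordinates, the corresponding coordinate basis vector must satisfy Killing's equation. The sketch below (our illustration in Python/sympy; the Schwarzschild form of the metric, in units \(G=c=1\), is used purely as a familiar example of a stationary spacetime) verifies Eq. (3.51) for \(\xi ^a = \delta ^a{}_t\).

    import sympy as sp

    t, r, th, ph, M = sp.symbols('t r theta phi M', positive=True)
    X = [t, r, th, ph]
    f = 1 - 2*M/r

    # A metric independent of t: the Schwarzschild line element (signature -+++)
    g = sp.diag(-f, 1/f, r**2, r**2*sp.sin(th)**2)
    ginv = g.inv()

    def Gamma(a, b, c):   # Christoffel symbols, Eq. (3.35)
        return sp.Rational(1, 2)*sum(
            ginv[a, d]*(sp.diff(g[b, d], X[c]) + sp.diff(g[c, d], X[b]) - sp.diff(g[b, c], X[d]))
            for d in range(4))

    xi_up = [1, 0, 0, 0]                                                # xi^a = delta^a_t
    xi = [sum(g[a, b]*xi_up[b] for b in range(4)) for a in range(4)]    # xi_a = g_ab xi^b

    def nabla_xi(a, b):   # nabla_a xi_b, using the rule (3.30)
        return sp.diff(xi[b], X[a]) - sum(Gamma(c, b, a)*xi[c] for c in range(4))

    # Killing's equation (3.51): nabla_a xi_b + nabla_b xi_a = 0
    for a in range(4):
        for b in range(4):
            assert sp.simplify(nabla_xi(a, b) + nabla_xi(b, a)) == 0
    print("xi^a = delta^a_t satisfies Killing's equation for this metric")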

An important application of this idea is provided by stationary, axisymmetric, and asymptotically flat spacetimes—highly relevant in the present context as they capture the physics of rotating, equilibrium configurations. The associated geometries are fundamental for the relativistic astrophysics of spinning black holes and neutron stars. Stationary, axisymmetric, and asymptotically flat spacetimes are such that (Bonazzola et al. 1993)

  1. there exists a Killing vector \(t^a\) that is timelike at spatial infinity, and the independence of the metric on the associated time coordinate leads to the solution being stationary;

  2. there exists a Killing vector \(\phi ^a\) that vanishes on a timelike 2-surface—the axis of symmetry—is spacelike everywhere else, and whose orbits are closed curves; and

  3. asymptotic flatness means the scalar products \(t_a t^a\), \(\phi _a \phi ^a\), and \(t_a \phi ^a\) tend to, respectively, \(- 1\), \(+ \infty \), and 0 at spatial infinity.

3.4 Spacetime curvature

The main message of the previous two Sects. 3.2 and 3.3 is that one must have an a priori idea of how vectors and higher rank tensors are moved from point to point in spacetime. An immediate manifestation of the complexity associated with carrying tensors about in spacetime is that the covariant derivative does not commute. For a vector we find

$$\begin{aligned} \nabla _b \nabla _c V^a - \nabla _c \nabla _b V^a = R^a{}_{d b c} V^d , \end{aligned}$$
(3.54)

where \(R^a{}_{d b c}\) is the Riemann tensor. It is obtained from

$$\begin{aligned} R^a{}_{d b c} = \varGamma ^a_{d c, b} - \varGamma ^a_{d b, c} + \varGamma ^a_{e b} \varGamma ^e_{d c} - \varGamma ^a_{e c} \varGamma ^e_{d b} . \end{aligned}$$
(3.55)

Closely associated are the Ricci tensor \(R_{ab} = R_{ba}\) and scalar R that are defined by the contractions

$$\begin{aligned} R_{a b} = R^c{}_{a c b} , \qquad R = g^{a b} R_{a b} . \end{aligned}$$
(3.56)

We will also need the Einstein tensor, which is given by

$$\begin{aligned} G_{a b} = R_{a b} - \frac{1}{2} R g_{a b} . \end{aligned}$$
(3.57)

It is such that \(\nabla _b G^b{}_a\) vanishes identically. This is known as the (contracted) Bianchi identity.
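
These definitions translate directly into a short computer-algebra routine. The sketch below (our own Python/sympy illustration, using the two-sphere of radius a as the simplest curved example) implements Eqs. (3.55) and (3.56) and recovers the expected constant curvature scalar \(R = 2/a^2\); the reader may wish to repeat the exercise for other metrics.

    import sympy as sp

    th, ph, a = sp.symbols('theta phi a', positive=True)
    X = [th, ph]
    n = 2

    g = sp.diag(a**2, a**2*sp.sin(th)**2)    # metric on a sphere of radius a (example choice)
    ginv = g.inv()

    # Christoffel symbols, Eq. (3.35)
    Gam = [[[sp.simplify(sp.Rational(1, 2)*sum(
            ginv[i, m]*(sp.diff(g[j, m], X[k]) + sp.diff(g[k, m], X[j]) - sp.diff(g[j, k], X[m]))
            for m in range(n))) for k in range(n)] for j in range(n)] for i in range(n)]

    # Riemann tensor, Eq. (3.55)
    def Riem(i, d, b, c):
        return sp.simplify(sp.diff(Gam[i][d][c], X[b]) - sp.diff(Gam[i][d][b], X[c])
                           + sum(Gam[i][e][b]*Gam[e][d][c] - Gam[i][e][c]*Gam[e][d][b]
                                 for e in range(n)))

    # Ricci tensor and scalar, Eq. (3.56)
    Ric = sp.Matrix(n, n, lambda i, j: sum(Riem(c, i, c, j) for c in range(n)))
    Rscalar = sp.simplify(sum(ginv[i, j]*Ric[i, j] for i in range(n) for j in range(n)))

    print(sp.simplify(Ric))   # the result is diag(1, sin(theta)**2), i.e. Ric_ab = g_ab / a**2
    print(Rscalar)            # the result is 2/a**2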

A more intuitive understanding of the Riemann tensor is obtained by seeing how its presence leads to a path-dependence in the changes that a vector experiences as it moves from point to point in spacetime. Such a situation is known as a “non-integrability” condition, because the result depends on the whole path and not just the initial and final points. That is, it is not like a total derivative which can be integrated and depends on only the limits of integration. Geometrically we say that the spacetime is curved, which is why the Riemann tensor is also known as the curvature tensor.

To illustrate the meaning of the curvature tensor, let us suppose that we are given a surface that is parameterized by the two parameters \(\lambda \) and \(\eta \). Points that live on this surface will have coordinate labels \(x^a(\lambda ,\eta )\). We want to consider an infinitesimally small “parallelogram” whose four corners (moving counterclockwise with the first corner at the lower left) are given by \(x^a(\lambda ,\eta )\), \(x^a(\lambda ,\eta + \delta \eta )\), \(x^a(\lambda + \delta \lambda ,\eta + \delta \eta )\), and \(x^a(\lambda + \delta \lambda ,\eta )\). Generally speaking, any “movement” towards the right of the parallelogram is effected by varying \(\eta \), and any movement towards the top by varying \(\lambda \). The plan is to take a vector \(V^a(\lambda ,\eta )\) at the lower-left corner \(x^a(\lambda ,\eta )\), parallel transport it along a \(\lambda = \mathrm {const}\) curve to the lower-right corner at \(x^a(\lambda ,\eta + \delta \eta )\) where it will have the components \(V^a(\lambda ,\eta + \delta \eta )\), and end up by parallel transporting \(V^a\) at \(x^a(\lambda ,\eta + \delta \eta )\) along an \(\eta = \mathrm {const}\) curve to the upper-right corner at \(x^a(\lambda + \delta \lambda ,\eta + \delta \eta )\). We will call this path I and denote the final component values of the vector as \(V^a_\mathrm {I}\). We then repeat the process except that the path will go from the lower-left to the upper-left and then on to the upper-right corner. We will call this path II and denote the final component values as \(V^a_\mathrm {II}\).

Recalling Eq. (3.32) as the definition of parallel transport, we first of all have

$$\begin{aligned} V^a(\lambda ,\eta + \delta \eta ) \approx V^a(\lambda ,\eta ) + \delta _\eta V^a_\parallel (\lambda ,\eta ) = V^a(\lambda ,\eta ) - \varGamma ^a_{b c} V^b \delta _\eta x^c \end{aligned}$$
(3.58)

and

$$\begin{aligned} V^a(\lambda + \delta \lambda ,\eta ) \approx V^a(\lambda ,\eta ) + \delta _\lambda V^a_\parallel (\lambda ,\eta ) = V^a(\lambda ,\eta ) - \varGamma ^a_{b c} V^b \delta _\lambda x^c , \end{aligned}$$
(3.59)

where

$$\begin{aligned} \delta _\eta x^a \approx x^a(\lambda ,\eta + \delta \eta ) - x^a(\lambda ,\eta ) , \qquad \delta _\lambda x^a \approx x^a(\lambda + \delta \lambda ,\eta ) - x^a(\lambda ,\eta ) . \end{aligned}$$
(3.60)

Next, we need

$$\begin{aligned} V^a_\mathrm {I}\approx & {} V^a(\lambda ,\eta + \delta \eta ) + \delta _\lambda V^a_\parallel (\lambda ,\eta + \delta \eta ), \end{aligned}$$
(3.61)
$$\begin{aligned} V^a_\mathrm {II}\approx & {} V^a(\lambda + \delta \lambda ,\eta ) + \delta _\eta V^a_\parallel (\lambda + \delta \lambda ,\eta ). \end{aligned}$$
(3.62)

Working things out, we find that the difference between the two paths is

$$\begin{aligned} \varDelta V^a \equiv V^a_\mathrm {I} - V^a_\mathrm {II} = R^a{}_{d b c} V^d \delta _\lambda x^c \delta _\eta x^b , \end{aligned}$$
(3.63)

which follows because \(\delta _\lambda \delta _\eta x^a = \delta _\eta \delta _\lambda x^a\), i.e., we have closed the parallelogram.

3.5 The Einstein field equations

We now have the tools we need to outline the argument that leads to the field equations of General Relativity. This sketch will be complemented by a variational derivation in Sect. 4.4.

Consider two freely falling particles moving along neighbouring geodesics with a vector \(\xi ^a\) measuring the separation. Assuming that this vector is purely spatial according to the trajectory of one of the bodies, which we also take to measure time (such that the corresponding four-velocity only has a time-component), we have

$$\begin{aligned} u^a \xi _a = 0 . \end{aligned}$$
(3.64)

The second derivative of the separation vector will be affected by the spacetime curvature. With this set-up it follows that

$$\begin{aligned} u^a \nabla _a \xi ^b - \xi ^a \nabla _a u^b = 0 \end{aligned}$$
(3.65)

and we find that

$$\begin{aligned} u^c \nabla _c (u^b\nabla _b \xi ^a) = u^c \xi ^b ( \nabla _c \nabla _b - \nabla _b \nabla _c) u^a = - R^a_{\ d b c} u^d \xi ^b u^c , \end{aligned}$$
(3.66)

where we have used the fact that the Riemann tensor encodes the failure of second covariant derivatives to commute. This is the equation of geodesic deviation.

At this point it is useful to introduce a total time derivative, such that

$$\begin{aligned} {D \over D\tau } = u^a \nabla _a \end{aligned}$$
(3.67)

which means that (3.66) becomes

$$\begin{aligned} {D^2 \xi ^a \over D\tau ^2} = - R^a_{\ d b c} u^d \xi ^b u^c . \end{aligned}$$
(3.68)

This provides us with an expression for the relative acceleration caused by the spacetime curvature. As gravity is a tidal interaction, we can meaningfully compare our relation to the corresponding relation in Newtonian gravity. This leads to the identification

$$\begin{aligned} R^j_{\ 0k0} = {\mathcal {E}}^j_{\ k} = \delta ^{jl} \left( { \partial ^2 \varPhi \over \partial x^l \partial x^k} \right) , \end{aligned}$$
(3.69)

where \({\mathcal {E}}^j_{\ k}\) is the tidal tensor and \(\varPhi \) is the gravitational potential. This provides a constraint that the curved spacetime theory must satisfy (in the limit of weak gravity and low velocities).
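
The right-hand side of (3.69) is easy to evaluate for the textbook point-mass potential \(\varPhi = -GM/r\). The sketch below (our Python/sympy illustration) produces the familiar tidal pattern, stretching along the separation direction and squeezing transverse to it, and confirms that the tensor is trace-free away from the source (as required by Laplace's equation).

    import sympy as sp

    x, G, M = sp.symbols('x G M', positive=True)
    y, z = sp.symbols('y z', real=True)

    r = sp.sqrt(x**2 + y**2 + z**2)
    Phi = -G*M/r                   # point-mass potential (the textbook example)

    X = [x, y, z]
    E = sp.Matrix(3, 3, lambda j, k: sp.diff(Phi, X[j], X[k]))   # tidal tensor, Eq. (3.69)

    # Evaluate on the x-axis: stretching along the separation, squeezing transverse to it
    E_axis = sp.simplify(E.subs({y: 0, z: 0}))
    print(E_axis)                        # the result is diag(-2*G*M/x**3, G*M/x**3, G*M/x**3)
    print(sp.simplify(E_axis.trace()))   # the result is 0 (trace-free away from the source)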

After some deliberation, including a careful counting of the dynamical degrees of freedom (noting the freedom to introduce coordinates), one arrives at the field equations for General Relativity:

$$\begin{aligned} G_{a b} = {8 \pi G \over c^4} T_{a b} , \end{aligned}$$
(3.70)

where G is Newton’s constant and c is the speed of light.

At this point it is evident that any discussion of relativistic physics (involving matter) must include the energy-momentum-stress tensor, \(T_{a b}\). This is where the messy physics of reality enters the problem. Misner et al. (1973) refer to \(T_{a b}\) as “...a machine that contains a knowledge of the energy density, momentum density, and stress as measured by any and all observers at that event.” Encoding this is a severe challenge. However, we need to understand how this works—both phenomenologically (allowing us to move swiftly to the challenge of solving the equations) and from a detailed microphysics point of view (as required in order for our models to be realistic). We will develop this understanding step by step, starting with the simple perfect fluid model and proceeding towards more complex settings including distinct components exhibiting relative flows and dissipation. However, before we take the next step in this direction we need to introduce the main technical machinery that forms the basis for much of the discussion.

4 Variational analysis

The key geometric difference between generally covariant Newtonian fluids and their general relativistic counterparts is that the former have an a priori notion of time (Carter and Chamel 2004, 2005a, b). Newtonian fluids also have an a priori notion of space (cf. the discussion in Carter and Chamel 2004). Such a structure has clear advantages for evolution problems, where one needs to be unambiguous about the rate-of-change of a given system. However, once a problem requires, say, electromagnetism, then the a priori Newtonian time is at odds with the spacetime covariance of the electromagnetic fields (as the Lorentz invariance of Maxwell’s equations dictates that the problem is considered in—at least—Special Relativity). Fortunately, for spacetime covariant theories there is the so-called “3 + 1” formalism (see, for instance, Smarr and York 1978 and the discussion in Sect. 11) which allows one to define “rates-of-change” in an unambiguous manner, by introducing a family of spacelike hypersurfaces (the “3”) given as the level surfaces of a spacetime scalar (the “1”) associated with a timelike progression.

Something that Newtonian and relativistic fluids have in common is that there are preferred frames for measuring changes—those that are attached to the fluid elements. In the parlance of hydrodynamics, one refers to Lagrangian and Eulerian frames, or observers. In Newtonian theory, an Eulerian observer is one who sits at a fixed point in space, and watches fluid elements pass by, all the while taking measurements of their densities, velocities, etc. at the given location. In contrast, a Lagrangian observer rides along with a particular fluid element and records changes of that element as it moves through space and time. A relativistic Lagrangian observer is the same, but the relativistic Eulerian observer is more complicated to define (as we have to explain what we mean by a “fixed point” in space). One way to do this, see Smarr and York (1978), is to define such an observer as one who moves along a worldline that remains everywhere orthogonal to the family of spacelike hypersurfaces.

The existence of a preferred frame for a fluid system can be a great advantage. In Sect. 5.2 we will use an “off-the-shelf” approach that exploits a preferred frame to derive the standard perfect fluid equations. Later, we will use Eulerian and Lagrangian variations to build an action principle for both single and multiple fluid systems. In this problem the Lagrangian displacements play a central role, as they allow us to introduce the constraints that are required in order to arrive at the desired results. Moreover, these types of variations turn out to be useful for many applications, e.g., they can be used as the foundation for a linearized perturbation analysis of neutron stars (Kokkotas and Schmidt 1999). As we will see, the use of Lagrangian variations is essential for establishing instabilities in rotating fluids (Friedman and Schutz 1978a, b). However, it is worth noting already at this relatively early stage that systems with several distinct flows are more complex as they can have as many notions of Lagrangian observers as there are fluids in the system (Fig. 6).

4.1 A simple starting point: the point particle

The simplest physics problem, i.e. the motion of a point particle, serves as a guide to deep principles used in much harder problems. We have used it already to motivate parallel transport as the foundation for the covariant derivative. Let us call upon the point particle again to set the context for the action-based derivation of the fluid equations. We will simplify the discussion by considering only motion in one dimension—assuring the reader that we have good reasons for this, and asking for patience while we remind him/her of what may be very basic facts.

Early on in life (relatively!) we learn that an action appropriate for the point particle is

$$\begin{aligned} I = \int ^{t_f}_{t_i} T dt = \int ^{t_f}_{t_i} \left( \frac{1}{2} m \dot{x}^2\right) dt, \end{aligned}$$
(4.1)

where m is the mass and T the kinetic energy. A first-order variation of the action with respect to x(t) yields

$$\begin{aligned} \delta I = - \int ^{t_f}_{t_i} \left( m \ddot{x}\right) \delta x dt+ \left. \left( m \dot{x} \delta x\right) \right| ^{t_f}_{t_i} , \end{aligned}$$
(4.2)

see Fig. 7. If this is all the physics to be incorporated, i.e. if there are no forces acting on the particle, then we impose d’Alembert’s principle of least action, which states that the trajectories x(t) that make the action stationary, i.e. \(\delta I = 0\), yield the true motion. We then see that functions x(t) that satisfy the boundary conditions

$$\begin{aligned} \delta x(t_i) = 0 = \delta x(t_f) , \end{aligned}$$
(4.3)

and the equation of motion

$$\begin{aligned} m \ddot{x} = 0 , \end{aligned}$$
(4.4)

will indeed make \(\delta I = 0\). The same logic applies in the substantially more difficult variational problems that will be considered later.

Fig. 7

A simple illustration of the variation that leads to the point particle equations of motion. The solid line in this parameter space represents a curve which is understood to be a solution to the equations of motion, while the dashed line is some arbitrarily specified curve. At a given value of time, the variation \(\delta x\) represents the vertical displacement between the curves; obviously, at the endpoints \(t = t_1\) and \(t = t_2\), the two curves meet and the displacement vanishes. Keeping the endpoints fixed, the equations of motion are obtained from the extrema of the action, as demonstrated in the main text. The same idea applies in the more complicated cases of field theories that we consider later; the fields have actions, and the field equations of motion are obtained by locating the extrema. The field values at the extrema are often referred to as being “on shell” (or “on the mass shell”) for reasons we do not really have to elaborate on here


In general we need to account for forces acting on the particle. First on the list are the so-called conservative forces, describable by a potential V(x), which are placed into the action according to:

$$\begin{aligned} I = \int ^{t_f}_{t_i} L(x,\dot{x}) dt = \int ^{t_f}_{t_i} \left[ \frac{1}{2} m \dot{x}^2 - V(x)\right] dt , \end{aligned}$$
(4.5)

where \(L = T - V\) is known as the Lagrangian. The variation now leads to

$$\begin{aligned} \delta I = - \int ^{t_f}_{t_i} \left( m \ddot{x} + \frac{\partial V}{\partial x}\right) \delta x dt + \left. \left( m \dot{x} \delta x\right) \right| ^{t_f}_{t_i} . \end{aligned}$$
(4.6)

Assuming no externally applied forces, d’Alembert’s principle yields the equation of motion

$$\begin{aligned} m \ddot{x} + \frac{\partial V}{\partial x} = 0 . \end{aligned}$$
(4.7)

An alternative way to write this is to introduce the momentum p (not to be confused with the fluid pressure introduced earlier) defined as

$$\begin{aligned} p = \frac{\partial L}{\partial \dot{x}} = m \dot{x} , \end{aligned}$$
(4.8)

in which case

$$\begin{aligned} \dot{p} + \frac{\partial V}{\partial x} = 0 . \end{aligned}$$
(4.9)
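
Computer algebra can reproduce this variational step mechanically. The sketch below (our own example using Python/sympy and its euler_equations helper; the harmonic potential \(V = kx^2/2\) is just an illustrative choice) recovers the equation of motion (4.7) directly from the Lagrangian (4.5).

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    t, m, k = sp.symbols('t m k', positive=True)
    x = sp.Function('x')

    # Lagrangian L = T - V of Eq. (4.5), with the illustrative choice V = k x**2 / 2
    L = sp.Rational(1, 2)*m*sp.diff(x(t), t)**2 - sp.Rational(1, 2)*k*x(t)**2

    # Stationarity of the action yields the Euler-Lagrange equation, cf. Eq. (4.7)
    eqs = euler_equations(L, x(t), t)
    print(eqs)   # equivalent to m*xddot + dV/dx = 0, i.e. m*xddot + k*x = 0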

In the most honest applications, one has the obligation to incorporate dissipative, i.e., non-conservative, forces. Unfortunately, dissipative forces \(F_d\) cannot be put into action principles (at least not directly, see the discussion in Sect. 16 where we discuss recent progress towards dissipative variational models). Fortunately, Newton’s second law is great guidance, since it states

$$\begin{aligned} m \ddot{x} + \frac{\partial V}{\partial x} = F_d , \end{aligned}$$
(4.10)

when both conservative and dissipative forces act. A crucial observation of Eq. (4.10) is that the “kinetic” (\(m \ddot{x} = \dot{p}\)) and conservative (\(\partial V/\partial x\)) forces, which enter the left-hand side, still follow from the action, i.e.,

$$\begin{aligned} \frac{\delta I}{\delta x} = - \left( m \ddot{x} + \frac{\partial V}{\partial x}\right) , \end{aligned}$$
(4.11)

where we have introduced the “variational derivative” \({\delta I}/{\delta x}\). When there are no dissipative forces acting, the action principle gives us the appropriate equation of motion. When there are dissipative forces, the action defines the kinetic and conservative force terms that are to be balanced by the dissipative contribution. It also defines the momentum. These are the key lessons from this toy-problem.

We should emphasize that this way of using the action to define the kinetic and conservative pieces of the equation of motion, as well as the momentum, can also be used in situations when a system experiences an externally applied force \(F_\mathrm {ext}\). The force can be conservative or dissipative (see, e.g., Galley 2013), and will enter the equation of motion in the same way as \(F_d\) did above. That is

$$\begin{aligned} - \frac{\delta I}{\delta x} = F_d + F_\mathrm {ext} . \end{aligned}$$
(4.12)

Like a dissipative force, the main effect of the external force can be to siphon kinetic energy from the system. Of course, whether a force is considered to be external or not depends on the a priori definition of the system.

4.2 More general Lagrangians

Returning to the discussion of the variational approach for obtaining the dynamical equations that govern a given system, let us consider a generalized version of the problem. Basically, we want to extend the idea to the case of a field theory in spacetime. To do this, we assume that the system is described by a set of fields \(\varPhi ^A\) defined on spacetime, i.e., depending on the coordinates \(x^a\). At this level, we can keep the discussion abstract and consider any number of fields, labelled by A. This set can (in principle) contain any number of scalar, vector or tensor fields. If we are interested in models containing vector fields, then the label A runs over all four components of each of the relevant fields. In that situation, the label A essentially becomes a spacetime index, like a. Tensor fields are treated in a similar way. As an example, discussed in more detail later, consider electromagnetism, for which the set of fields would be the vector potential \(A^a\) and the spacetime metric \(g_{a b}\), so that we have \(\varPhi ^A = \{A^a, g_{a b}\}\).

The action for the system should now take the form of an integral of a Lagrangian (density) \(\mathcal {L}\), which depends on the fields \(\varPhi ^A\) and their various derivatives (as “appropriate”). Integrating over a spacetime region R we would have

$$\begin{aligned} I= \int _R \mathcal {L} \left( \varPhi ^A, \partial _a \varPhi ^A, \partial _a \partial _b \varPhi ^A, \ldots \right) d^4 x \end{aligned}$$
(4.13)

Since we expect the theory to be covariant, we need the action to transform as a scalar under a general coordinate transformation. To ensure this, we need to involve the invariant volume element \(\sqrt{-g}d^4x\), where g is the determinant of the metric, as before. Defining the scalar Lagrangian L we then have

$$\begin{aligned} I= \int _R L \sqrt{-g}\ d^4 x \end{aligned}$$
(4.14)

(which is a scalar by construction).


As in the case of a point particle, we can derive the field equations by demanding that the action is stationary under variations in the fields. Letting

$$\begin{aligned} \varPhi ^A \rightarrow \varPhi ^A + \delta \varPhi ^A \end{aligned}$$
(4.15)

and assuming, for simplicity, that the theory is “local” (meaning that only first derivatives of the fields appear in the action) we need also

$$\begin{aligned} \partial _a \varPhi ^A \rightarrow \partial _a \varPhi ^A + \partial _a \left( \delta \varPhi ^A\right) = \partial _a \varPhi ^A + \delta \left( \partial _a \varPhi ^A\right) \end{aligned}$$
(4.16)

Given these relations, the variation in the action is \(I+\delta I\), where

$$\begin{aligned} \delta I = \int _R \delta \mathcal {L} d^4 x = \int _R \left[ {\partial \mathcal {L} \over \partial \varPhi ^A} \delta \varPhi ^A + {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \left( \partial _a \varPhi ^A\right) \right] d^4x \end{aligned}$$
(4.17)

To make progress we need to factor out \(\delta \varPhi ^A\) from the second term in the integrand. This is achieved by integrating by parts;

$$\begin{aligned}&\int _R {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \left( \partial _a \varPhi ^A\right) \ d^4x \nonumber \\&\quad = \int _R \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \delta \varPhi ^A \right] d^4x - \int _R \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] \delta \varPhi ^A \ d^4x \end{aligned}$$
(4.18)

At this point we make use of the fact that the first term is a total derivative, which can be turned into an integral over the bounding surface (in the usual way). Inspired by the boundary conditions imposed on the variations in the point-particle case, we then restrict ourselves to variations \(\delta \varPhi ^A\) that vanish on the boundary. Thus, we can neglect the first integral (later referred to as the “surface terms”), ending up with

$$\begin{aligned} \delta I = \int _R \left\{ {\partial \mathcal {L} \over \partial \varPhi ^A}- \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] \right\} \delta \varPhi ^A \ d^4x \end{aligned}$$
(4.19)

Demanding that \(\delta I=0\) we see that the variational derivative satisfies

$$\begin{aligned} {\delta \mathcal {L} \over \delta \varPhi ^A } = {\partial \mathcal {L} \over \partial \varPhi ^A}- \partial _a \left[ {\partial \mathcal {L} \over \partial \left( \partial _a \varPhi ^A\right) } \right] = 0 . \end{aligned}$$
(4.20)

These are the Euler-Lagrange equations that govern the evolution of the fields \(\varPhi ^A\).

So far, we have developed the theory for the Lagrangian density \(\mathcal {L}\), rather than the Lagrangian L itself. This is not a problem: we can simply consider the components of the metric as belonging to the set of fields that we vary. However, the added complication (due to the presence of \(\sqrt{-g}\) and the derivatives that need to be evaluated) may be unnecessary in many cases. In such situations one can often express the Lagrangian in terms of the covariant derivative \(\nabla _a\) instead of the partial \(\partial _a\). Essentially, this involves reworking the algebra taking as starting point an action of the form

$$\begin{aligned} I = \int _R L \left( \varPhi ^A, \nabla _a \varPhi ^A, \ldots , g_{a b}, \partial _c g_{a b}, \ldots \right) \sqrt{-g}\ d^4x \end{aligned}$$
(4.21)

where the fields \(\varPhi ^A\) are now independent of the metric, although the Lagrangian may still contain \(g_{a b}\) in contractions of spacetime indices to construct the required scalar. After some algebra, we find that

$$\begin{aligned} {\delta L \over \delta \varPhi ^A } = {\partial L \over \partial \varPhi ^A} - \nabla _a \left[ {\partial L \over \partial \left( \nabla _a \varPhi ^A\right) } \right] = 0 \end{aligned}$$
(4.22)

This is the form of the Euler-Lagrange equations that we will be using in the following.

4.3 Electromagnetism

As a first “explicit” example of the variational approach, let us derive the field equations for electromagnetism (Hobson et al. 2006). In this case, the starting point is the electromagnetic vector potential \(A^a\), which (in turn) leads to the Faraday tensor

$$\begin{aligned} F_{ab}=\nabla _a A_b - \nabla _b A_a \end{aligned}$$
(4.23)

Because of the anti-symmetry, this object has 6 components which can (as we will see later) be associated with the electric and magnetic fields, leading to a (presumably) more familiar picture. However, these fields are manifestly observer dependent (a moving charge leads to a magnetic field etc.) so, from a formal point of view, it is better to develop the theory in terms of \(F_{ab}\). Making contact with the previous discussion and the variational approach, the fields \(\varPhi ^A\) to be varied will be the four components of \(A^a\). The first step of the derivation is to construct a suitable scalar Lagrangian from \(A^a\) and its first derivatives. However, already at this point we run into “trouble”. We know that the theory is gauge-invariant, since we can add \(\nabla _a \psi = \partial _a \psi \) (where \(\psi \) is an arbitrary scalar) to the vector potential without altering the physics (read: \(F_{ab}\)). The upshot of this is that we need to ensure that the electromagnetic action is invariant under the transformation

$$\begin{aligned} A_a \rightarrow A_a + \nabla _a \psi \end{aligned}$$
(4.24)

This constrains the permissible Lagrangians. For example, we cannot use the contraction \(A^a A_a=g_{ab}A^a A^b\) since this combination is not gauge invariant. However, it is easy to see that \(F_{ab}\) exhibits the required invariance, so we can use it as our main building block. The obvious thing to do would be to try to use the scalar \(F_{a b} F^{a b}\) to build the Lagrangian. However, this would not account for the fact that the charge current \(j^a\) acts as source of the electromagnetic field. To reflect this, we add an “interaction term” \(j^a A_a\) to the Lagrangian (leaving the details of this for later). At the end of the day, the Lagrangian takes the form

$$\begin{aligned} L= - {1 \over 4 \mu _0} F_{a b} F^{a b} + j^a A_a \end{aligned}$$
(4.25)

where \(\mu _0\) is a constant (describing the strength of the coupling).

At this point, we realize that the current term is not gauge-invariant. It would transform as

$$\begin{aligned} j^a A_a \rightarrow j^a A_a + j^a \nabla _a \psi = j^a A_a + \nabla _a \left( \psi j^a\right) - \psi \left( \nabla _a j^a\right) \end{aligned}$$
(4.26)

We already know that the second term contributes a surface term to the action integral, and hence can be “ignored”. The third term is different. In order to ensure that the action is gauge-invariant, we must demand that the current is conserved, i.e.

$$\begin{aligned} \nabla _a j^a = 0 . \end{aligned}$$
(4.27)

The field equations that we derive require this constraint to be satisfied. Later, when we consider the fluid problem, we will see that the conservation of the matter flux plays a similar role.

Having established an invariant scalar Lagrangian, we determine the Euler-Lagrange equations by varying the fields \(A_a\) (keeping the source \(j^a\) fixed). From (4.22) we then have

$$\begin{aligned} {\partial L \over \partial A_a} - \nabla _b \left[ {\partial L \over \partial \left( \partial _b A_a\right) } \right] = 0 . \end{aligned}$$
(4.28)

From the stated form of the action (and recalling the discussion of the point particle) we see that

$$\begin{aligned} {\partial L \over \partial A_a} = j^a \end{aligned}$$
(4.29)

The second term is messier, but after a bit of work we arrive at

$$\begin{aligned} {\partial L \over \partial \left( \partial _b A_a\right) } = - {1 \over \mu _0} F^{a b} \end{aligned}$$
(4.30)

which leads to the final field equation

$$\begin{aligned} \nabla _b F^{a b} = \mu _0 j^a . \end{aligned}$$
(4.31)

The relativistic Maxwell equations are completed by

$$\begin{aligned} \nabla _{[c} F_{a b]} = 0 \quad \Longrightarrow \quad \nabla _c F_{a b} + \nabla _b F_{c a} + \nabla _a F_{b c} = 0 \end{aligned}$$
(4.32)

which is automatically satisfied when \(F_{a b}\) is obtained from a vector potential according to (4.23).
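
Both statements, the gauge invariance of \(F_{ab}\) and the automatic nature of (4.32), reduce to identities for partial derivatives in flat spacetime and are easily confirmed symbolically. The sketch below (our Python/sympy illustration, with a generic four-potential and gauge scalar) checks them.

    import sympy as sp

    t, x, y, z = sp.symbols('t x y z')
    X = [t, x, y, z]

    A   = [sp.Function(f'A{i}')(*X) for i in range(4)]   # a generic four-potential A_a
    psi = sp.Function('psi')(*X)                         # an arbitrary gauge scalar

    def faraday(pot):
        # F_ab = d_a A_b - d_b A_a; for this antisymmetrised combination the
        # covariant and partial derivatives agree, cf. Eq. (4.23)
        return [[sp.diff(pot[b], X[a]) - sp.diff(pot[a], X[b]) for b in range(4)]
                for a in range(4)]

    F  = faraday(A)
    Fg = faraday([A[a] + sp.diff(psi, X[a]) for a in range(4)])   # gauge-transformed potential

    # Gauge invariance, Eq. (4.24): F_ab is unchanged by A_a -> A_a + d_a psi
    assert all(sp.simplify(F[a][b] - Fg[a][b]) == 0 for a in range(4) for b in range(4))

    # The cyclic identity (4.32) holds automatically for an F_ab built from a potential
    assert all(sp.simplify(sp.diff(F[a][b], X[c]) + sp.diff(F[c][a], X[b]) + sp.diff(F[b][c], X[a])) == 0
               for a in range(4) for b in range(4) for c in range(4))
    print("gauge invariance and the cyclic identity both check out")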


4.4 The Einstein field equations

Having discussed the underlying principles and considered the explicit example of electromagnetism, we have reached the level of confidence required to derive the field equations of General Relativity. We know that the metric \(g_{a b}\) is the central object of the theory (essentially, because we are looking for a theory where the geometry plays a key role). To build the Lagrangian we therefore want to construct a simple (for elegance) scalar from the metric and its derivatives. The simplest object we can think of is the Ricci scalar, R. This is, in fact, the only scalar that contains only the metric and its first two derivatives. Moreover, it is natural that the Lagrangian involves a quantity which is directly linked to the spacetime curvature, and the Ricci scalar fits this bill, as well.

This argument leads to the celebrated Einstein–Hilbert action

$$\begin{aligned} I_\mathrm {EH} = \int _R R \sqrt{-g}\ d^4x . \end{aligned}$$
(4.33)

In this case, where the Lagrangian depends on the metric, it is natural to work directly with the density \(\mathcal {L} = R \sqrt{-g}\). From (4.20) we then see that

$$\begin{aligned} {\partial \mathcal {L}\over \partial g_{a b} } - \partial _c \left[ { \partial \mathcal {L} \over \partial \left( \partial _c g_{a b}\right) }\right] +\partial _d \partial _c \left[ { \partial \mathcal {L} \over \partial \left( \partial _d \partial _c g_{a b}\right) }\right] = 0 , \end{aligned}$$
(4.34)

where we have allowed for the fact that the Lagrangian also depends on the second derivatives of the metric (the extension of the analysis to allow for this is straightforward). Having a go at evaluating the required derivatives, we soon appreciate that this task is formidable. Luckily, there is an easier way to arrive at the answer.

Let us consider the variation in the action that results from a metric variation \(g_{a b} \rightarrow g_{a b} + \delta g_{a b}\). Carrying out this analysis we need the variation of the inverse metric, which follows readily from that of the metric itself:

$$\begin{aligned} g^{a b} g_{b c} = \delta ^a_c \qquad \Longrightarrow \qquad \delta g^{a b} = - g^{a c} g^{b d} \delta g_{c d} . \end{aligned}$$
(4.35)

Making use of the fact that \(R = g^{a b} R_{a b}\), we then have

$$\begin{aligned} \delta I_\mathrm {EH} = \int _R \left[ \delta g^{a b} R_{a b} + g^{a b} \delta R_{a b} \right] \sqrt{-g}\ d^4x + \int _R g^{a b} R_{a b} \delta \sqrt{-g}\ d^4x . \end{aligned}$$
(4.36)

Since the metric is the fundamental variable, we need to factor out \(\delta g^{ab}\) (somehow). The terms in the second integral are easiest to deal with. Given that g is the determinant of the metric, the expression we need follows from (A.11). That is, we have

$$\begin{aligned} \delta \sqrt{-g} = -{1 \over 2} \sqrt{-g}\ g_{a b} \delta g^{a b} . \end{aligned}$$
(4.37)

Turning to the second term in the first bracket of (4.36), the easiest way to progress is to consider the variation of the Riemann tensor and then construct the expression for the Ricci tensor by contraction. Moreover, noting that the Riemann tensor variation is expressed in terms of variations of the connection, \(\delta \varGamma ^c_{\ a b}\), which is a tensor, we can simplify the analysis by working in a local inertial frame (where \(\varGamma ^c_{\ a b}=0\)). Thus, we have

$$\begin{aligned} \delta R^d_{\ a b c} = \nabla _b \left( \delta \varGamma ^d_{\ a c}\right) - \nabla _c \left( \delta \varGamma ^d_{\ a b}\right) . \end{aligned}$$
(4.38)

As this is also a tensor expression it is valid in any coordinate system. Carrying out the required contraction, we find that

$$\begin{aligned} \delta R_{a b} = \nabla _b \left( \delta \varGamma ^c_{\ a c}\right) - \nabla _c \left( \delta \varGamma ^c_{\ a b}\right) . \end{aligned}$$
(4.39)

Using this expression we see that

$$\begin{aligned} g^{a b} \delta R_{a b} = \nabla _b \left( g^{a b} \delta \varGamma ^{c}_{\ a c} - g^{a c} \delta \varGamma ^b_{\ a c} \right) . \end{aligned}$$
(4.40)

In other words, the term that we need in (4.36) can be written as a total derivative. Given that this leads to a surface term, we duly neglect it and arrive at the final result:

$$\begin{aligned} \delta I_\mathrm {EH} = \int _R \left( R_{a b} -{1 \over 2} g_{a b} R \right) \delta g^{a b} \sqrt{-g} \ d^4x . \end{aligned}$$
(4.41)

The vanishing of the variation leads to the vacuum Einstein equations

$$\begin{aligned} G_{a b} = R_{a b} - {1 \over 2} g_{a b} R = 0 . \end{aligned}$$
(4.42)

The derivation highlights the fact that Einstein’s theory is one of the most elegant constructions of modern physics.
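
As a final check on (4.42), one can feed a candidate metric through the machinery of Sect. 3.4 and see whether the vacuum equations are satisfied. The sketch below (our Python/sympy illustration; the Schwarzschild line element, in units \(G=c=1\), is simply the standard example) confirms that the Ricci tensor, and hence the Einstein tensor, vanishes.

    import sympy as sp

    t, r, th, ph, M = sp.symbols('t r theta phi M', positive=True)
    X = [t, r, th, ph]
    n = 4
    f = 1 - 2*M/r

    # Candidate vacuum solution: the Schwarzschild metric (signature -+++, G = c = 1)
    g = sp.diag(-f, 1/f, r**2, r**2*sp.sin(th)**2)
    ginv = g.inv()

    # Christoffel symbols, Eq. (3.35)
    Gam = [[[sp.simplify(sp.Rational(1, 2)*sum(
            ginv[a, d]*(sp.diff(g[b, d], X[c]) + sp.diff(g[c, d], X[b]) - sp.diff(g[b, c], X[d]))
            for d in range(n))) for c in range(n)] for b in range(n)] for a in range(n)]

    def Riem(a, d, b, c):   # Riemann tensor, Eq. (3.55)
        return (sp.diff(Gam[a][d][c], X[b]) - sp.diff(Gam[a][d][b], X[c])
                + sum(Gam[a][e][b]*Gam[e][d][c] - Gam[a][e][c]*Gam[e][d][b] for e in range(n)))

    # Ricci tensor, Eq. (3.56); it vanishes, so G_ab = 0 and the vacuum equations hold
    Ric = sp.Matrix(n, n, lambda a, b: sp.simplify(sum(Riem(c, a, c, b) for c in range(n))))
    assert Ric == sp.zeros(n, n)
    print("the Schwarzschild metric solves the vacuum Einstein equations")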

4.5 The stress-energy tensor as obtained from the action principle

However aesthetically pleasing the theory may be, our main interest here is not in the vacuum dynamics of Einstein’s theory. Rather, we want to explore the matter sector. In Einstein’s Universe, matter plays a dual role—it (actively) provides the origin of the spacetime curvature and the gravitational field and (perhaps not quite passively) adjusts its motion according to this curvature.

In particular, we want to explore systems of astrophysical relevance for which general relativistic aspects are crucial. Inevitably, this involves some rather complex physics. However, the coupling to the spacetime curvature remains relatively straightforward as it is encoded in a single object; the stress-energy tensor \(T_{a b}\). This object is as important for General Relativity as the Einstein tensor \(G_{a b}\) in that it enters the Einstein equations in as direct a way as possible, i.e. (in geometric units)

$$\begin{aligned} G_{a b} = 8 \pi T_{a b} . \end{aligned}$$
(4.43)

From a conceptual point-of-view it is relatively easy to incorporate matter in the variational derivation from the previous section. Essentially, we add a matter component such that (cf. the argument for electromagnetism)

$$\begin{aligned} I = I_\mathrm {EH} + I_\mathrm {M} = \int _R \left( {1\over 2\kappa } R + L \right) \sqrt{-g}\ d^4x \end{aligned}$$
(4.44)

where \(\kappa = 8\pi G/c^4\) is a coupling constant fixed by Newtonian correspondence in the weak-field limit. Given the results for the vacuum gravity problem, it is easy to see that the matter contribution to the field equations follows from the variation of the matter action with respect to the metric. This insight will be very important later. In essence, the Einstein equations take the form

$$\begin{aligned} G_{a b} = \kappa T_{a b} \end{aligned}$$
(4.45)

provided that

$$\begin{aligned} T_{a b} = - \frac{2}{\sqrt{- g}} { \delta \mathcal {L}_\mathrm {M} \over \delta g^{a b}}= - \frac{2}{\sqrt{- g}} { \delta \left( \sqrt{- g} L \right) \over \delta g^{a b}} , \end{aligned}$$
(4.46)

or, equivalently,

$$\begin{aligned} T^{a b}= \frac{2}{\sqrt{- g}} {\delta \left( \sqrt{- g} L \right) \over \delta g_{a b}} . \end{aligned}$$
(4.47)

Applying this result to the case of electromagnetism and (4.25), we see that the relevant stress-energy tensor takes the form

$$\begin{aligned} T_{a b}^\mathrm {EM} = - {1\over \mu _0} \left[ g^{c d}F_{a c}F_{b d}-{1\over 4}g_{a b} \left( F_{c d}F^{c d}\right) \right] . \end{aligned}$$
(4.48)
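
A characteristic property of (4.48) is that it is trace-free, \(g^{ab}T^\mathrm {EM}_{ab} = 0\), reflecting the absence of a mass scale for the electromagnetic field. A quick numerical sanity check (our illustration using Python/numpy, with the Minkowski metric and a randomly generated antisymmetric \(F_{ab}\)) is sketched below.

    import numpy as np

    rng = np.random.default_rng(42)
    eta = np.diag([-1.0, 1.0, 1.0, 1.0])        # Minkowski metric, signature (-+++)
    eta_inv = np.linalg.inv(eta)
    mu0 = 1.0

    # A random antisymmetric field tensor F_ab (a stand-in for any electromagnetic field)
    A = rng.normal(size=(4, 4))
    F = A - A.T

    # Eq. (4.48): T_ab = -(1/mu0) [ g^{cd} F_ac F_bd - (1/4) g_ab F_cd F^{cd} ]
    F_up = eta_inv @ F @ eta_inv                 # F^{cd}
    F2 = np.einsum('cd,cd->', F, F_up)           # F_cd F^{cd}
    T = -(1.0/mu0)*(np.einsum('cd,ac,bd->ab', eta_inv, F, F) - 0.25*eta*F2)

    trace = np.einsum('ab,ab->', eta_inv, T)     # g^{ab} T_ab
    print(abs(trace) < 1e-12)                    # True: the stress-energy tensor is trace-free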

5 Case study: single fluids

Without an a priori, physics-based specification for \(T_{a b}\), solutions to the Einstein equations are void of physical content, a point which has been emphasized, for instance, by Geroch and Horowitz (in Hawking and Israel 1979). Unfortunately, the following algorithm for producing “solutions” has been much abused: (i) specify the form of the metric, typically by imposing some type of symmetry (or symmetries), (ii) work out the components of \(G_{a b}\) based on this metric, (iii) define the energy density to be \(G_{0 0}\) and the pressure to be \(G_{1 1}\), say, and thereby “solve” those two equations, and (iv) based on the “solutions” for the energy density and pressure solve the remaining Einstein equations. The problem is that this algorithm is little more than a mathematical parlour game. It is only by sheer luck that it will generate a physically relevant solution for a non-vacuum spacetime. As such, the strategy is antithetical to the raison d’être of, say, gravitational-wave astrophysics, which is to use observed data as a probe of the microphysics, say, in the cores of neutron stars. Much effort is currently going into taking given microphysics and combining it with the Einstein equations to model gravitational-wave emission from astrophysical scenarios, like binary neutron star mergers (Baiotti and Rezzolla 2017). To achieve this aim, we need an appreciation of the stress-energy tensor and how it encodes the physics.

5.1 General stress decomposition

Readers familiar with Newtonian fluids will be aware of the roles that the internal energy (recall the discussion in Sect. 2), the particle flux, and the stress tensor play in the fluid equations. In special relativity we learn that, in order to have spacetime covariant theories (e.g., well-behaved with respect to the Lorentz transformation) energy and momentum must be combined into a spacetime vector, whose zeroth component is the energy while the spatial components give the momentum (as measured by a given observer). The fluid stress must also be incorporated into a spacetime object, hence the necessity for \(T_{a b}\). Because the Einstein tensor’s covariant divergence vanishes identically, we must have

$$\begin{aligned} \nabla _b T^b{}_a = 0 . \end{aligned}$$
(5.1)

This provides us with four equations, often interpreted as the equations for relativistic fluid dynamics. As we will soon see, this interpretation makes “sense” (as the equations we arrive at reduce to the familiar Newtonian ones in the appropriate limit). However, from a formal point of view the argument is somewhat misleading. It leaves us with the impression that the job is done, but this is not (quite) the case. Sure, we are able to speedily write down the equations for a perfect fluid. But, we still have work to do if we want to consider more complex settings (e.g., including relative flows). This requires additional assumptions or a different approach altogether. One of the main aims with this review is to develop such an alternative and explore the results in a variety of settings. Having done this, we will see that (5.1) follows automatically once the “fluid equations” are satisfied. This may seem like splitting hairs at the moment, but the point we are trying to make should become clear as we progress.

The fact that we advocate a different strategy does not mean that the importance of the stress-energy tensor is (somehow) reduced. Not at all. We still need \(T_{ab}\) to provide the matter input for the Einstein equations and we may opt to use (5.1) to get (some of) the dynamical equations we need. Given this, it is important to understand the physical meaning of the components of \(T_{a b}\). In order to do this, we need to introduce a suitable observer (someone has to measure energy etc. for us). This then allows us to express the tensor components in terms of projections into the timelike and spacelike directions associated with this observer, in essence providing a fibration of spacetime as illustrated in Fig. 3.

In order to project a tensor along an observer’s timelike direction we contract that index with the observer’s four-velocity, \(U^a\). The required projection of a tensor into spacelike directions perpendicular to the timelike direction defined by \(U^a\) is effected via the operator \(\perp ^a_b\), defined as

$$\begin{aligned} \perp ^a_b = \delta ^a{}_b + U^a U_b , \qquad U^a U_a = - 1 \quad \Longrightarrow \quad \perp ^a_b U^b = 0 \end{aligned}$$
(5.2)

Any tensor index that has been “hit” with the projection operator will be perpendicular to the timelike direction defined (locally) by \(U^a\). It is then easy to see that any vector can be expressed in terms of its component along a given \(U^a\) and components orthogonal (in the spacetime sense) to it. That is, we have

$$\begin{aligned} V^a = \delta ^a_b V^b + \underbrace{(U^a U_b V^b - U^a U_b V^b)}_{=0} = -(U_bV^b) U^a + \perp ^a_b V^b \end{aligned}$$
(5.3)

The two projections (of a vector \(V^a\) for an observer with unit four-velocity \(U^a\)) are illustrated in Fig. 8. More general tensors are projected by acting with \(U^a\) or \(\perp ^a_b\) on each index separately (i.e., multi-linearly).

Fig. 8

The projections of a vector \(V^a\) onto the worldline defined by \(U^a\) (providing a fibration of spacetime) and into the perpendicular hypersurface (obtained from a projection with \(\perp ^a_b\))

Let us now see how we can use the projection to give physical “meaning” to the components of the stress-energy tensor. The energy density \(\varepsilon \) as perceived by the observer is (see Eckart 1940 for one of the earliest discussions)

$$\begin{aligned} \varepsilon = U^a U^b T_{a b} , \end{aligned}$$
(5.4)

while

$$\begin{aligned} \mathcal{P}_a = - \perp ^b_a U^c T_{b c} \end{aligned}$$
(5.5)

is the spatial momentum density (as it does not have a contribution along \(U^a\) it is a three vector), and the spatial stresses are encoded in

$$\begin{aligned} \mathcal{S}_{a b} = \perp ^c_a \perp ^d_b T_{c d} . \end{aligned}$$
(5.6)

As usual, the manifestly spatial component \(\mathcal {S}_{i j}\) is understood to be the ith-component of the force across a unit area perpendicular to the jth-direction. With respect to the observer, the stress-energy tensor can now be written (in complete generality) as

$$\begin{aligned} T_{a b} = \varepsilon \, U_a U_b + 2 U_{(a} \mathcal{P}_{b)} + \mathcal{S}_{a b}, \end{aligned}$$
(5.7)

where \(2 U_{(a} \mathcal{P}_{b)} \equiv U_a \mathcal{P}_b + U_b \mathcal{P}_a\). Because \(U^a \mathcal{P}_a = 0\), we see that the trace \(T = T^a{}_a\) is

$$\begin{aligned} T = \mathcal{S} - \varepsilon , \end{aligned}$$
(5.8)

where \(\mathcal{S} = \mathcal{S}^a{}_a\).

It is important at this stage to appreciate that we are discussing a mathematical construction. We need to take further steps to connect the phenomenology to the underlying physics.
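As a concrete illustration of the decomposition above, the following short numerical sketch (all quantities are randomly chosen and purely illustrative, not taken from the text) projects an arbitrary symmetric \(T_{ab}\) with respect to a boosted observer and confirms that (5.4)–(5.7) reconstruct the original tensor and that the trace obeys (5.8):

```python
import numpy as np

# Minimal numerical sketch (random values, illustrative only): project a
# generic symmetric T_ab with respect to a boosted observer, as in (5.4)-(5.6),
# and confirm the decomposition (5.7) and the trace relation (5.8).
rng = np.random.default_rng(1)
g = np.diag([-1.0, 1.0, 1.0, 1.0])          # flat metric g_ab
g_inv = np.linalg.inv(g)

A = rng.normal(size=(4, 4))
T = 0.5 * (A + A.T)                          # generic symmetric T_ab

v = rng.normal(size=3)
v *= 0.4 / np.linalg.norm(v)                 # three-velocity with |v| = 0.4
gamma = 1.0 / np.sqrt(1.0 - v @ v)
U_up = gamma * np.array([1.0, *v])           # observer four-velocity U^a
U_dn = g @ U_up                              # U_a, so that U^a U_a = -1

perp = np.eye(4) + np.outer(U_up, U_dn)      # projector perp^a_b, Eq. (5.2)

eps = np.einsum('a,b,ab->', U_up, U_up, T)               # energy, Eq. (5.4)
P = -np.einsum('ba,c,bc->a', perp, U_up, T)              # momentum, Eq. (5.5)
S = np.einsum('ca,db,cd->ab', perp, perp, T)             # stresses, Eq. (5.6)

T_rebuilt = (eps * np.outer(U_dn, U_dn)
             + np.outer(U_dn, P) + np.outer(P, U_dn) + S)         # Eq. (5.7)
print(np.allclose(T_rebuilt, T))                                   # True
print(np.isclose(np.einsum('ab,ab->', g_inv, T),
                 np.einsum('ab,ab->', g_inv, S) - eps))            # Eq. (5.8)
```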

5.2 “Off-the-shelf” analysis

As we have already suggested, there are different ways of deriving the general relativistic fluid equations. Our purpose here is not to review all possible approaches, but rather to focus on a couple: (i) an “off-the-shelf” consistency analysis for the simplest fluid à la Eckart (1940), to establish some of the key ideas, and then (ii) a more powerful method based on an action principle that varies fluid element world lines. We now consider the first of these. The second avenue will be explored in Sect. 6.

We have seen how the components of a general stress-energy tensor can be projected onto a coordinate system carried by an observer moving with four-velocity \(U^a\). Let us now connect this with the motion of a fluid. The simplest fluid is one for which there is only one four-velocity \(u^a\). As both four velocities are normalized (to unity) we must have

$$\begin{aligned} u^a = \gamma (U^a + v^a) , \quad \text{ with } \quad U_a v^a = 0 \quad \text{ and } \quad \gamma = (1-v^2)^{-1/2} \end{aligned}$$
(5.9)

the familiar redshift factor from special relativity. Clearly, the problem simplifies if we assume that the observer rides along with the fluid. That is, we introduce a preferred frame defined by \(u^a\), and then simply take \(U^a = u^a\). With respect to the fluid there will then (by definition) be no momentum flux, i.e., \(\mathcal{P}_a = 0\). Moreover, since we use a fully spacetime covariant formulation, i.e., there are only spacetime indices, the resulting stress-energy tensor will transform properly under general coordinate transformations, and hence can be used for any observer.

In general, the spatial stresses are given by a two-index, symmetric tensor, and the only objects that can be used to carry the indices (in the simple model we are considering at this point) are the four-velocity \(u^a\) and the metric \(g_{a b}\). Furthermore, because the spatial stress must also be symmetric, the only possibility is a linear combination of \(g_{a b}\) and \(u^a u^b\). Given that \(u^b \mathcal{S}_{b a} = 0\), we must have

$$\begin{aligned} \mathcal{S}_{a b} = \frac{1}{3} \mathcal{S} (g_{a b} + u_a u_b). \end{aligned}$$
(5.10)

As the system is assumed to be locally isotropic, it is possible to diagonalize the spatial stress tensor. This also implies that its three independent diagonal elements should actually be equal to the same quantity, which turns out to be the local pressure. Hence we have \(p = \mathcal{S}/3\) and

$$\begin{aligned} T_{a b} = \left( \varepsilon + p\right) u_a u_b + p g_{a b} = \varepsilon u_a u_b + p \perp _{ab} . \end{aligned}$$
(5.11)

This is the well-established result for a perfect fluid.

Given a relation \(p = p(\varepsilon )\) (an equation of state), there are four independent fluid variables. Because of this the equations of motion are often understood to be given by (5.1). Let us proceed along these lines, but first simplify matters by assuming that the equation of state is given by a relation of the form \(\varepsilon = \varepsilon (n)\) where n is the particle number density. As discussed in Sect. 2, the chemical potential \(\mu \) is then given by

$$\begin{aligned} {d} \varepsilon = \frac{d \varepsilon }{d n} {d} n \equiv \mu \, {d} n , \end{aligned}$$
(5.12)

and we know from the Euler relation (2.8) that

$$\begin{aligned} \mu n = p + \varepsilon . \end{aligned}$$
(5.13)

In essence, we have connected the model to the thermodynamics. This is an important step.
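As a simple illustration of this connection, the following symbolic sketch assumes a polytrope-like equation of state (the constants m, K and \(\Gamma \) are arbitrary and only serve as an example) and evaluates the chemical potential (5.12) and the pressure via the Euler relation (5.13):

```python
import sympy as sp

# Symbolic sketch (the equation of state is an assumed example): for a
# barotropic relation eps(n), compute the chemical potential from (5.12)
# and the pressure from the Euler relation (5.13).
n = sp.symbols('n', positive=True)
m, K, Gamma = sp.symbols('m K Gamma', positive=True)

eps = m * n + K * n**Gamma / (Gamma - 1)      # assumed eps(n): rest mass + "thermal" part
mu = sp.diff(eps, n)                          # chemical potential, Eq. (5.12)
p = sp.simplify(mu * n - eps)                 # Euler relation, Eq. (5.13)
print(p)                                      # should reduce to K*n**Gamma

# Gibbs-Duhem relation dp = n dmu for this single-parameter system
print(sp.simplify(sp.diff(p, n) - n * sp.diff(mu, n)))   # -> 0
```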

Let us now get rid of the free index of \(\nabla _b T^b{}_a = 0\) in two ways: first, by contracting with \(u^a\) and second, by projecting with \(\perp ^a_b\) (recalling that \(U^a = u^a\)). Given that \(u^a u_a = - 1\) we have the identity

$$\begin{aligned} \nabla _a \left( u^b u_b\right) = 0 \qquad \Longrightarrow \qquad u_b \nabla _a u^b = 0. \end{aligned}$$
(5.14)

Contracting (5.1) with \(u^a\) and using this identity gives

$$\begin{aligned} u^a \nabla _a \varepsilon + (\varepsilon + p) \nabla _a u^a = 0 . \end{aligned}$$
(5.15)

The definition of the chemical potential \(\mu \) and the Euler relation allow us to rewrite this as

$$\begin{aligned} \mu u^a \nabla _a n + \mu n \nabla _a u^a = 0 \qquad \Longrightarrow \qquad \nabla _a n^a = 0 , \end{aligned}$$
(5.16)

where we have introduced the particle flux, \(n^a \equiv n u^a\). This result simply represents the fact that the particles are conserved.

Meanwhile, projection of the free index in (5.1) using \(\perp ^b_a\) leads to

$$\begin{aligned} (\varepsilon + p) a_a = - \perp ^b_a \nabla _b p , \end{aligned}$$
(5.17)

where \(a_a \equiv u^b \nabla _b u_a\) is the fluid (four) acceleration. This is reminiscent of the Euler equation for Newtonian fluids. In fact, we demonstrate in Sect. 7.1 that the non-relativistic limit of (5.17) leads to the Newtonian result.

However, we should not be too quick to think that this is the only way to understand (5.1)! There is an alternative form that makes the perfect fluid have more in common with vacuum electromagnetism. If we define

$$\begin{aligned} \mu _a = \mu u_a , \end{aligned}$$
(5.18)

then the stress-energy tensor can be written in the form

$$\begin{aligned} T^a{}_b = p \delta ^a{}_b + n^a \mu _b . \end{aligned}$$
(5.19)

We have here our first encounter with the fluid element momentum \(\mu _a\) that is conjugate to the particle flux, the number density current \(n^a\). Its importance will become clearer as this story develops, particularly when we discuss the multi-fluid problem. For now, we simply note that \(u_a {d} u^a = 0\) implies that we will have

$$\begin{aligned} {d} \varepsilon = - \mu _a \, {d} n^a . \end{aligned}$$
(5.20)

This relation will serve as the starting point for the fluid action principle in Sect. 6, where \(- \varepsilon \) will be taken to be the fluid Lagrangian.
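The identity (5.20) is easy to check explicitly. The following symbolic sketch (special-relativistic, with the four-velocity parametrised by an arbitrary three-velocity) confirms that the velocity differentials drop out, leaving \({d}\varepsilon = \mu \, {d}n\):

```python
import sympy as sp

# Symbolic sketch (special relativity, illustrative): with n^a = n u^a and
# mu_a = mu u_a, the normalisation u^a u_a = -1 removes the velocity
# differentials, so -mu_a d(n^a) reduces to mu dn = d(eps), i.e. Eq. (5.20).
n, mu = sp.symbols('n mu', positive=True)
vx, vy, vz = sp.symbols('v_x v_y v_z')
eta = sp.diag(-1, 1, 1, 1)

gamma = 1 / sp.sqrt(1 - vx**2 - vy**2 - vz**2)
u_up = sp.Matrix([gamma, gamma * vx, gamma * vy, gamma * vz])    # u^a
u_dn = eta * u_up                                                # u_a

n_up = n * u_up                    # particle flux n^a
mu_dn = mu * u_dn                  # momentum mu_a

# coefficients of -mu_a d(n^a) with respect to the independent variables
coeffs = [sp.simplify(-(mu_dn.T * sp.diff(n_up, w))[0]) for w in (n, vx, vy, vz)]
print(coeffs)                      # -> [mu, 0, 0, 0]
```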

If we instead rewrite (5.1) using the form (5.19) of the stress-energy tensor, we arrive at

$$\begin{aligned} f_a + \left( \nabla _b n^b \right) \mu _a = 0 , \end{aligned}$$
(5.21)

where the force density \(f_a\) is

$$\begin{aligned} f_a = n^b \omega _{b a} , \end{aligned}$$
(5.22)

and the vorticity \(\omega _{a b}\) is defined as

$$\begin{aligned} \omega _{a b} \equiv 2 \nabla _{[ a} \mu _{b ]} = \nabla _a \mu _b - \nabla _b \mu _a . \end{aligned}$$
(5.23)

Contracting Eq. (5.21) with \(n^a\) we see (since \(\omega _{a b} = - \omega _{b a}\)) that

$$\begin{aligned} \nabla _a n^a = 0 \end{aligned}$$
(5.24)

and, as a consequence, the equations of motion take the form

$$\begin{aligned} f_a = n^b \omega _{b a} = 0 . \end{aligned}$$
(5.25)

The vorticity two-form \(\omega _{a b}\) has emerged quite naturally as an essential ingredient of the fluid dynamics (Lichnerowicz 1967; Carter 1989a; Bekenstein 1987; Katz 1984). This is a key result. Readers familiar with Newtonian fluids should be inspired by this, as the vorticity is used to establish theorems on fluid behaviour (for instance the Kelvin–Helmholtz theorem; Landau and Lifshitz 1959) and is at the heart of turbulence modelling (Pullin and Saffman 1998).


To demonstrate the role of \(\omega _{a b}\) as the vorticity, consider a small region of the fluid where the time direction \(t^a\), in local Minkowski coordinates, is adjusted to be the same as that of the fluid four-velocity so that \(u^a = t^a = (1,0,0,0)\). Eq. (5.25) and the antisymmetry then imply that \(\omega _{a b}\) can only have purely spatial components. Because the rank of \(\omega _{a b}\) is two, there are two “nulling” vectors, meaning their contraction with either index of \(\omega _{a b}\) yields zero (a condition which is true also for vacuum electromagnetism). We have arranged already that \(t^a\) be one such vector. By a suitable rotation of the coordinate system the other one can be taken to be \(z^a = (0,0,0,1)\), implying that the only non-zero component of \(\omega _{a b}\) is \(\omega _{x y}\).

Geometrically, this kind of two-form can be pictured as a collection of oriented worldtubes, whose walls lie in the \(x = \mathrm {const}\) and \(y = \mathrm {const}\) planes (Misner et al. 1973). Any contraction of a vector with a two-form that does not yield zero implies that the vector pierces the walls of the worldtubes. But when the contraction is zero, as in Eq. (5.25), the vector does not pierce the walls. This is illustrated in Fig. 9, where the red circles indicate the orientation of each world-tube. The individual fluid element four-velocities lie in the centers of the world-tubes. Finally, consider the closed contour in Fig. 9. If that contour is attached to fluid-element worldlines, then the number of worldtubes contained within the contour will not change because the worldlines cannot pierce the walls of the worldtubes. This is essentially the Kelvin–Helmholtz theorem on the conservation of vorticity. From this we learn that the Euler equation is (in fact) an integrability condition which ensures that the vorticity two-surfaces mesh together to fill spacetime.

Fig. 9

A local, geometrical view of the Euler equation as an integrability condition of the vorticity for a single-constituent perfect fluid


5.3 Conservation laws

The variational model we will develop contains the same information as the standard approach (a point that is emphasized by the Newtonian limit in Sect. 7.1)—as it must if we want it to be useful—but it is more directly linked to the conservation of vorticity. In fact, the definition of the vorticity implies that its exterior derivative vanishes. This means that

$$\begin{aligned} \nabla _{[a} \omega _{bc]} = 0. \end{aligned}$$
(5.29)

Whenever the Euler equation (5.25) holds, this leads to the vorticity being conserved along the flow. That is, we have

$$\begin{aligned} \mathcal {L}_u \omega _{ab} = 0. \end{aligned}$$
(5.30)

The upshot of this is that Eq. (5.25) can be used to discuss the conservation of vorticity in an elegant way. It can also be used as the basis for a derivation of other theorems in fluid mechanics.

As is well-known, constants of motion are often associated with symmetries of the problem under consideration. In General Relativity, spacetime symmetries can be expressed in terms of Killing vectors, \(\hat{\xi }^a\) (the hat is used to make a distinction from the Lagrangian displacement later). As an example, let us assume that the spacetime does not depend on one of the coordinates, \(x^a = X\) say. The corresponding Killing vector would be

$$\begin{aligned} \hat{\xi }^a {\partial \over \partial x^a} = {\partial \over \partial X} \qquad \Longrightarrow \qquad \hat{\xi }^a = \delta ^a_X , \end{aligned}$$
(5.31)

and the symmetry leads to Killing’s equation

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} g_{a b} = 0 \qquad \Longrightarrow \qquad \nabla _a \hat{\xi }_b + \nabla _b \hat{\xi }_a = 0 . \end{aligned}$$
(5.32)
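As a concrete example, the following symbolic sketch verifies Killing's equation (5.32) for the static Killing vector of the Schwarzschild geometry (the choice of metric is made purely for illustration):

```python
import sympy as sp

# Symbolic sketch (the Schwarzschild metric is used only as an example):
# verify Killing's equation (5.32) for the static Killing vector
# xi^a = delta^a_t in Schwarzschild coordinates (t, r, theta, phi).
t, r, th, ph, M = sp.symbols('t r theta phi M', positive=True)
x = [t, r, th, ph]

f = 1 - 2 * M / r
g = sp.diag(-f, 1 / f, r**2, r**2 * sp.sin(th)**2)        # g_ab
g_inv = g.inv()

# Christoffel symbols Gamma^a_{bc}
Gamma = [[[sum(g_inv[a, d] * (sp.diff(g[d, b], x[c]) + sp.diff(g[d, c], x[b])
                              - sp.diff(g[b, c], x[d])) for d in range(4)) / 2
           for c in range(4)] for b in range(4)] for a in range(4)]

xi_up = [1, 0, 0, 0]                                       # xi^a
xi_dn = [sum(g[a, b] * xi_up[b] for b in range(4)) for a in range(4)]

def nabla_xi(a, b):                                        # nabla_a xi_b
    return sp.diff(xi_dn[b], x[a]) - sum(Gamma[c][a][b] * xi_dn[c] for c in range(4))

killing = sp.Matrix(4, 4, lambda a, b: sp.simplify(nabla_xi(a, b) + nabla_xi(b, a)))
print(killing)                                             # zero matrix, Eq. (5.32)
```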

Associated with each such Killing vector will be a conserved quantity. In the vacuum case, it is easy to combine the geodesic equation

$$\begin{aligned} u^b \nabla _b u_a = 0 , \end{aligned}$$
(5.33)

with Killing’s equation to show that

$$\begin{aligned} u^b \nabla _b \left( \hat{\xi }^a u_a\right) = {d \over d\tau } \left( \hat{\xi }^a u_a\right) = 0 . \end{aligned}$$
(5.34)

In other words, the combination \(\hat{\xi }^a u_a\) remains constant along each geodesic.

Let us now consider how this argument extends to the fluid case. Assuming that the flow is invariant with respect to transport by the vector field \(\hat{\xi }^a\), we have

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} \mu _a = 0 , \qquad \Longrightarrow \qquad \hat{\xi }^b \nabla _b \mu _a + \mu _b \nabla _a\hat{\xi }^b = 0 . \end{aligned}$$
(5.35)

Now combine this with the equation of motion in the form (5.25) to find

$$\begin{aligned} \hat{\xi }^a n^b \left( \nabla _b \mu _a - \nabla _a \mu _b\right) = n^b \nabla _b \left( \hat{\xi }^a \mu _a\right) = 0 . \end{aligned}$$
(5.36)

Since \(n^a = n u^a\) we see that the quantity \(\hat{\xi }^a \mu _a\) is conserved along the fluid world lines, reminding us of the vacuum result. The difference is due to the fact that pressure gradients in the fluid lead to the flow no longer being along geodesics. One may consider two specific situations. If \(\hat{\xi }^a\) is taken to be the four-velocity, then the scalar \(\hat{\xi }^a \mu _a\) represents the “energy per particle”. If instead \(\hat{\xi }^a\) represents an axial generator of rotation, then the scalar will correspond to an angular momentum. For the purposes of the present discussion we can leave \(\hat{\xi }^a\) unspecified, but it is still useful to keep these possibilities in mind.

Given that the flux is conserved, i.e. (5.16) holds, we can take one further step to show that we have

$$\begin{aligned} n^a \nabla _a \left( \mu _b \hat{\xi }^b\right) = \nabla _a \left( n^a \mu _b\hat{\xi }^b\right) = 0 , \end{aligned}$$
(5.37)

and we have shown that \(n^a \mu _b \hat{\xi }^b\) is a conserved quantity.

In many cases one can also obtain integrals of the motion, analogous to the Bernoulli equation for stationary rotating Newtonian fluids. Quite generally, the derivation proceeds as follows. Assume that \(\hat{\xi }^a\) is such that

$$\begin{aligned} \hat{\xi }^b \omega _{b a} = 0 . \end{aligned}$$
(5.38)

This condition can be written

$$\begin{aligned} \mathcal {L}_{\hat{\xi }} \mu _a - \nabla _a \left( \hat{\xi }^b \mu _b \right) = 0 \end{aligned}$$
(5.39)

where the first term vanishes as long as (5.35) holds. Hence, we arrive at the first integral

$$\begin{aligned} \nabla _a \left( \hat{\xi }^b \mu _b \right) = 0 \qquad \Longrightarrow \qquad \hat{\xi }^b \mu _b = \mathrm {constant} . \end{aligned}$$
(5.40)

An obvious version of this analysis is an irrotational flow, when \(\omega _{a b} = 0\). Another situation of direct astrophysical interest is “rigid” flow—when \(\hat{\xi }^a = \lambda u^a\) for some scalar field \(\lambda \). Rotating compact stars, in equilibrium, belong to this category. In that case, one would have \(\hat{\xi }^a =t^a + \varOmega \phi ^a\), where \(\varOmega \) is the rotation frequency and \(t^a\) and \(\phi ^a\) represent the timelike Killing vector and the spatial Killing vector associated with axisymmetry, respectively (the system permits a helical Killing vector).

5.4 A couple of steps towards relative flows

With the comments at the close of the previous section, we have reached the end of the road as far as the “off-the-shelf” strategy is concerned. We will now move towards an action-based derivation of the fluid equations of motion. As a first step, let us look ahead to see what is coming and why we need to go in this direction.

Return to the perfect fluid stress-energy tensor but now let us not associate the observer with the fluid flow. The thermodynamical relations still hold in the co-moving (fluid) frame associated with \(u^a\), but the observer sees the fluid flow by with the relative velocity \(v^a\) from (5.9). In essence, we then have

$$\begin{aligned} T_{ab} &= (p+\varepsilon )\gamma ^2 (U_a+v_a)(U_b +v_b)+p g_{ab} \nonumber \\ &= \varepsilon \gamma ^2 U_a U_b + p(U_aU_b+g_{ab}) + 2 (p+\varepsilon )\gamma ^2 U_{(a}v_{b)} + (p+\varepsilon )\gamma ^2 v_a v_b \end{aligned}$$
(5.41)

We learn several important lessons from this. The perfect fluid does not seem quite so simple in the frame of a general observer. First of all, the different thermodynamical quantities will be redshifted (as expected from Special Relativity) so we need to keep track of the \(\gamma \) factors. Secondly, we now appear to have both a momentum flux and anisotropic spatial stresses. In order to arrive at the main point we want to make, let us assume that the relative velocity is small enough that we can linearize the problem. As we will see later, this should be an adequate assumption in many situations of interest. Leaving out terms quadratic in \(v^a\) we lose the spatial stresses and \(\gamma \rightarrow 1\) (which is convenient as the thermodynamics then remains as before). We are left with

$$\begin{aligned} T_{ab} \approx \varepsilon U_a U_b +p(U_aU_b+g_{ab})+2 (p+\varepsilon ) U_{(a}v_{b)} . \end{aligned}$$
(5.42)

At this point, we can make use of the freedom to choose the observer. We may return to the case where the observer rides along with the fluid by setting \(v^a=0\). This choice is commonly called the Eckart frame, as it was first introduced in the discussion of relativistic heat flow (see Sect. 15). This is the obvious choice for a single fluid problem, but when we are dealing with multiple flows there are alternatives.

As an illustration, in the case of a problem with both matter and heat flowing, we have to replace the stress energy tensor by (don’t worry, we will derive this later)

$$\begin{aligned} T_{ab} \approx p g_{ab} + n\mu u_a u_b + sT u^\mathrm {s}_a u^\mathrm {s}_b , \end{aligned}$$
(5.43)

where s and T are the entropy (density) and temperature, respectively, and \(u_\mathrm {s}^a\) accounts for the heat flux. We have assumed that both flows may be linearized relative to the observer so

$$\begin{aligned} u_\mathrm {s}^a \approx U^a + q^a , \quad \text{ with } \quad U^aq_a = 0 , \end{aligned}$$
(5.44)

where \(q^a\) is the heat flux. This means that we have

$$\begin{aligned} T_{ab} &\approx p g_{ab} + (n\mu +sT) U_a U_b + 2 n\mu U_{(a}v_{b)} + 2 sT U_{(a}q_{b)} \nonumber \\ &= \varepsilon U_a U_b +p(U_aU_b+g_{ab}) + 2 n\mu U_{(a}v_{b)} + 2 sT U_{(a}q_{b)} . \end{aligned}$$
(5.45)

In this case, the momentum flux relative to the observer will be

$$\begin{aligned} \mathcal {P}_a = - \perp ^b_a U^c T_{b c} = n\mu v_a + sT q_a . \end{aligned}$$
(5.46)

Basically, an observer riding along with the matter will experience heat flowing. We may, however, work with a different observer according to whom no energy flows. It is easy to see that this involves setting

$$\begin{aligned} v_a = -{sT \over n\mu } q_a = -{sT \over p+\varepsilon -sT} q_a . \end{aligned}$$
(5.47)

With this choice we are left with

$$\begin{aligned} T_{ab} \approx \varepsilon U_a U_b + p ( g_{ab} + U_a U_b ) , \end{aligned}$$
(5.48)

reminding us of the perfect fluid situation, even though we are considering a more complicated problem. It follows that

$$\begin{aligned} U^a T^b_{\ a} = - \varepsilon U^b . \end{aligned}$$
(5.49)

Formally, the energy density \(\varepsilon \) is an eigenvalue of the stress-energy tensor (with the observer four velocity \(U^a\) the corresponding eigenvector). This choice of observer is usually referred to as the Landau–Lifshitz frame (Landau and Lifshitz 1959).
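The following numerical sketch (with made-up background values) illustrates this freedom: it builds the linearized stress-energy tensor (5.45), evaluates the momentum density (5.46) for the Eckart choice, and confirms that the Landau–Lifshitz choice (5.47) removes the energy flux, leaving the form (5.48):

```python
import numpy as np

# Numerical sketch (illustrative; all background values are made up): build
# the linearized matter-plus-heat stress-energy tensor (5.45), evaluate the
# momentum density (5.46) seen by the observer, and check that the
# Landau-Lifshitz choice (5.47) removes it, leaving the form (5.48).
g = np.diag([-1.0, 1.0, 1.0, 1.0])
U_up = np.array([1.0, 0.0, 0.0, 0.0])            # observer four-velocity U^a
U_dn = g @ U_up
perp_mix = np.eye(4) + np.outer(U_up, U_dn)      # perp^b_a, Eq. (5.2)

n, mu, s, T0, p = 1.0, 5.0, 0.2, 3.0, 2.0        # assumed background values
eps = n * mu + s * T0 - p                        # Euler relation with entropy

q_dn = 1e-3 * np.array([0.0, 1.0, -2.0, 0.5])    # heat flux q_a, U^a q_a = 0

def sym(a, b):                                   # 2 U_(a X_b) etc.
    return np.outer(a, b) + np.outer(b, a)

def T_ab(v_dn):                                  # Eq. (5.45)
    return (p * g + (n * mu + s * T0) * np.outer(U_dn, U_dn)
            + n * mu * sym(U_dn, v_dn) + s * T0 * sym(U_dn, q_dn))

def momentum(T):                                 # Eq. (5.46)
    return -np.einsum('ba,c,bc->a', perp_mix, U_up, T)

v_eckart = np.zeros(4)                           # observer rides with the matter
print(np.allclose(momentum(T_ab(v_eckart)), s * T0 * q_dn))   # heat still flows

v_landau = -(s * T0 / (n * mu)) * q_dn           # Eq. (5.47)
T_ll = T_ab(v_landau)
print(np.allclose(momentum(T_ll), 0.0))          # no energy flux for this observer
print(np.allclose(T_ll, eps * np.outer(U_dn, U_dn)
                  + p * (g + np.outer(U_dn, U_dn))))          # Eq. (5.48)
```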

We are free to work with whatever observer we like—different options have different advantages—but there is no free lunch. For example, with the Landau–Lifshitz choice the fluid equations simplify, but the particle conservation law becomes more involved. We now have

$$\begin{aligned} \nabla _a n^a \approx \nabla _a ( nU^a + nv^a) = \nabla _a \left( nU^a - {n sT \over p+\varepsilon -sT} q^a \right) = 0 . \end{aligned}$$
(5.50)

The contribution from the heat flux is not particularly intuitive.

The main lesson we learn from this exercise is that any situation with relative flows involves making choices, and we have to keep careful track of how these choices impact on the connection with the underlying physics. This motivates the formal development of the variational approach for general relativistic multifluid systems, to be described in Sect. 9.

5.5 From microscopic models to the equation of state

We have discussed how the equations for relativistic fluid dynamics relate to a given stress-energy tensor, involving a set of suitably averaged variables (energy, pressure, four-velocity etc.). We have also seen how one can obtain the equations of motion from

$$\begin{aligned} \nabla _a T^{ab} = 0 , \end{aligned}$$
(5.51)

as required by the Einstein field equations (by virtue of the Bianchi identities). Moreover, in Sect. 4 we showed how the stress-energy tensor can be obtained via a variation of the Lagrangian with respect to the spacetime metric. This description is neatly self-consistent—and we will make frequent use of it later—but it is helpful to pause and consider the logic. In principle, the relation (5.51) follows from the fact that the Einstein tensor \(G_{ab}\) is divergence free, which in turn represents the fact that the problem involves four “unphysical” degrees of freedom, usually taken to mean that we have the freedom to choose the four spacetime coordinates. However, by turning (5.51) into the equations for fluid dynamics we are changing the perspective. The four degrees of freedom now represent the conservation of energy and momentum. Why are we allowed to do this? Is it simply a fluke that the four degrees of freedom involved can be suitably interpreted in a manner that fits our purpose? One can argue that this is, indeed, the case and we will discuss this later.

For the moment, we want to consider a different aspect of the problem. If it is the case that (5.51) encodes the fluid equations of motion, then there ought to be a way to derive the stress-energy tensor from some underlying microscopic theory (presumably involving quantum physics). This issue turns out to be somewhat involved. As a starting point, suppose we focus on a one-parameter system, with the parameter being the particle number density. The equation of state will then be of the form \(\varepsilon = \varepsilon (n)\), with \(\varepsilon /n\) the energy per particle. In many-body physics (as studied in condensed matter, nuclear, and particle physics) one can then in principle construct the quantum mechanical particle number density \(n_{\mathrm {QM}}\), stress-energy tensor \(T^{\mathrm {QM}}_{a b}\), and associated conserved particle number density current \(n^a_{\mathrm {QM}}\) (starting from some fundamental Lagrangian, say; cf. Walecka 1995; Glendenning 1997; Weber 1999). But unlike in quantum field theory in a curved spacetime (Birrell and Davies 1982), one typically assumes that the matter exists in an infinite Minkowski spacetime.

Once \(T^{\mathrm {QM}}_{a b}\) is obtained, and after (quantum mechanical and statistical) expectation values with respect to the system’s (quantum and statistical) states are taken, one defines the energy density as

$$\begin{aligned} \varepsilon = u^a u^b \langle T^{\mathrm {QM}}_{a b} \rangle , \end{aligned}$$
(5.52)

where

$$\begin{aligned} u^a \equiv \frac{1}{n} \langle n^a_{\mathrm {QM}} \rangle , \qquad n = \langle n_{\mathrm {QM}} \rangle . \end{aligned}$$
(5.53)

Similarly, the pressure is obtained as

$$\begin{aligned} p = \frac{1}{3} \left( \langle T^{\mathrm {QM} a}{}_a \rangle + \varepsilon \right) \end{aligned}$$
(5.54)

and it will also be a function of n.

One must be very careful to distinguish \(T^{\mathrm {QM}}_{a b}\) from \(T_{a b}\). The former describes the states of elementary particles with respect to a fluid element, whereas the latter describes the states of fluid elements with respect to the system. Comer and Joynt (2003) have shown how this line of reasoning applies to the two-fluid case.

This outline description stays close to the fluid picture, but it does not shed much light on the origin of \(T^{\mathrm {QM}}_{a b}\). This is where we run into “trouble”. A typical field theory description would take a given symmetry of the system as its starting point, and then obtain equations of motion for conserved quantities associated with this symmetry. Let us consider this problem in flat space and use a scalar field with Lagrangian \(L=L(\phi , \partial _a \phi )\) as our example. Assuming that the system is symmetric under spacetime translations, we have four conserved (Noether) currents given by

$$\begin{aligned} \tau ^a_{\ b} = {\partial L \over \partial ( \partial _a \phi )} \partial _b \phi - \delta ^a_b L . \end{aligned}$$
(5.55)

That is, we have

$$\begin{aligned} \partial _a \tau ^a_{\ b} = 0 , \end{aligned}$$
(5.56)

which follows by virtue of the Euler-Lagrange equations:

$$\begin{aligned} \partial _a \left( {\partial L \over \partial ( \partial _a \phi )} \right) - {\partial L \over \partial \phi } = 0 , \end{aligned}$$
(5.57)

and the fact that we are working in flat space (so partial derivatives commute). It may seem tempting to take \(\tau ^a_{\ b}\) to be the stress-energy tensor—intuitively, we can change partial derivatives to covariant ones and introduce the spacetime metric (instead of \(\eta ^{ab}\), as appropriate) to arrive at an expression similar to (5.51). However, the Devil is in the detail. The flat-space field equations represent a true conservation law (with four conserved currents, one for each value of b in (5.56)), which is what we expect, but \(\tau ^a_{\ b}\) is (in general) not symmetric. Since symmetry is required for the gravitational stress-energy tensor \(T^{ab}\) (as long as we do not deviate from Einstein’s theory) we have a problem. The issue is resolved by invoking the Belinfante–Rosenfeld “correction” to \(\tau ^a_{\ b}\) (see for example Ilin and Paston 2018 for a recent discussion). This is a uniquely defined object which effects the change from a flat to a curved spacetime. While we will not need to understand the details of this procedure to make progress, it is important to be aware of it.
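To make the flat-space argument concrete, the following symbolic sketch (for an assumed self-interacting scalar field in 1+1 dimensions, with a made-up potential) constructs the canonical currents (5.55) and checks that their divergence (5.56) vanishes once the Euler–Lagrange equation (5.57) is imposed:

```python
import sympy as sp

# Symbolic sketch (illustrative): scalar field in 1+1 flat space with
# L = -(1/2) d^a phi d_a phi - V(phi) and an assumed potential
# V = m^2 phi^2/2 + lam phi^4/4. Build the canonical currents (5.55) and
# check that their divergence (5.56) is proportional to the Euler-Lagrange
# expression (5.57), i.e. it vanishes on-shell.
t, x, m, lam = sp.symbols('t x m lambda')
X = [t, x]
eta = sp.diag(-1, 1)                       # flat metric, signature (-,+)
phi = sp.Function('phi')(t, x)

dphi = [sp.diff(phi, y) for y in X]                               # d_a phi
dphi_up = [sum(eta[a, b] * dphi[b] for b in range(2)) for a in range(2)]
V = m**2 * phi**2 / 2 + lam * phi**4 / 4
L = -sp.Rational(1, 2) * sum(dphi_up[a] * dphi[a] for a in range(2)) - V

# Canonical stress-energy tau^a_b, Eq. (5.55); here dL/d(d_a phi) = -d^a phi
tau = [[-dphi_up[a] * dphi[b] - (L if a == b else 0) for b in range(2)]
       for a in range(2)]

# Euler-Lagrange expression (5.57): -box(phi) + dV/dphi
eom = sum(sp.diff(-dphi_up[a], X[a]) for a in range(2)) + sp.diff(V, phi)

for b in range(2):
    div_b = sum(sp.diff(tau[a][b], X[a]) for a in range(2))       # Eq. (5.56)
    print(sp.simplify(div_b - eom * dphi[b]))                     # -> 0 on-shell
```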

6 Variational approach for a single-fluid system

Let us now consider the single-fluid problem from a different perspective and derive the equations of motion and the stress-energy tensor from an action principle. The ideas behind this variational approach can be traced back to Taub (1954) (see also Schutz 1970). Our approach relies heavily on the work of Brandon Carter, his students, and collaborators (Carter 1989a; Comer and Langlois 1993, 1994; Carter and Langlois 1995b, 1998; Langlois et al. 1998; Prix 2000, 2004). This strategy is attractive as it makes maximal use of the tools of the trade of relativistic fields, i.e., no special tricks or devices will be required (unlike even the case of the “off-the-shelf” approach). Our footing is made sure by well-grounded, action-based arguments. As Carter has made clear: When there are multiple fluids, of both the charged and uncharged variety, it is essential to distinguish the fluid momenta from the velocities, in order to make the geometrical and physical content of the equations transparent. A well-posed action is, of course, perfect for systematically constructing the momenta.

Specifically, we will make use of a “pull-back” approach (see, e.g., Comer and Langlois 1993, 1994; Comer 2002) to construct a Lagrangian displacement of the particle number density flux \(n^a\), whose magnitude n is the particle number density. This will form the basis for the variations of the fundamental fluid variables in the action principle.

6.1 The action principle

It is useful to begin by explaining why we need to develop a constrained action principle. The argument is quite simple. Consider a single matter component, represented by a flux \(n^a\). For an isotropic system the matter Lagrangian, which we will call \(\varLambda \) (taking over the role of L from Sect. 4), should be a relativistic invariant and hence depend only on \(n^2 = -g_{ab} n^a n^b\). In effect, this means that it depends on both the flux and the spacetime metric. This is, of course, important as the dependence on the metric leads to the stress-energy tensor (again, as in Sect. 4). An arbitrary variation of \(\varLambda =\varLambda (n^2)=\varLambda (n^a,g_{ab})\) now leads to (ignoring terms that can be written as total derivatives representing “surface terms”, as in the point-particle discussion)

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left[ \ \mu _a \delta n^a + \frac{1}{2} \left( \varLambda g^{a b} + n^a \mu ^b\right) \delta g_{a b}\right] , \end{aligned}$$
(6.1)

where \(\mu _a\) is the canonical momentum, which is given by

$$\begin{aligned} \mu _{a} = {\partial \varLambda \over \partial n^a} = -2 {\partial \varLambda \over \partial n^2} g_{ab} n^b . \end{aligned}$$
(6.2)

We have also used (see Sect. 4.4)

$$\begin{aligned} \delta \sqrt{-g} = {1\over 2} \sqrt{-g} \, g^{ab} \delta g_{ab} . \end{aligned}$$
(6.3)

Here is the problem: As it stands, Eq. (6.1) suggests that the equations of motion would simply be \(\mu _a=0\), which means that the fluid carries neither energy nor momentum. This is obviously not what we are looking for.

In order to make progress, we impose the constraint that the flux is conserved.Footnote 14 That is, we insist that

$$\begin{aligned} \nabla _ a n^a = 0 . \end{aligned}$$
(6.4)

From a strict field theory point of view, it makes sense to introduce this constraint. The conservation of the particle flux (the number density current) should not be a part of the equations of motion, but rather should be automatically satisfied when evaluated on a solution of the “true” equations.

For reasons that will become clear shortly, it is useful to rewrite the conservation law in terms of the dual three-formFootnote 15

$$\begin{aligned} n_{abc} = \epsilon _{dabc} n^d, \end{aligned}$$
(6.5)

such that

$$\begin{aligned} n^a = {1\over 3!} \epsilon ^{bcda} n_{bcd} . \end{aligned}$$
(6.6)

It also follows that

$$\begin{aligned} n^2 = - g_{ab} n^a n^b = {1\over 3!} n_{abc}n^{abc} , \end{aligned}$$
(6.7)

which shows that \(n_{abc}\) acts as a volume measure which allows us to “count” the number of fluid elements. In Fig. 9 we have seen that a two-form is associated with worldtubes. A three-form is the next higher-ranked object and it can be thought of, in an analogous way, as leading to boxes (Misner et al. 1973). This is quite intuitive, and we will comment on it again later.
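The dual relations (6.5)–(6.7) are easy to verify explicitly. The following numerical sketch (flat spacetime, an arbitrarily chosen flux vector) builds the Levi-Civita tensor, forms the three-form, and checks both the inversion (6.6) and the norm relation (6.7):

```python
import numpy as np
from itertools import permutations

# Numerical sketch (flat spacetime, arbitrary flux; illustrative only): check
# the dual relations (6.5)-(6.7) with the convention eps_{0123} = +1.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
g_inv = np.linalg.inv(g)

def parity(p):
    p = list(p); s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

eps_dn = np.zeros((4, 4, 4, 4))
for p in permutations(range(4)):
    eps_dn[p] = parity(p)
eps_up = np.einsum('ae,bf,cg,dh,efgh->abcd', g_inv, g_inv, g_inv, g_inv, eps_dn)

n_up = np.array([2.0, 0.3, -0.5, 0.1])                    # a timelike flux n^a
n_abc = np.einsum('dabc,d->abc', eps_dn, n_up)            # Eq. (6.5)

n_recovered = np.einsum('bcda,bcd->a', eps_up, n_abc) / 6.0      # Eq. (6.6)
print(np.allclose(n_recovered, n_up))                            # True

n_abc_up = np.einsum('ad,be,cf,def->abc', g_inv, g_inv, g_inv, n_abc)
n2 = -np.einsum('ab,a,b->', g, n_up, n_up)
print(np.isclose(np.einsum('abc,abc->', n_abc, n_abc_up) / 6.0, n2))   # Eq. (6.7)
```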


With this set-up, the conservation of the matter flux is ensured provided that the three-form \(n_{abc}\) is closed. It is easy to see that

$$\begin{aligned} \partial _{[a} n_{bcd]}=\nabla _{[a} n_{bcd]} = 0\quad \Longrightarrow \quad \nabla _{a} n^{a} = 0 . \end{aligned}$$
(6.8)
Fig. 10

The pull-back from “fluid-particle” points in the three-dimensional matter space, labelled by the coordinates \(\{X^1,X^2,X^3\}\), to fluid-element worldlines in spacetime. Here, the pull-back of the “\(I^{ th }\)” (\(I = 1,2,\dots ,n\)) fluid-particle to, say, an initial point on a worldline in spacetime can be taken as \(X^A_I = X^A(0,x^i_I)\) where \(x^i_I\) is the spatial position of the intersection of the worldline with the \(t = 0\) time slice

The main reason for introducing the dual is that it is straightforward to construct a particle number density three-form that is automatically closed. We achieve this by introducing a three-dimensional “matter” space—the left-hand part of Fig. 10—which is labelled by coordinates \(X^A\), where \(A,B,C, \ldots = 1,2,3\). For each time slice in spacetime, we have the same configuration in the matter space. That is, as time moves forward, the fluid particle positions in the matter space remain fixed—even though the worldlines weave through spacetime. In this sense we are “pulling back” from the matter space to spacetime (cf. the discussion of the Lie derivative). The \(n_{abc}\) three-form can then be “pushed forward” to the three-dimensional matter space by using the map associated with the coordinates \(X^A\) (which represent scalar fields on spacetime):

$$\begin{aligned} \psi ^A_a = \partial _a X^A . \end{aligned}$$
(6.9)

This construction leads to a matter-space three form \(N_{ABC}\),

$$\begin{aligned} n_{a b c} = \psi _a^A \psi _b^B \psi ^C_c N_{A B C} , \end{aligned}$$
(6.10)

which is completely anti-symmetric in its indices. The final step involves noting that

$$\begin{aligned} \partial _{[a}n_{bcd]} = \psi ^A_a \psi ^B_b \psi ^C_c\psi ^D_d \partial _{[A} N_{BCD]} = 0 , \end{aligned}$$
(6.11)

is automatically satisfied if

$$\begin{aligned} \partial _{[A} N_{BCD]} = 0 , \end{aligned}$$
(6.12)

which, in turn, follows if \(N_{ABC}\) is taken to be a function only of the \(X^A\) coordinates. This completes the argument.
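The closure property can be checked directly. In the following symbolic sketch the scalar fields \(X^A\) and the function multiplying the permutation symbol in \(N_{ABC}\) are chosen arbitrarily; the fully antisymmetrised derivative in (6.8) nevertheless vanishes identically:

```python
import sympy as sp
from itertools import permutations

# Symbolic sketch: the scalar fields X^A(x) and the prefactor in
# N_ABC = f(X)[ABC] below are arbitrary choices; the antisymmetrised
# derivative in (6.8) vanishes identically, so the flux is conserved by
# construction.
t, x, y, z = sp.symbols('t x y z')
coords = [t, x, y, z]

X = [x + t * y, y - sp.sin(t) * z, z + t**2 * x]                 # assumed labels
psi = [[sp.diff(X[A], c) for c in coords] for A in range(3)]     # psi^A_a, Eq. (6.9)
f = 1 + X[0]**2 + X[1] * X[2]                                    # depends on X^A only

def n_form(a, b, c):                                             # Eq. (6.10)
    return sum(psi[A][a] * psi[B][b] * psi[C][c] * f * sp.LeviCivita(A, B, C)
               for A in range(3) for B in range(3) for C in range(3))

def parity(p):
    p = list(p); s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

# the only independent component of the fully antisymmetrised derivative (6.8)
closure = sum(parity(p) * sp.diff(n_form(p[1], p[2], p[3]), coords[p[0]])
              for p in permutations(range(4)))
print(sp.simplify(closure))                                      # -> 0
```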

Now we need to connect this idea to the variational principle. The key step involves introducing the Lagrangian displacement \(\xi ^a\), tracking the motion of a given fluid element. From the standard definition of Lagrangian variations, we have

$$\begin{aligned} \varDelta X^A = \delta X^A + \mathcal {L}_{\xi } X^A = 0 , \end{aligned}$$
(6.13)

where \(\delta X^A\) is the Eulerian variation and \(\mathcal {L}_{\xi }\) is the Lie derivative along \(\xi ^a\). This means that we have

$$\begin{aligned} \delta X^A = - \mathcal {L}_{\xi } X^A = - \xi ^a {\partial X^A \over \partial x^a} = -\xi ^a \psi ^A_a. \end{aligned}$$
(6.14)

It also follows that

$$\begin{aligned} \varDelta \psi ^A_a &= \delta \psi ^A_a + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b = \partial _a \delta X^A + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b \nonumber \\ &= \partial _a \left( \varDelta X^A - \xi ^b \partial _b X^A \right) + \xi ^b \partial _b \psi ^A_a + \psi ^A_b \partial _a \xi ^b = 0 , \end{aligned}$$
(6.15)

since partial derivatives commute. Given these results, it is easy to show that

$$\begin{aligned} \varDelta n_{abc} = \psi ^A_a \psi ^B_b\psi ^C_c \partial _D N_{ABC} \varDelta X^D = 0 . \end{aligned}$$
(6.16)

This implies that

$$\begin{aligned} \delta n_{abc} = - \mathcal {L}_\xi n_{abc} , \end{aligned}$$
(6.17)

and hence

$$\begin{aligned} \delta n^a = {1\over 3!} \delta \left( \epsilon ^{bcda} n_{bcd} \right) = {1\over 3!} \left( \delta \epsilon ^{bcda} n_{bcd} - \epsilon ^{bcda} \mathcal {L}_\xi n_{bcd} \right) . \end{aligned}$$
(6.18)

Making use of a little bit of elbow grease and the standard relations

$$\begin{aligned} \delta g_{db} = - g_{da} g_{bc} \delta g^{ac} , \end{aligned}$$
(6.19)

and

$$\begin{aligned} \delta \epsilon ^{abcd} = {1\over 2} \epsilon ^{abcd} g_{ef} \delta g^{ef} , \end{aligned}$$
(6.20)

we arrive at

$$\begin{aligned} \delta n^a &= {1\over 3!} \delta ( \epsilon ^{ b c d a} n_{b c d} ) = n^b \nabla _b \xi ^a - \xi ^b \nabla _b n^a - n^a \left( \nabla _b \xi ^b - \frac{1}{2} g_{b c} \delta g^{b c}\right) \nonumber \\ &= - \mathcal {L}_\xi n^a - n^a \left( \nabla _b \xi ^b - \frac{1}{2} g_{b c} \delta g^{b c}\right) , \end{aligned}$$
(6.21)

or

$$\begin{aligned} \varDelta n^a = - n^a \left( \nabla _b \xi ^b + { 1 \over 2} g^{bd} \delta g_{bd} \right) = - {1 \over 2} n^a \left( g^{bd} \varDelta g_{bd}\right) , \end{aligned}$$
(6.22)

where

$$\begin{aligned} \varDelta g_{ab} = \delta g_{a b} + 2 \nabla _{(a} \xi _{b)} , \end{aligned}$$
(6.23)

(the parentheses indicate symmetrization, as usual). Equation (6.22) has a natural interpretation: The variation of a fluid worldline with respect to its own Lagrangian displacement has to be along the worldline and can only measure the changes of the volume of its own fluid element. This is one of the advantages of the Lagrangian variation approach.


Expressing the variations of the matter Lagrangian in terms of the displacement \(\xi ^a\), rather than the perturbed flux, we ensure that the flux conservation is accounted for in the equations of motion. The variation of \(\varLambda \) now leads to

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) &= \sqrt{- g} \left\{ f_a \xi ^a - \frac{1}{2}\left[ \left( \varLambda - n^c\mu _c\right) g_{a b} + n_a \mu _b \right] \delta g^{a b} \right\} \nonumber \\ &\quad + \nabla _a \left( \frac{1}{2} \sqrt{- g} \mu ^{abc} n_{bcd} \xi ^d\right) , \end{aligned}$$
(6.27)

and the fluid equations of motion are given by

$$\begin{aligned} f_b \equiv 2 n^a \nabla _{[a} \mu _{b]} = 0 , \end{aligned}$$
(6.28)

(where the square brackets indicate anti-symmetrization, as usual). Finally, introducing the vorticity two-form

$$\begin{aligned} \omega _{ab} = 2\nabla _{[a} \mu _{b]} , \end{aligned}$$
(6.29)

we have the simple relation

$$\begin{aligned} n^a \omega _{ab} = 0 , \end{aligned}$$
(6.30)

which should be familiar (see Sect. 5.2).

We can also read off the stress-energy tensor from (6.27). We need (see Sect. 4)

$$\begin{aligned} T_{ab} = - {2 \over \sqrt{-g}} {\delta \left( \sqrt{-g}\varLambda \right) \over \delta g^{ab}} = \varLambda g_{ab} - 2 {\delta \varLambda \over \delta g^{ab}} . \end{aligned}$$
(6.31)

Finally, introducing the matter four-velocity, such that \(n^a=nu^a\) and \(\mu _a = \mu u_a\), where \(\mu \) is the chemical potential (as before), we see that the energy is

$$\begin{aligned} \varepsilon = u_a u_b T^{ab} = - \varLambda . \end{aligned}$$
(6.32)

Moreover, we identify the pressure from the thermodynamical relation:

$$\begin{aligned} p = -\varepsilon + n\mu = \varLambda - n^c\mu _c . \end{aligned}$$
(6.33)

This means that we have

$$\begin{aligned} T^{ab} = pg^{ab} + n^a \mu ^b = \varepsilon u^a u^b + p \perp ^{ab} , \end{aligned}$$
(6.34)

and it is straightforward to confirm that

$$\begin{aligned} \nabla _a T^{ab} = f^b + \mu ^b \nabla _a n^a = f^b = 0 , \end{aligned}$$
(6.35)

given (i) that \(\varLambda \) is a function only of \(n^a\) and \(g_{ab}\) (so that \(\nabla ^b p = - n_c \nabla ^b \mu ^c\)), (ii) the definition of the momentum \(\mu _a\), and (iii) that the flux conservation \(\nabla _a n^a = 0\) holds by construction.


6.2 Lagrangian perturbations

Later, we will consider linear dynamics of different systems—both at the local level and for macroscopic bodies like rotating stars. This inevitably draws on an understanding of perturbation theory, which (in turn) makes contact with the variational argument we have just completed. Given this, it is worth making a few additional remarks before we move on.

First of all, an unconstrained variation of \(\varLambda (n^2)\) is with respect to \(n^a\) and the metric \(g_{a b}\), and allows the four components of \(n^a\) to be varied independently. It takes the form

$$\begin{aligned} \delta \varLambda = \mu _a \delta n^a + \frac{1}{2} n^a \mu ^b \delta g_{a b} , \end{aligned}$$
(6.36)

where

$$\begin{aligned} \mu _a = \mathcal {B}n_a , \qquad \mathcal {B}\equiv - 2 \frac{\partial \varLambda }{\partial n^2}. \end{aligned}$$
(6.37)

The use of the letter \(\mathcal {B}\) is to remind us that this is a bulk fluid effect, which is present regardless of the number of fluids and constituents. The momentum covector \(\mu _a\) is (as we have seen) dynamically, and thermodynamically, conjugate to \(n^a\), and its magnitude is the chemical potential of the particles (recalling that \(\varLambda = - \varepsilon \)).

Next, by introducing the displacement \(\xi ^a\), effectively tracking the fluid elements, we have prepared the ground for a study of general Lagrangian perturbations (for example, those used in relativistic studies of neutron-star instabilities (Friedman 1978), see Sect. 7.4). In fact, given the results from the variational derivation it is straightforward to write down the perturbed fluid equations.

By introducing the decomposition \(n^a = n u^a\) we can show that the argument that led to (6.22) also providesFootnote 16

$$\begin{aligned} \delta n = - \nabla _a \left( n \xi ^a \right) - n \left( u_a u^b \nabla _b \xi ^a + \frac{1}{2} \perp ^{ab} \delta g_{a b}\right) , \end{aligned}$$
(6.38)

and

$$\begin{aligned} \delta u^a = \left( \delta ^a{}_b + u^a u_b \right) \left( u^c \nabla _c \xi ^b - \xi ^c \nabla _c u^b \right) + \frac{1}{2} u^a u^b u^c \delta g_{b c}. \end{aligned}$$
(6.39)

Similar arguments lead to

$$\begin{aligned} \varDelta u^a = \frac{1}{2} u^a u^b u^c \varDelta g_{b c}, \end{aligned}$$
(6.40)
$$\begin{aligned} \varDelta \epsilon _{a b c d} = \frac{1}{2} \epsilon _{a b c d} g^{e f} \varDelta g_{e f}, \end{aligned}$$
(6.41)
$$\begin{aligned} \varDelta n = - \frac{n}{2} \perp ^{ab} \varDelta g_{a b}. \end{aligned}$$
(6.42)

These results and their Newtonian analogues were used by Friedman and Schutz in establishing the so-called Chandrasekhar–Friedman–Schutz (CFS) instability (Chandrasekhar 1970; Friedman and Schutz 1978a, b) (see Sect. 7.4).

6.3 Working with the matter space

The derivation of the Euler equations (6.28) made “implicit” use of the matter space as a device to ensure the conservation of the particle flux. In many ways it makes sense to introduce the argument this way, but—as we will see when we consider elasticity—it can be useful to work more explicitly with the matter space quantities.

Let us first note that, as implied by Fig. 10, the \(X^A\) coordinates are comoving with their respective worldlines, meaning that they are independent of the proper time \(\tau \), say, that parameterizes each curve. This is easy to demonstrate. Introducing the four velocity associated with the world line through \(n^a = n u^a\), we have

$$\begin{aligned} n{dX^A \over d\tau } &= n {dx^a \over d\tau } \partial _a X^A = n^a \partial _a X^A = \mathcal{L}_n X^A \nonumber \\ &= - \frac{1}{3!} \epsilon ^{b c d a} \psi ^A_a \psi ^B_b \psi ^C_c \psi ^D_d N_{B C D} = 0. \end{aligned}$$
(6.43)

We see that the time part of the spacetime dependence of the \(X^A\) is somewhat ad hoc. If we take the flow of time \(t^a\) to provide the proper time of the worldlines (\(t^a\) is parallel to \(n^a\) and hence \(u^a\)), the \(X^A\) do not change. An apparent time dependence in spacetime means that \(t^a\) is such as to cut across fluid worldlines (\(t^a\) is not parallel to \(n^a\)), which of course have different values for the \(X^A\).

It is also worth noting the (closely related) fact that \(n_{abc}\) is a “fixed” tensor, in the sense that

$$\begin{aligned} u^a n_{abc} = n u^a u^d \epsilon _{dabc} = 0 , \end{aligned}$$
(6.44)

(i.e. the three-form is spatial) and

$$\begin{aligned} \mathcal {L}_u n_{abc} = 0 , \end{aligned}$$
(6.45)

(it does not change along the flow). The latter is equivalent to requiring that the three-form \(n_{abc}\) be closed; i.e.,

$$\begin{aligned} \nabla _{[a}n_{bcd]} = \partial _{[a}n_{bcd]} = 0, \end{aligned}$$
(6.46)

which, of course, holds by construction.

From a formal point of view, we have changed perspective by taking the (scalar fields) \(X^A\) to be the fundamental variables. The construction also provides matter space with a geometric structure. As a first example of this note that, if integrated over a volume in matter space, \(n_{ABC}\) provides a measure of the number of particles in that volume. To see this, simply introduce a matter space three form \(\epsilon _{ABC}\) such that

$$\begin{aligned} n_{ABC} = n\epsilon _{ABC} , \end{aligned}$$
(6.47)

and recall that such an object represents a volume. Since n is the number density, it follows immediately that \(n_{ABC}\) represents the number of particles in the volume. This object is directly linked to the spacetime version;

$$\begin{aligned} n_{abc} = n u^d \epsilon _{dabc} \equiv n \epsilon _{abc} \end{aligned}$$
(6.48)

where \(\epsilon _{abc}\) is associated with a right-handed tetrad moving along \(u^a\). It then follows immediately that

$$\begin{aligned} \epsilon _{abc} = \psi ^A_a \psi ^B_b \psi ^C_c \epsilon _{ABC} . \end{aligned}$$
(6.49)

Inspired by this, we may also introduce

$$\begin{aligned} g^{A B} = \psi ^A_a \psi ^B_b g^{a b} = \psi ^A_a \psi ^B_b \perp ^{a b} , \end{aligned}$$
(6.50)

representing the induced metric on matter space.

Equipped with these matter space quantities, it is fairly natural to ask: is it possible to express the Lagrangian \(\varLambda (n^2)\) in terms of matter space quantities? The answer will soon be relevant, so let us consider it now. It is straightforward to show that we may consider \(\varLambda \) to be a function of \(g^{A B}\) and \(n_{ABC}\):

$$\begin{aligned} n^2 &= - g_{ab} n^a n^b = {1\over 3!} n_{abc} n^{abc} \nonumber \\ &= {1\over 3!} \left( \psi ^A_a g^{ad} \psi ^D_d\right) \left( \psi ^B_b g^{be} \psi ^E_e\right) \left( \psi ^C_c g^{cf} \psi ^F_f\right) n_{ABC} n_{DEF} \nonumber \\ &= {1\over 3!} g^{AD} g^{BE} g^{CF} n_{ABC} n_{DEF} . \end{aligned}$$
(6.51)

It follows that, if we introduce

$$\begin{aligned} \gamma _{AB} = \left( \sqrt{\det \left( g_{GH}\right) } n\right) ^{2/3} g_{AB} , \end{aligned}$$
(6.52)

then (using Eq. (B.8) from Appendix 2)

$$\begin{aligned} n^2 = \frac{1}{3!} \gamma ^{AD} \gamma ^{BE} \gamma ^{CF} [ABC] [DEF] = \det \left( \gamma ^{AB}\right) . \end{aligned}$$
(6.53)

and

$$\begin{aligned} \varLambda (n^2)\quad \Leftrightarrow \quad \varLambda \left( \mathrm {det} \left( \gamma ^{AB}\right) \right) . \end{aligned}$$
(6.54)
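The following numerical sketch checks this chain of identities for a randomly chosen configuration map (taken orthogonal to a static observer so that the induced metric is positive definite) and a matter-space three-form proportional to the permutation symbol:

```python
import numpy as np
from itertools import permutations

# Numerical sketch (random configuration, purely illustrative): check the chain
# (6.51)-(6.53). The map psi^A_a is taken orthogonal to a static observer so
# the induced metric (6.50) is positive definite, and the matter-space
# three-form is assumed proportional to the permutation symbol.
rng = np.random.default_rng(0)
g_inv = np.diag([-1.0, 1.0, 1.0, 1.0])       # inverse flat metric g^ab

def parity(p):
    p = list(p); s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

eps3 = np.zeros((3, 3, 3))
for p in permutations(range(3)):
    eps3[p] = parity(p)

psi = np.zeros((3, 4))
psi[:, 1:] = rng.normal(size=(3, 3))         # psi^A_a with psi^A_t = 0
n_ABC = 1.7 * eps3                           # assumed matter-space three-form

n_abc = np.einsum('Aa,Bb,Cc,ABC->abc', psi, psi, psi, n_ABC)          # Eq. (6.10)
n_abc_up = np.einsum('ad,be,cf,def->abc', g_inv, g_inv, g_inv, n_abc)
n2 = np.einsum('abc,abc->', n_abc, n_abc_up) / 6.0                    # Eq. (6.7)

g_AB = np.einsum('Aa,Bb,ab->AB', psi, psi, g_inv)        # induced metric (6.50)
g_AB_dn = np.linalg.inv(g_AB)
gamma_AB = (np.sqrt(n2) * np.sqrt(np.linalg.det(g_AB_dn)))**(2.0 / 3.0) * g_AB  # (6.52)
print(np.isclose(np.linalg.det(gamma_AB), n2))                        # Eq. (6.53)
```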

Finally, it is worth noting that, alongside the number three-form we may introduce the analogous object for the momentum:

$$\begin{aligned} \mu ^{abc} = \epsilon ^{dabc} \mu _d , \quad \mu _a = {1\over 3!} \epsilon _{bcda} \mu ^{bcd} . \end{aligned}$$
(6.55)

This then leads to

$$\begin{aligned} n\mu = -n^a \mu _a = {1\over 3!} n_{abc}\mu ^{abc} = {1\over 3!} n_{ABC} \mu ^{ABC} , \end{aligned}$$
(6.56)

where

$$\begin{aligned} \mu ^{ABC} = \psi ^A_a\psi ^B_b\psi ^C_c \mu ^{abc} . \end{aligned}$$
(6.57)

6.4 A step towards field theory

The quantities we introduced in the previous section may seem somewhat abstract at this point, but their meaning will (hopefully) become clearer later. As a first exercise in working with them, let us ask what happens if we consider the matter space “fields” as the fundamental variables of the theory.

In general, we might take the Lagrangian to be \(\varLambda = \varLambda (X^A, \psi ^A_a, g^{ab})\) (as in, for example, Jezierski and Kijowski 2011). This leads to

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left\{ {\partial \varLambda \over \partial X^A} \delta X^A+ {\partial \varLambda \over \partial \psi ^A_a} \delta \psi ^A_a + \left[ {\partial \varLambda \over \partial g^{ab} }- {\varLambda \over 2} g_{ab}\right] \delta g^{ab} \right\} . \end{aligned}$$
(6.58)

If we introduce the Lagrangian displacement, as before, we already know that

$$\begin{aligned} \varDelta X^A = 0 , \end{aligned}$$
(6.59)

and

$$\begin{aligned} \varDelta \psi ^A_a = 0 \quad \Longrightarrow \quad \delta \psi ^A_a = - \xi ^c \nabla _c \psi ^A_a - \psi ^A_c \nabla _a \xi ^c = - \nabla _a \left( \xi ^c \psi ^A_c\right) , \end{aligned}$$
(6.60)

where we have used the fact that partial derivatives commute. It then follows that

$$\begin{aligned} {\partial \varLambda \over \partial X^A} \delta X^A+ {\partial \varLambda \over \partial \psi ^A_a} \delta \psi ^A_a = -\xi ^c \psi ^A_c \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right] , \end{aligned}$$
(6.61)

and we see that the Euler-Lagrange equations are

$$\begin{aligned} \psi ^A_c \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right] =0 . \end{aligned}$$
(6.62)

We also see that the stress-energy tensor is

$$\begin{aligned} T_{ab} = - {2 \over \sqrt{-g}} {\delta \left( \sqrt{-g} \varLambda \right) \over \delta g^{ab} } = \varLambda g_{ab} - 2{\partial \varLambda \over \partial g^{ab} } . \end{aligned}$$
(6.63)

It is easy to see that these results lead us back to (4.46).

In order to compare the Euler–Lagrange equations for the fields to the Euler equations (5.25), we need two intermediate results. First of all,

$$\begin{aligned} {\partial \varLambda \over \partial \psi ^A_a} = \mu _b {\partial n^b \over \partial \psi ^A_a} &= {1\over 3!} \mu _b \epsilon ^{cdeb} n_{CDE} {\partial \over \partial \psi ^A_a} \left( \psi ^C_c \psi ^D_d \psi ^E_e \right) \nonumber \\ &= {1\over 2} \mu _b \epsilon ^{adeb} \psi ^D_d \psi ^E_e n_{ADE} = - {1\over 2} \mu ^{ade} \psi ^D_d \psi ^E_e n_{ADE} \nonumber \\ &= - {1\over 2} \mu ^{ade} \delta _A^B \psi ^D_d \psi ^E_e n_{BDE} = - {1\over 2} \mu ^{ade} \left( \psi _A^b \psi ^B_b \right) \psi ^D_d \psi ^E_e n_{BDE} \nonumber \\ &= - {1\over 2} \psi _A^b\mu ^{ade} n_{bde} = \psi ^b_A\left[ \delta ^a_b \left( \mu _c n^c\right) - \mu _b n^a\right] . \end{aligned}$$
(6.64)

This is true because (i) the metric is held fixed in the partial derivative, and (ii) \(n_{ABC}\) depends only on the matter space coordinates \(X^A\). We then see that

$$\begin{aligned} \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} = \perp _b^c \left[ \delta ^a_c \left( \mu _d n^d\right) - \mu _c n^a\right] = - n\mu \perp _b^a , \end{aligned}$$
(6.65)

since \(n^a=nu^a\), \(\mu _a= \mu u_a\) and \(\perp ^c_b u_c=0\). Secondly, we need

$$\begin{aligned} \psi ^A_c {\partial \varLambda \over \partial X^A} = \nabla _c \varLambda - {\partial \varLambda \over \partial \psi ^A_b} \nabla _c \psi ^A_b , \end{aligned}$$
(6.66)

Making use of these results, we get

$$\begin{aligned} \psi ^A_b \left[ {\partial \varLambda \over \partial X^A} - \nabla _a \left( {\partial \varLambda \over \partial \psi ^A_a} \right) \right] &= \nabla _a \left[ \delta ^a_b \varLambda - \psi ^A_b {\partial \varLambda \over \partial \psi ^A_a} \right] = \nabla _a \left[ \delta ^a_b \varLambda + n\mu \perp ^a_b\right] \nonumber \\ &= \nabla _a \left[ \delta ^a_b (\varLambda -n^c \mu _c) + n^a \mu _b\right] = \nabla _aT^a_{\ b} = 0 . \end{aligned}$$
(6.67)

In essence, the two descriptions are consistent—as they had to be.

What we have outlined is a field-theory approach to the problem, based on the idea that the matter space variables can be viewed as fields in spacetime (Endlich et al. 2011). It is, of course, not a truly independent variational approach, and (as we have seen) the equations of motion one obtains need to be massaged into a more intuitive form. However, this does not mean that the argument is without merit. Looking at a problem from different perspectives tends to help understanding. In this particular instance, we may explore the connection between the symmetries of the problem and the matter space variables. By changing the focus from the familiar macroscopic fluid degrees of freedom to three scalar functions \(X^A\) it is easy to keep track of the expected Poincaré invariance. First of all, if we expect the system to be homogeneous and isotropic we have to require the fields to be invariant under internal translations and rotations. This means that

$$\begin{aligned} X^A \rightarrow X^A + a^A , \end{aligned}$$
(6.68)

for constant \(a^A\), and

$$\begin{aligned} X^A \rightarrow O^A_{\ B} X^B , \end{aligned}$$
(6.69)

where \(O^A_{\ B}\) is an SO(3) matrix (associated with rotation). These conditions do not restrict us to fluids, however, as they will also hold for isotropic solids. The final condition we need relates to invariance under volume-preserving diffeomorphisms, leading to

$$\begin{aligned} X^A \rightarrow \xi ^A(X^B) , \ \text{ with } \ \mathrm {det} {\partial \xi ^A \over \partial X^B} = 1 . \end{aligned}$$
(6.70)

In practice, this corresponds to the dynamics being invariant as the fluid elements move around without expansion or contraction.

What are the implications of these conditions? First of all, we need each of the \(X^A\) fields to be acted on by at least one derivative (although see Andersson et al. 2017a for a discussion on how this assumption can be relaxed for dissipative systems). This means that the Lagrangian cannot depend on \(X^A\) directly (as we assumed). Moreover, taking a field-theory view of the problem (see the discussion of the fluid-gravity correspondence in Sect. 16.4) we may focus on low momenta/low frequencies, for which the most relevant terms are those with the fewest derivatives. In effect, the lowest order Lagrangian will involve exactly one derivative acting on each \(X^A\). The focus then shifts to the map, \(\psi ^A_a\). As we expect to work with Lorentz scalars, it would be natural to assume that the Lagrangian must involve the contraction

$$\begin{aligned} g^{AB}= g^{ab} \psi ^A_a \psi ^B_b , \end{aligned}$$
(6.71)

from before (i.e., the induced metric on the matter space). Moreover, we have already seen that the symmetries require us to work with invariant functions of \(g^{AB}\) and the volume-preserving argument picks out the determinant as the key combination.

The connection with quantum field theory is explored by Endlich et al. (2011), with particularly interesting developments relating to symmetry breaking and the emergence of superfluidity (Dubovsky et al. 2006, 2012) and extensions to incorporate quantum anomalies in the field theory (Dubovsky et al. 2014). An example of the latter is the Wess–Zumino anomaly, which leads to terms that remain only after integration by parts. In effect, the action is invariant, but the Lagrangian is not. Somewhat simplistically, one may associate such terms with the surface terms we neglected in the variational argument. There has also been some effort to extend the approach to dissipative systems (Endlich et al. 2013).

7 Newtonian limit and Lagrangian perturbations

7.1 The Newtonian limit

Having written down the equations that govern a single (barotropic) relativistic fluid, it is natural to consider the connection between the final expressions and standard Newtonian fluid dynamics. In order to make this connection, we need to establish how one arrives at the Newtonian limit of the relativistic equations. It is useful to work this out because—even though the framework we are developing is intended to describe relativistic systems—modelling often draws on intuition gained from good old Newtonian physics. This is especially the case when one considers “new” applications. Useful qualitative understanding can often be obtained from a Newtonian analysis, but we need relativistic models for precision and in order to explore unique aspects, like rotational frame-dragging and gravitational radiation.

There has been much progress on the analysis of Newtonian multifluid systems. Prix (2004) has developed an action-based formalism, analogous to the model we consider here (based on the notion of time-shifts, closely related to the Lagrangian variations in spacetime). Carter and Chamel (2004, 2005a, 2005b) have done the same, except that they use a fully spacetime covariant formalism (taking the work of Milne and Cartan as starting points), taking full account of the fact that the Newtonian limit is singular. Our aim here is less ambitious. We simply want to demonstrate how the Newtonian fluid equations can be extracted as the non-relativistic limit of the relativistic model.

We take as the starting point the leading-order line element in the weak-field limit:

$$\begin{aligned} {d} s^2 = - c^2 d\tau ^2 = - c^2 \left( 1 + \frac{2 \varPhi }{c^2}\right) {d} t^2 + \eta _{i j} { d} x^i { d} x^j , \end{aligned}$$
(7.1)

where \(x^i\ (i=1,2,3)\) are Cartesian coordinates, \(\eta _{i j}\) is the flat three-dimensional metric and \(\varPhi \) is the gravitational potential. The Newtonian limit then follows by writing the equations to leading order in an expansion in inverse powers of the speed of light c. Formally, the Newtonian results are obtained in the limit where \(c \rightarrow \infty \).

Let us apply this strategy to the equations of fluid dynamics. With \(\tau \) the proper time measured along a fluid element’s worldline, the curve it traces out can be written

$$\begin{aligned} x^{a}(\tau ) = \{c t(\tau ),x^i(\tau )\} . \end{aligned}$$
(7.2)

In order to work out the four-velocity,

$$\begin{aligned} u^{a} = \frac{{ d} x^a}{{ d} \tau } , \end{aligned}$$
(7.3)

we note that (7.1) leads to

$$\begin{aligned} { d} \tau ^2 = \left( 1 + \frac{2 \varPhi }{c^2} - \frac{\eta _{ij} v^i v^j}{c^2}\right) {d} t^2 , \end{aligned}$$
(7.4)

with \(v^i = { d}x^i/{d}t\) the Newtonian three-velocity of the fluid. Since the velocity is assumed to be small, in the sense that

$$\begin{aligned} {\left| v^i\right| \over c} \ll 1 , \end{aligned}$$
(7.5)

this leads to

$$\begin{aligned} {dt\over d\tau } \approx 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} , \end{aligned}$$
(7.6)

where \(v^2 = \eta _{i j} v^i v^j\), and

$$\begin{aligned} u^0 = {dx^0 \over d\tau } = c {dt \over d\tau } \approx c\left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) . \end{aligned}$$
(7.7)

It is also easy to see that

$$\begin{aligned} u^i = {d x^i \over d\tau } = v^i {dt \over d\tau } \approx v^i . \end{aligned}$$
(7.8)

In order to obtain the covariant components, we use the metric (which is manifestly diagonal). Thus, we find that

$$\begin{aligned} u_0 = g_{00}u^0 = - c \left( 1 + \frac{2 \varPhi }{c^2}\right) \left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) \approx -c \left( 1+ {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) , \end{aligned}$$
(7.9)

and

$$\begin{aligned} u_i = v_i . \end{aligned}$$
(7.10)

Note that these relations lead to

$$\begin{aligned} u^a u_a = -c^2 \left( 1- {\varPhi \over c^2} + {v^2 \over 2 c^2} \right) \left( 1+ {\varPhi \over c^2} + {v^2 \over 2 c^2}\right) + v^2 \approx - c^2 , \end{aligned}$$
(7.11)

as expected.
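
As a quick consistency check, the expansion leading to (7.11) can be verified symbolically. The following minimal sketch (our own illustration, assuming the Python sympy library is available) inserts the approximate expressions (7.7) and (7.10) and confirms that the correction to \(-c^2\) vanishes as \(c\rightarrow \infty \).

```python
import sympy as sp

c, Phi, v = sp.symbols('c Phi v', positive=True)

u0 = c*(1 - Phi/c**2 + v**2/(2*c**2))   # Eq. (7.7)
g00 = -(1 + 2*Phi/c**2)                 # weak-field metric, Eq. (7.1)

# u^a u_a = g_00 (u^0)^2 + u^i u_i, with u^i ~ v^i and u_i ~ v_i
norm = g00*u0**2 + v**2

# the remainder norm + c^2 is O(1/c^2) and vanishes in the Newtonian limit
print(sp.limit(norm + c**2, c, sp.oo))  # -> 0
```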

We can now work out the Newtonian limit for the conserved particle flux

$$\begin{aligned} \nabla _a ( n u^a)= & {} 0 \quad \Longrightarrow \quad {1\over c} \partial _t \left( n u^0 \right) + \nabla _i \left( nv^i \right) = 0 \nonumber \\&\Longrightarrow&\quad \partial _t n + \nabla _i \left( nv^i \right) = \mathcal {O}\left( c^{-1}\right) \end{aligned}$$
(7.12)

To leading order we obtain the expected result

$$\begin{aligned} \partial _t n + \nabla _i \left( nv^i \right) = 0 , \end{aligned}$$
(7.13)

recovering the usual continuity equation by introducing the mass density \(\rho = mn\), with m the mass per particle.

In order to work out the corresponding limit of the Euler equations, we need the contributions from the connection (the Christoffel symbols) to the covariant derivative. However, from the definition (3.35) and the weak-field metric, we see that only \(g_{00}\) gives a non-vanishing contribution. Moreover, it is clear that

$$\begin{aligned} \varGamma ^a_{b c} = \mathcal {O}(1/c^2) , \end{aligned}$$
(7.14)

which is why we did not need to worry about this in the case of the flux conservation. The connection terms contribute at higher orders.

Explicitly, we have

$$\begin{aligned} u^a \nabla _a u^b = u^a \partial _a u^b + \varGamma ^b_{ca} u^a u^c = {1\over c} u^0 \partial _t u^b + u^i \partial _i u^b + \varGamma ^b_{ca} u^a u^c . \end{aligned}$$
(7.15)

We only need the spatial components, so we set \(b=j\) to get

$$\begin{aligned} u^a \nabla _a u^j= & {} {1\over c} u^0 \partial _t u^j + u^i \partial _i u^j + \varGamma ^j_{ca} u^a u^c \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + c^2 \varGamma ^j_{00} + \text{ higher } \text{ order } \text{ terms } \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + {c^2 \over 2} \eta ^{jk} \partial _k \left( {2\varPhi \over c^2} \right) \nonumber \\= & {} \partial _t v^j + v^i \partial _i v^j + \eta ^{jk} \partial _k \varPhi . \end{aligned}$$
(7.16)
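
Readers who want to verify the order counting explicitly can compute the Christoffel symbols for the metric (7.1) symbolically. The sketch below (again assuming sympy; the coordinate labels and the helper function are our own notation) reproduces \(\varGamma ^j_{00} = \eta ^{jk}\partial _k \varPhi /c^2\), so that \(c^2 \varGamma ^j_{00} = \eta ^{jk}\partial _k \varPhi \) as used above.

```python
import sympy as sp

c = sp.symbols('c', positive=True)
x0, x1, x2, x3 = sp.symbols('x0 x1 x2 x3')
X = [x0, x1, x2, x3]
Phi = sp.Function('Phi')(x1, x2, x3)      # gravitational potential

# weak-field metric of Eq. (7.1), with x^0 = c t
g = sp.diag(-(1 + 2*Phi/c**2), 1, 1, 1)
ginv = g.inv()

def Gamma(a, b, d):
    """Christoffel symbol Gamma^a_{bd} for the metric g."""
    return sp.Rational(1, 2)*sum(
        ginv[a, e]*(sp.diff(g[e, d], X[b]) + sp.diff(g[e, b], X[d]) - sp.diff(g[b, d], X[e]))
        for e in range(4))

print(sp.simplify(Gamma(1, 0, 0)))        # -> dPhi/dx1 divided by c**2, i.e. O(1/c^2), cf. Eq. (7.14)
print(sp.simplify(c**2*Gamma(1, 0, 0)))   # -> dPhi/dx1, the term appearing in Eq. (7.16)
```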

Finally, we need the pressure contribution. For this we note that the projection becomes

$$\begin{aligned} \perp ^{ab} = g^{ab} + {1\over c^2} u^a u^b , \end{aligned}$$
(7.17)

in order to be dimensionally consistent. We also use the fact that \( \varepsilon \gg p \) in this limit (the energy density is dominated by the rest-mass contribution). This means that we have

$$\begin{aligned} \perp ^{ba} \nabla _a p \quad \Longrightarrow \quad \eta ^{jk} \partial _k p , \end{aligned}$$
(7.18)

and we (finally) arrive at the Euler equations

$$\begin{aligned} \partial _t v^j + v^i \partial _i v^j = - \eta ^{jk} \left( { 1\over \rho } \partial _k p + \partial _k \varPhi \right) , \end{aligned}$$
(7.19)

which represent momentum conservation.

7.2 Local dynamics

In principle, the fluid equations (from Sect. 5.2 or above) completely specify the problem for a single-component barotropic flow (once an equation of state has been provided, of course). In general, the problem is nonlinear and difficult to solve analytically. Once we couple the fluid motion to the dynamical spacetime through the Einstein equations, it becomes exceedingly so. However, if we want to understand the behaviour of a given system we can make progress using linearized theory. This approach is suitable whenever the dynamics deviates only slightly from a known background/equilibrium state. The deviations should be small enough that we can neglect nonlinearities. This is a very common strategy, used, for example, to study the oscillations of neutron stars. Moreover, it is a good strategy if we want to explore the local dynamics of a given system.

Consider the case where the length and time scales of the deviations are such that the spacetime curvature can be ignored; then, we can work in the local inertial frame associated with the flow—i.e. use Minkowski coordinates \(x^a = [t,x^i]\) and take the metric to be flat. Letting \(\tau \) be the proper time associated with a given fluid worldline, we see from Eqs. (7.2) and (7.3) and the normalization of the four-velocity \(u^a\) (i.e. \(u^a u_a = -1\)) that—in the local inertial frame—the particle flux density takes the form

$$\begin{aligned} n^a = n u^a = n \left( 1 - v^2\right) ^{- 1/2} [1,v^i] , \end{aligned}$$
(7.20)

where \(v^i = dx^i/dt\) is the local three-velocity and \(v^2 = \eta _{i j} v^i v^j\). In the linearized case, the three-velocity \(v^i\) is small and is itself a first-order deviation. The background four-velocity is thus uniform, taking the form \(u^a = [1,0,0,0]\), and it is obviously the case that \(\nabla _b u^a = 0\). As long as the associated scales of the deviations are sufficiently small, we should be able to take the background particle number density n to be uniform both temporally and spatially so that \(\nabla _a n = 0\). Therefore, it is easy to see that the background/equilibrium state trivially satisfies the dynamical equations.

Now consider (Eulerian) variations, such that \(n \rightarrow n + \delta n\) and \(v^i \rightarrow \delta v^i\) and let the deviations be expressed as plane waves (making use of a Fourier decomposition). The normalization of the four-velocity \(u^a\) demands that the perturbed velocity is spatial (\(u^a \delta u_a = 0\)), which is consistent with the linearization of Eq. (7.20):

$$\begin{aligned} \delta n^a = [\delta n ,n \delta v^i] . \end{aligned}$$
(7.21)

A standard sound speed derivation, however, takes the point of view that the energy density and four-velocity are the fundamental variables. For now, we adopt this approach in order to make contact with the well-known results.

From Eq. (5.12), we see that a perturbation in n leads to a perturbation in \(\rho \) (recall \(\varepsilon \approx \rho = mn\) in the weak-field limit); namely,

$$\begin{aligned} \delta \rho = \mu \delta n . \end{aligned}$$
(7.22)

Likewise, Eq. (5.13) shows that there are corresponding perturbations in the pressure and chemical potential. With that in mind, we linearize Eqs. (5.15) and (5.17), and find that the perturbation problem becomes

$$\begin{aligned} \partial _t \delta \rho + \left( p + \rho \right) \nabla _i \delta v^i = 0 , \end{aligned}$$
(7.23)

and

$$\begin{aligned} \left( p + \rho \right) \partial _t \delta v_i + \nabla _i \delta p = 0 . \end{aligned}$$
(7.24)

To close the system, we introduce a barotropic equation of state:

$$\begin{aligned} p = p(\rho ) \quad \longrightarrow \quad \delta p = \left( {dp \over d\rho } \right) \delta \rho \equiv C_s^2 \delta \rho . \end{aligned}$$
(7.25)

The plane-wave Ansatz means that we have

$$\begin{aligned} \delta p= & {} A_{p} e^{i k (- \sigma t + \hat{k}_j x^j)} \end{aligned}$$
(7.26)
$$\begin{aligned} \delta \rho= & {} A_{\rho } e^{i k (- \sigma t + \hat{k}_j x^j)} \end{aligned}$$
(7.27)

and

$$\begin{aligned} \delta v^i = A^i_v e^{i k (- \sigma t + \hat{k}_j x^j)} . \end{aligned}$$
(7.28)

In these expressions, the constant \(\sigma \) is the wave-speed, the constant \(k_i\) is the (spatial) wave-vector, such that \(k^2 = k_i k^i\) (\(k^i = g^{i j} k_j\)) and \(\hat{k}_i = k_i/k\). We see from Eq. (7.25) that the pressure amplitude \(A_p\) must satisfy (assuming that the perturbations are described by the same equation of state as the background)

$$\begin{aligned} A_p = C_s^2 A_{\rho } . \end{aligned}$$
(7.29)

Inserting the plane-wave decompositions for \(\delta \rho \) and \(\delta v^i\) into (7.23) and (7.24) we find

$$\begin{aligned} \sigma A_{\rho } + (p + \rho ) \hat{k}_i A^i_v = 0 \end{aligned}$$
(7.30)

and

$$\begin{aligned} (p + \rho ) \sigma A^i_v + C^2_s A _{\rho } \hat{k}^i = 0 . \end{aligned}$$
(7.31)

It is easy to see that we cannot have non-trivial transverse waves: if \(\hat{k} _iA^i_v = 0\) then Eq. (7.30) requires \(A_{\rho } = 0\), and Eq. (7.31) in turn forces \(A^i_v = 0\) (for \(\sigma \ne 0\)). Focussing on the longitudinal case, we can contract the second equation with \(\hat{k}_i\) to obtain a scalar equation. Combining this with the first equation, we obtain the dispersion relation

$$\begin{aligned} \sigma ^2 - C_s^2 = 0 \quad \Longrightarrow \quad \sigma = \pm C_s . \end{aligned}$$
(7.32)

In this simple situation it is obvious that we should identify \(C_s\) as the speed of sound.
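
The algebra behind (7.32) amounts to a vanishing-determinant condition for the linear system formed by (7.30) and the contracted version of (7.31). A minimal symbolic sketch (assuming sympy; the amplitude and symbol names are ours) makes this explicit:

```python
import sympy as sp

sigma = sp.symbols('sigma')
p, rho, Cs2 = sp.symbols('p rho Cs2', positive=True)

# Longitudinal part of Eqs. (7.30) and (7.31): unknowns (A_rho, k_hat_i A_v^i)
M = sp.Matrix([[sigma,        p + rho],
               [Cs2,  (p + rho)*sigma]])

# A non-trivial solution requires det M = 0, i.e. (p + rho)(sigma**2 - Cs2) = 0
print(sp.factor(M.det()))
print(sp.solve(M.det(), sigma))   # -> [-sqrt(Cs2), sqrt(Cs2)], Eq. (7.32)
```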

It is worth noting that we can go back to the case where the particle flux \(n^a\) is taken to be fundamental and the equation of state has the form \(\rho = \rho (n)\). If we do that, then we have

$$\begin{aligned} d\rho = \mu dn \qquad \text{ and } \qquad dp = n d\mu \end{aligned}$$
(7.33)

and it follows that the speed of sound is given by

$$\begin{aligned} C_s^2 = {dp \over d \rho } = {n \over \mu } {d \mu \over dn} . \end{aligned}$$
(7.34)
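
As a concrete illustration, consider a toy “rest mass plus polytrope” equation of state, \(\rho (n) = mn + Kn^\gamma /(\gamma -1)\) (our choice, purely for illustration). The short sketch below (assuming sympy) confirms that \(dp/d\rho \) and \((n/\mu )\, d\mu /dn\) agree, as claimed in Eq. (7.34), and that the associated pressure is \(p = Kn^\gamma \).

```python
import sympy as sp

n, m, K, gamma = sp.symbols('n m K gamma', positive=True)

# Toy equation of state (an assumption, for illustration only)
rho = m*n + K*n**gamma/(gamma - 1)     # energy density rho(n)
mu  = sp.diff(rho, n)                  # chemical potential, from d rho = mu dn
p   = n*mu - rho                       # pressure, p = n mu - rho (so that dp = n dmu)

Cs2_a = sp.diff(p, n)/sp.diff(rho, n)  # dp/drho
Cs2_b = (n/mu)*sp.diff(mu, n)          # (n/mu) dmu/dn, Eq. (7.34)

print(sp.simplify(p))                  # -> K*n**gamma
print(sp.simplify(Cs2_a - Cs2_b))      # -> 0
```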

7.3 Newtonian fluid perturbations

Studies of the stability properties of rotating self-gravitating bodies are of obvious relevance to astrophysics. By improving our understanding of the relevant issues we can hope to shed light on the nature of the various dynamical and secular instabilities that may govern the spin-evolution of rotating stars. Such knowledge is particularly important for neutron star astrophysics, since instabilities may lead to detectable gravitational-wave signals. In this section we will outline the Lagrangian perturbation framework developed by Friedman and Schutz (1978a, 1978b) for rotating non-relativistic stars, leading to criteria that can be used to decide when the oscillations of a rotating neutron star are unstable. We also provide an explicit example proving the instability of the so-called r-modes at all rotation rates in a perfect fluid star.

Following Friedman and Schutz (1978a, 1978b), we work with Lagrangian variations. We have already seen that the Lagrangian perturbation \(\varDelta Q\) of a quantity Q is related to the Eulerian variation \(\delta Q\) by

$$\begin{aligned} \varDelta Q = \delta Q + \mathcal {L}_\xi Q, \end{aligned}$$
(7.35)

where (as before) \(\mathcal {L}_\xi \) is the Lie derivative (introduced in Sect. 3). The Lagrangian change in the fluid velocity now follows from the Newtonian limit of Eq. (6.39):

$$\begin{aligned} \varDelta v^i = \partial _t \xi ^i, \end{aligned}$$
(7.36)

where \(\xi ^i\) is the Lagrangian displacement. Given this, and

$$\begin{aligned} \varDelta g_{ij} = \nabla _i \xi _j + \nabla _j \xi _i, \end{aligned}$$
(7.37)

where \(g_{ij}\) is the flat three-dimensional metric, we have

$$\begin{aligned} \varDelta v_i = \partial _t \xi _i + v^j\nabla _i \xi _j + v^j \nabla _j \xi _i. \end{aligned}$$
(7.38)

Let us consider the simplest case, namely a barotropic ordinary fluid for which \(\varepsilon =\varepsilon (n)\). Then we want to perturb the continuity and Euler equations. The conservation of mass for the perturbations follows immediately from the Newtonian limits of Eqs. (6.38) and (6.40) (which as we recall automatically satisfy the continuity equation):

$$\begin{aligned} \varDelta n = - n \nabla _i \xi ^i, \qquad \delta n = - \nabla _i (n \xi ^i). \end{aligned}$$
(7.39)

Consequently, the perturbed gravitational potential follows from

$$\begin{aligned} \nabla ^2 \delta \varPhi = 4\pi G \delta \rho = 4 \pi G m \, \delta n = - 4\pi G m \nabla _i(n \xi ^i). \end{aligned}$$
(7.40)
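
It is easy to check that the two relations in (7.39) are consistent with the definition (7.35). A one-dimensional toy calculation (a sketch assuming sympy; the function names are ours) illustrates the logic:

```python
import sympy as sp

x = sp.symbols('x')
n = sp.Function('n')(x)       # background number density
xi = sp.Function('xi')(x)     # Lagrangian displacement (one-dimensional toy setting)

Delta_n = -n*sp.diff(xi, x)                  # Delta n = -n d_i xi^i, Eq. (7.39)
delta_n = Delta_n - xi*sp.diff(n, x)         # delta n = Delta n - L_xi n, Eq. (7.35)

# Compare with the Eulerian form delta n = -d_i (n xi^i)
print(sp.simplify(delta_n + sp.diff(n*xi, x)))   # -> 0
```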

In order to perturb the Euler equations we first rewrite Eq. (7.19) as

$$\begin{aligned} (\partial _t +\mathcal {L}_v) v_i + \nabla _i \left( \tilde{\mu } + \varPhi - \frac{1}{2} v^2 \right) = 0, \end{aligned}$$
(7.41)

where \(\tilde{\mu }= \mu /m\). This form is particularly useful since the Lagrangian variation commutes with the operator \(\partial _t + \mathcal {L}_v\). Perturbing Eq. (7.41) we thus have

$$\begin{aligned} (\partial _t +\mathcal {L}_v) \varDelta v_i + \nabla _i \left( \varDelta \tilde{\mu } + \varDelta \varPhi - \frac{1}{2} \varDelta ( v^2) \right) = 0. \end{aligned}$$
(7.42)

We want to rewrite this equation in terms of the displacement vector \(\xi \). After some algebra we arrive at

$$\begin{aligned}&\partial _t^2 \xi _i + 2 v^j \nabla _j \partial _t \xi _i + (v^j \nabla _j)^2 \xi _i + \nabla _i \delta \varPhi + \xi ^j \nabla _i \nabla _j \varPhi \nonumber \\&- (\nabla _i \xi ^j) \nabla _j \tilde{\mu } + \nabla _i \varDelta \tilde{\mu } = 0. \end{aligned}$$
(7.43)

Finally, we need

$$\begin{aligned} \varDelta \tilde{\mu } = \delta \tilde{\mu } + \xi ^i\nabla _i \tilde{\mu } = \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \delta n + \xi ^i\nabla _i \tilde{\mu } = - \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \nabla _i (n \xi ^i) + \xi ^i\nabla _i \tilde{\mu }. \end{aligned}$$
(7.44)

Given this, we have arrived at the following form for the perturbed Euler equation:

$$\begin{aligned}&\partial _t^2 \xi _i + 2 v^j \nabla _j \partial _t \xi _i + (v^j \nabla _j)^2 \xi _i + \nabla _i \delta \varPhi + \xi ^j \nabla _i \nabla _j \left( \varPhi + \tilde{\mu } \right) \nonumber \\&- \nabla _i \left[ \left( \frac{\partial \tilde{\mu }}{\partial n} \right) \nabla _j (n \xi ^j) \right] = 0. \end{aligned}$$
(7.45)

This equation should be compared to Eq. (15) of Friedman and Schutz (1978a).

7.4 The CFS instability

Having derived the perturbed Euler equations, we are interested in constructing conserved quantities that can be used to assess the stability of the system. To do this, we first multiply Eq. (7.45) by the number density n, and then write the result (schematically) as

$$\begin{aligned} A \partial _t^2 \xi + B \partial _t \xi + C \xi = 0, \end{aligned}$$
(7.46)

omitting the indices since there is little risk of confusion. Defining the inner product

$$\begin{aligned} \left\langle \eta ^i,\xi _i \right\rangle = \int \eta ^{i*} \xi _i \, \mathrm {d} V, \end{aligned}$$
(7.47)

where \(\eta \) and \(\xi \) both solve the perturbed Euler equation, and the asterisk denotes complex conjugation (and we integrate over the volume of the body, V), one can now show that

$$\begin{aligned} \left\langle \eta , A\xi \right\rangle = \left\langle \xi ,A\eta \right\rangle ^* \qquad \mathrm {and} \qquad \left\langle \eta ,B\xi \right\rangle = - \left\langle \xi ,B\eta \right\rangle ^*. \end{aligned}$$
(7.48)

The latter requires the background relation \(\nabla _i (n v^i) = 0\), and holds as long as \(n \rightarrow 0\) at the surface of the star. A slightly more involved calculation leads to

$$\begin{aligned} \left\langle \eta , C\xi \right\rangle = \left\langle \xi , C\eta \right\rangle ^*. \end{aligned}$$
(7.49)

Inspired by the fact that the momentum conjugate to \(\xi ^i\) is \(\rho (\partial _t + v^j \nabla _j)\xi _i\), we now consider the symplectic structure

$$\begin{aligned} W(\eta ,\xi ) = \left\langle \eta , A\partial _t \xi + \frac{1}{2} B \xi \right\rangle - \left\langle A\partial _t \eta + \frac{1}{2} B \eta , \xi \right\rangle . \end{aligned}$$
(7.50)

It is straightforward to show that \(W(\eta ,\xi )\) is conserved, i.e., \(\partial _t W = 0\). This leads us to define the canonical energy of the system as (with m the baryon mass, not to be confused with the angular multipole m later)

$$\begin{aligned} E_\mathrm {c} = \frac{m}{2} W (\partial _t \xi ,\xi ) = \frac{m}{2} \left\{ \left\langle \partial _t \xi , A \partial _t \xi \right\rangle + \left\langle \xi , C \xi \right\rangle \right\} . \end{aligned}$$
(7.51)

After some manipulations, we arrive at the explicit expression:

$$\begin{aligned} E_\mathrm {c}= & {} \frac{1}{2} \int \left\{ \rho |\partial _t \xi |^2 - \rho | v^j \nabla _j \xi _i|^2 + \rho \xi ^i \xi ^{j*}\nabla _i \nabla _j (\tilde{\mu } + \varPhi ) \right. \nonumber \\&\left. + \left( \frac{\partial \mu }{\partial n} \right) |\delta n|^2 - \frac{1}{4 \pi G} |\nabla _i \delta \varPhi |^2 \right\} \mathrm {d} V , \end{aligned}$$
(7.52)

which can be compared to Eq. (45) of Friedman and Schutz (1978a). In the case of an axisymmetric system, e.g., a rotating star, we can also define a canonical angular momentum as

$$\begin{aligned} J_\mathrm {c} = - \frac{m}{2} W (\partial _\varphi \xi , \xi ) = - \mathrm {Re} \left\langle \partial _\varphi \xi , A\partial _t \xi + \frac{1}{2} B\xi \right\rangle . \end{aligned}$$
(7.53)

The proof that this quantity is conserved relies on the fact that (i) \(W(\eta , \xi )\) is conserved for any two solutions to the perturbed Euler equations, and (ii) \(\partial _\varphi \) commutes with \(\rho v^j \nabla _j\) in axisymmetry, which means that if \(\xi \) solves the Euler equations then so does \(\partial _\varphi \xi \).

As discussed in Friedman and Schutz (1978a, 1978b), the stability analysis is complicated by the presence of so-called “trivial” displacements. These trivials can be thought of as representing a relabeling of the physical fluid elements. A trivial displacement \(\zeta ^i\) leaves the physical quantities unchanged, i.e., is such that \(\delta n = \delta v^i = 0\). This means that we must have

$$\begin{aligned} \nabla _i (\rho \zeta ^i)= & {} 0, \end{aligned}$$
(7.54)
$$\begin{aligned} \left( \partial _t + \mathcal {L}_v \right) \zeta ^i= & {} 0. \end{aligned}$$
(7.55)

The solution to the first of these equations can be written

$$\begin{aligned} \rho \zeta ^i = \epsilon ^{ijk} \nabla _j \chi _k , \end{aligned}$$
(7.56)

where, in order to satisfy the second equation, the vector \(\chi _k\) must have time-dependence such that

$$\begin{aligned} ( \partial _t + \mathcal {L}_v) \chi _k = 0. \end{aligned}$$
(7.57)

This means that the trivial displacement will remain constant along the background fluid trajectories. Or, as Friedman and Schutz (1978a) put it, the “initial relabeling is carried along with the unperturbed motion”.

The trivials cause trouble because they affect the canonical energy. Before one can use the canonical energy to assess the stability of a rotating configuration one must deal with this “gauge problem”. To do this one should ensure that the displacement vector \(\xi ^i\) is orthogonal to all trivials. A prescription for this is provided by Friedman and Schutz (1978a). In particular, they show that the required canonical perturbations preserve the vorticity of the individual fluid elements. Most importantly, one can also prove that a normal mode solution is orthogonal to the trivials. Thus, mode solutions can serve as canonical initial data, and be used to assess stability.

The importance of the canonical energy stems from the fact that it can be used to test the stability of the system. In particular:

  • Dynamical instabilities are only possible for motions such that \(E_\mathrm {c}=0\). This makes intuitive sense since the amplitude of a mode for which \(E_\mathrm {c}\) vanishes can grow without bound and still obey the conservation laws.

  • If the system is coupled to radiation (e.g., gravitational waves) which carries positive energy away from the system (which should be taken to mean that \(\partial _t E_\mathrm {c} < 0\)) then any initial data for which \(E_\mathrm {c}<0\) will lead to an unstable evolution.

Consider a real-frequency normal-mode solution to the perturbation equations, a solution of the form \(\xi = \hat{\xi } e^{i(\omega t+m\varphi )}\). One can readily show that the associated canonical energy becomes

$$\begin{aligned} E_\mathrm {c} = \omega \left[ \omega \left\langle {\xi }, A {\xi }\right\rangle - \frac{i}{2} \left\langle {\xi }, B{\xi }\right\rangle \right] , \end{aligned}$$
(7.58)

where the expression in the bracket is real. Similarly, for the canonical angular momentum, we get

$$\begin{aligned} J_\mathrm {c} = -m \left[ \omega \left\langle {\xi }, A {\xi } \right\rangle - \frac{i}{2} \left\langle {\xi }, B{\xi } \right\rangle \right] . \end{aligned}$$
(7.59)

Combining Eqs. (7.58) and (7.59) we see that, for real frequency modes, we have

$$\begin{aligned} E_\mathrm {c} = - \frac{\omega }{m} J_\mathrm {c} = \sigma _\mathrm {p} J_\mathrm {c}, \end{aligned}$$
(7.60)

where \(\sigma _\mathrm {p}\) is the pattern speed of the mode.

Now note that Eq. (7.59) can be rewritten as

$$\begin{aligned} \frac{J_\mathrm {c}}{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle } = - m\omega + m \frac{\left\langle {\xi }, i \rho v^j \nabla _j {\xi } \right\rangle }{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle }. \end{aligned}$$
(7.61)

Using cylindrical coordinates, and \(v^j = \varOmega \varphi ^j \), one can show that

$$\begin{aligned} - i \rho {{\xi }}_i^* v^j \nabla _j {\xi }^i = \rho \varOmega \left[ m \left| \hat{\xi } \right| ^2 + i ({\hat{\xi }}^* \times \hat{\xi })_z \right] . \end{aligned}$$
(7.62)

But

$$\begin{aligned} \left| ({\hat{\xi }}^* \times \hat{\xi })_z \right| \le \left| \hat{\xi } \right| ^2 \end{aligned}$$
(7.63)

and hence we must have (for uniform rotation)

$$\begin{aligned} \sigma _\mathrm {p} - \varOmega \left( 1 + \frac{1}{m} \right) \le \frac{J_\mathrm {c}/m^2}{\left\langle \hat{\xi }, \rho \hat{\xi } \right\rangle } \le \sigma _\mathrm {p} - \varOmega \left( 1 - \frac{1}{m} \right) . \end{aligned}$$
(7.64)

Equation (7.64) forms a key part of the proof that rotating perfect fluid stars are generically unstable in the presence of radiation (Friedman and Schutz 1978b). The argument goes as follows: Consider modes with finite frequency in the \(\varOmega \rightarrow 0\) limit. Then Eq. (7.64) implies that co-rotating modes (with \(\sigma _\mathrm {p}>0\)) must have \(J_\mathrm {c}>0\), while counter-rotating modes (for which \(\sigma _\mathrm {p} < 0\)) will have \(J_\mathrm {c}<0\). In both cases \(E_\mathrm {c}>0\), which means that both classes of modes are stable. Now consider a small region near a point where \(\sigma _\mathrm {p}=0\) (at a finite rotation rate). Typically, this corresponds to a point where the initially counter-rotating mode becomes co-rotating. In this region \(J_\mathrm {c}<0\). However, \(E_\mathrm {c}\) will change sign at the point where \(\sigma _\mathrm {p}\) (or, equivalently, the frequency \(\omega \)) vanishes. Since the mode was stable in the non-rotating limit this change of sign indicates the onset of instability at a critical rate of rotation. The situation for the fundamental f-mode of a rotating star is illustrated in Fig. 11.

Fig. 11

An illustration of the instabilities affecting the fundamental f-mode of a rotating neutron star. The horizontal axis represents the rotation, expressed in terms of the ratio between the kinetic energy and the gravitational potential energy (\(\beta = T/|W|\)). The angular velocity is not a (particularly) useful parameter as values beyond (something like) \(\beta \approx 0.11\) require some degree of differential rotation. That is, rigidly rotating bodies never reach the dynamically unstable regime (at least not in Newtonian gravity). The vertical axis gives the pattern speed of the mode, with waves that appear to move forwards (according to a distant observer) having positive values, while backwards moving modes lead to negative values. The originally backwards moving f-mode becomes secularly unstable at \(\beta \approx 0.14\), at the point where the mode first appears to move forwards (because of the rotation of the star). The mode becomes dynamically unstable (this is the so-called bar-mode instability) when the two modes merge at \(\beta \approx 0.24\) (adapted from Andersson 2003)

In order to further demonstrate the usefulness of the canonical energy, let us prove the instability of the inertial r-modes (these are oscillation modes that owe their existence to the rotation of the star, and which are predominantly associated with the Coriolis force). For a general inertial mode we have (cf. Lockitch and Friedman 1999 for a discussion of the single fluid problem using notation which closely resembles the one we adopt here)

$$\begin{aligned} v^i \sim \delta v^i \sim \dot{\xi }^i \sim \varOmega \qquad \mathrm {and} \qquad \delta \varPhi \sim \delta n \sim \varOmega ^2. \end{aligned}$$
(7.65)

In particular, modes like the r-modes are dominated by convective currents, so we have \(\delta v_r \sim \varOmega ^2\) and the continuity equation leads to

$$\begin{aligned} \nabla _i \delta v^i \sim \varOmega ^3 \qquad \Longrightarrow \qquad \nabla _i \xi ^i \sim \varOmega ^2. \end{aligned}$$
(7.66)

Under these assumptions we find that \(E_\mathrm {c}\) becomes (to order \(\varOmega ^2\))

$$\begin{aligned} E_\mathrm {c} \approx \frac{1}{2} \int \rho \left[ \left| \partial _t {\xi } \right| ^2 - \left| v^i \nabla _i{\xi } \right| ^2 + \xi ^{i*} \xi ^{j} \nabla _i \nabla _j \left( \varPhi + \tilde{\mu } \right) \right] \mathrm {d} V. \end{aligned}$$
(7.67)

We can rewrite the last term using the equation governing the axisymmetric equilibrium. Keeping only terms of order \(\varOmega ^2\) we have

$$\begin{aligned} \xi ^{i*} \xi ^{j} \nabla _i\nabla _j \left( \varPhi + \tilde{\mu } \right) \approx \frac{1}{2} \varOmega ^2 \xi ^{i*} \xi ^{j} \nabla _i \nabla _j (r^2 \sin ^2 \theta ). \end{aligned}$$
(7.68)

A bit more work then leads to

$$\begin{aligned} \frac{1}{2} \varOmega ^2 \xi ^{i*} \xi ^{j} \nabla _i \nabla _j (r^2 \sin ^2 \theta ) = \varOmega ^2 r^2 \left[ \cos ^2 \theta \left| \xi ^\theta \right| ^2 + \sin ^2\theta \left| \xi ^\varphi \right| ^2 \right] , \end{aligned}$$
(7.69)

and

$$\begin{aligned} \left| v^i \nabla _i \xi _j \right| ^2= & {} \varOmega ^2 \left\{ m^2 \left| \xi \right| ^2 - 2imr^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \right. \nonumber \\&+ \left. r^2 \left[ \cos ^2 \theta \left| \xi ^\theta \right| ^2 + \sin ^2\theta \left| \xi ^\varphi \right| ^2 \right] \right\} , \end{aligned}$$
(7.70)

which means that the canonical energy can be written in the form

$$\begin{aligned}&E_\mathrm {c} \approx - \frac{1}{2} \int \rho \left\{ (m \varOmega - \omega )(m \varOmega + \omega ) |\xi |^2 \right. \nonumber \\&\quad \left. - 2 i m \varOmega ^2 r^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \right\} \mathrm {d} V. \end{aligned}$$
(7.71)

Introducing the axial stream function U we have

$$\begin{aligned} \xi ^\theta= & {} - \frac{iU}{r^2 \sin \theta } \partial _\varphi Y_l^m e^{i \omega t}, \end{aligned}$$
(7.72)
$$\begin{aligned} \xi ^\varphi= & {} \frac{iU}{r^2 \sin \theta } \partial _\theta Y_l^m e^{i\omega t}, \end{aligned}$$
(7.73)

where \(Y_l^m=Y_l^m(\theta ,\varphi )\) are the spherical harmonics. This leads to

$$\begin{aligned} |\xi |^2 = \frac{|U|^2}{r^2} \left[ \frac{1}{\sin ^2 \theta } |\partial _\varphi Y_l^m|^2 + |\partial _\theta Y_l^m|^2 \right] , \end{aligned}$$
(7.74)

and

$$\begin{aligned}&ir^2 \sin \theta \cos \theta \left[ \xi ^\theta \xi ^{\varphi *} - \xi ^\varphi \xi ^{\theta *} \right] \nonumber \\&\quad = \frac{1}{r^2} \frac{ \cos \theta }{\sin \theta } m |U|^2 \left[ Y_l^m \partial _\theta Y_l^{m*} + Y_l^{m *} \partial _\theta Y_l^{m}\right] . \end{aligned}$$
(7.75)

After performing the angular integrals, we find that

$$\begin{aligned} E_\mathrm {c} = - \frac{ l(l+1) }{2} \left\{ (m \varOmega - \omega )(m \varOmega + \omega ) - \frac{2 m^2 \varOmega ^2}{l(l+1)} \right\} \int \rho |U|^2 \, \mathrm {d} r. \end{aligned}$$
(7.76)

Combining this with the r-mode frequency (Lockitch and Friedman 1999)

$$\begin{aligned} \omega = m \varOmega \left[ 1 - \frac{2}{l(l+1)} \right] , \end{aligned}$$
(7.77)

we see that \(E_\mathrm {c} < 0\) for all \(l>1\) r-modes, i.e., they are all unstable. The \(l=m=1\) r-mode is a special case, as it leads to \(E_\mathrm {c}=0\).
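
The sign of \(E_\mathrm {c}\) claimed here is straightforward to verify. A short symbolic sketch (assuming sympy) substitutes the frequency (7.77) into the curly bracket of Eq. (7.76); the bracket turns out to be proportional to \((l-1)(l+2) \ge 0\), so the overall factor \(-l(l+1)/2\) makes \(E_\mathrm {c}<0\) for \(l>1\) and \(E_\mathrm {c}=0\) for \(l=1\).

```python
import sympy as sp

l, m, Omega = sp.symbols('l m Omega', positive=True)
L = l*(l + 1)

omega = m*Omega*(1 - 2/L)                 # r-mode frequency, Eq. (7.77)

# Curly bracket in Eq. (7.76); E_c = -(l(l+1)/2) * bracket * integral of rho |U|^2 dr
bracket = (m*Omega - omega)*(m*Omega + omega) - 2*m**2*Omega**2/L

print(sp.factor(sp.simplify(bracket)))    # proportional to (l - 1)*(l + 2) >= 0
print(sp.simplify(bracket.subs(l, 1)))    # -> 0 for l = 1 (the l = m = 1 case)
```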

7.5 The relativistic problem

The theoretical framework for studying stellar stability in General Relativity was mainly developed during the 1970s, with key contributions from Chandrasekhar and Friedman (1972a, 1972b) and Schutz (1972a, 1972b). Their work extends the Newtonian analysis discussed above. There are basically two reasons why a relativistic analysis is more complicated than the Newtonian one. First of all, the problem is algebraically more complex because one must solve the Einstein field equations in addition to the fluid equations of motion. This is apparent from the perturbation relations we have written down already. For any given equation of state—represented by \(\varLambda (n)\)—we can express the perturbed equations of motion in terms of the displacement vector \(\xi ^a\) and the Eulerian variation of the metric, \(\delta g_{ab}\). In doing this it is worth noting that the usual approach to relativistic stellar perturbations is to work with this combination of variables (see, e.g., Kojima 1992). Essentially, we need the Eulerian perturbation of the Einstein field equations and the Lagrangian variation of the momentum equation (6.28). The description of the perturbed Einstein equations is standard (see, e.g., Andersson 2019), so we focus on the fluid aspects here.

The perturbations of (5.25) are easy to work out once we note that the Lagrangian variation commutes with the exterior derivative. We immediately get

$$\begin{aligned} (\varDelta n^a) \nabla _{[a}\mu _{b]} + n^a \nabla _{[a}\varDelta \mu _{b]} = 0 . \end{aligned}$$
(7.78)

This simplifies further if we use (6.22) and assume that the background is such that (5.25) is satisfied. The first term then vanishes, and we are left with

$$\begin{aligned} n^a \nabla _{[a}\varDelta \mu _{b]} = 0 . \end{aligned}$$
(7.79)

To complete this expression, we need to work out \(\varDelta \mu _a\). This is a straightforward task given the above results, and we find

$$\begin{aligned} \varDelta \mu _a = \left( \mathcal {B} + n {d \mathcal {B} \over dn} \right) g_{ab} \varDelta n^b + \left( \mu ^b \delta _a^d - {d \mathcal {B} \over d n^2} n_a n^b n^d \right) \varDelta g_{b d} . \end{aligned}$$
(7.80)

An additional complication is associated with the fact that one must account for gravitational waves, leading to the system being dissipative. The work culminated in a series of papers (Friedman and Schutz 1975, 1978a, b; Friedman 1978) in which the role that gravitational radiation plays in these problems was explained, and a foundation for subsequent research in this area was established. The main result was that gravitational radiation acts in the same way in the full theory as in a post-Newtonian analysis of the problem. If we consider a sequence of equilibrium models, a mode becomes secularly unstable at the point where its frequency vanishes (in the inertial frame). Most importantly, the proof does not require the completeness of the modes of the system.

8 A step towards multi-fluids

Returning to the relativistic setting, let us consider what happens if one tries to extend the off-the-shelf analysis from Sect. 5.2 to the case of two components. Take, for example, the case of a single particle species at finite temperature, a case where we have to account for the presence of entropy. In general, one would have to allow for the heat (i.e. entropy) to flow relative to the matter (see Sect. 15), but we will assume that this is not the case here. If the entropy is carried along with the matter flow, we are dealing with a single-fluid problem and we should be able to make progress with the tools we have at hand. The equation of state is, however, no longer barotropic since we have \(\varepsilon =\varepsilon (n,s)\), with n the matter number density and s the entropy density (as before). Nevertheless, the stress-energy tensor can still be expressed in terms of the pressure p and the energy density \(\varepsilon \), as in Sect. 5.2. The fluid equations obtained from its divergence will take the same form as in the barotropic case. The difference becomes apparent only when we try to close the system of equations. Now the energy variation takes the form

$$\begin{aligned} d\varepsilon = \mu dn + T ds , \end{aligned}$$
(8.1)

where the temperature is identified as the chemical potential of the entropy:

$$\begin{aligned} T = \left( {\partial \varepsilon \over \partial s} \right) _n . \end{aligned}$$
(8.2)

This means that we have

$$\begin{aligned} T^{ab} = (n\mu + sT) u^a u^b + p g^{ab} \end{aligned}$$
(8.3)

and, if we note that

$$\begin{aligned} dp = n d\mu + s dT \quad \Longrightarrow \quad \nabla _a p = n \nabla _a \mu + s \nabla _a T , \end{aligned}$$
(8.4)

it follows that energy conservation leads to

$$\begin{aligned} \mu \nabla _a n^a + T \nabla _a s^a = 0 , \end{aligned}$$
(8.5)

or

$$\begin{aligned}&\mu \left( \dot{n} + n \nabla _a u^a \right) + T \left( \dot{s} + s \nabla _a u^a \right) = 0 , \end{aligned}$$
(8.6)
$$\begin{aligned}&\dot{n} = {dn \over d\tau } = u^a \nabla _a n, \end{aligned}$$
(8.7)

and similarly for \( \dot{s}\). At this point we need to make additional assumptions. If, for example, the motion is adiabatic then the entropy is conserved and the second term on the left-hand side of (8.6) vanishes. It then follows that the first bracket must vanish as well, so the matter flux is also conserved. If the flow is not adiabatic, the situation is different. Suppose there are no sources or sinks for the matter. Then the matter flux should still be conserved, but now the entropy is not. So the first term in (8.6)–(8.7) still vanishes, but the second cannot. We obviously have a problem, unless we relax the assumption that the entropy flows with the matter. Introducing a heat flux relative to the matter, we avoid the issue. However, by doing so, we introduce extra degrees of freedom that need to be accounted for and understood. We will consider this problem in detail once we have extended the variational formalism to deal with additional flows. We could also consider the implication the other way: in order for a single particle flow to be adiabatic, the entropy must be carried along with the matter.

Moving on to the momentum equations arising from \(\nabla _a T^{a b}=0\), we replicate the analysis from Sect. 5.2. Recalling the definition \(\mu _a = \mu u_a\) and introducing the analogous quantity \(\theta _a = T u_a\), we can write (5.25) as

$$\begin{aligned} 2 n^a \nabla _{[a}\mu _{b]}+ 2 s^a \nabla _{[a}\theta _{b]}=0 \end{aligned}$$
(8.8)

That is, we arrive at a “force balance” equation with two vorticity terms instead of the single one we had before. The implication is that, even in the absence of external agents, we have to consider possible interactions between the two components. By extending the variational approach we gain insight that helps address this issue (also in more complicated situations).

It is also worth noting that, by using notation that singles out the entropy component, we have made the problem look less “symmetric” than it really is. In many situations it is practical to introduce constituent indices (labels telling us which component the quantity belongs to), e.g., use \(n_\mathrm {n}^a\) and \(n_\mathrm {s}^a\) instead of \(n^a\) and \(s^a\). Noting also that the temperature is the chemical potential associated with the entropy, i.e. \(\theta _a = \mu ^\mathrm {s}_a\), we can write the above result as

$$\begin{aligned} \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} f_a^{\mathrm {x}}= \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} 2 n_{\mathrm {x}}^b \nabla _{[b}\mu ^{\mathrm {x}}_{a]} = \sum _{{\mathrm {x}}=\mathrm {n},\mathrm {s}} 2 n_{\mathrm {x}}^b \omega ^{\mathrm {x}}_{b a} = 0 . \end{aligned}$$
(8.9)

The generalisation of this result to situations where additional components are carried along by the same four-velocity is now obvious. The problem with distinct four-velocities, which we turn to in Sect. 9, requires additional thinking.

8.1 The two-constituent, single fluid

Before we move on to the general problem, let us consider how the problem discussed in the previous Sect. 7.2 would be described in the variational approach. Generally speaking, the total energy density \(\varepsilon \) can be a function of independent parameters other than the particle number density \(n_\mathrm {n}\), like the entropy density \(s=n_\mathrm {s}\) in the case we just considered, assuming that the system scales in the manner discussed in Sect. 2 so that only densities need enter the equation of state.


As we have already suggested, if there is no heat flow (say) then this is a single fluid problem, meaning that there is still just one flow velocity \(u^a\). This is what we mean by a two-constituent, single fluid. We assume that the particle number and entropy are both conserved along the flow. Associated with each parameter there is then a conserved current flux, i.e. \(n^a_\mathrm {n}= n_\mathrm {n}u^a\) for the particles and \(n_\mathrm {s}^a = n_\mathrm {s}u^a\) for the entropy. Note that the ratio \(x_\mathrm {s}= n_\mathrm {s}/n_\mathrm {n}\) (the specific entropy) is co-moving in the sense that

$$\begin{aligned} u^a \nabla _a x_\mathrm {s}= \dot{x}_\mathrm {s}= 0 . \end{aligned}$$
(8.10)

This is, of course, the relation (8.6)–(8.7) from before.

Making use of the constituent indices, the associated first law can be written in the form

$$\begin{aligned} { d} \varepsilon = \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}{d} n_{\mathrm {x}}= - \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}_a {d} n^a_{\mathrm {x}}, \end{aligned}$$
(8.11)

since \(\varepsilon = \varepsilon (n_{\mathrm {n}},n_{\mathrm {s}})\), where

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a , \quad n^2_{{\mathrm {x}}} = - g_{a b} n^a_{\mathrm {x}}n^b_{\mathrm {x}}, \end{aligned}$$
(8.12)

and

$$\begin{aligned} \mu ^{\mathrm {x}}_a = g_{a b} \mathcal {B}^{{\mathrm {x}}} n^b_{\mathrm {x}}, \quad \mathcal {B}^{{\mathrm {x}}} \equiv 2 \frac{\partial \varepsilon }{\partial n^2_{{\mathrm {x}}}} . \end{aligned}$$
(8.13)

Given that we only have one four-velocity, the system will still just have one fluid element per spacetime point. But unlike before, there is an additional conserved number, \(N_\mathrm {s}\), that can be attached to each worldline, like the particle number \(N_\mathrm {n}\) of Fig. 10. In order to describe the worldlines we can use the same three scalars \(X^A(x^a)\) as before. But how do we get a construction that allows for the additional conserved number? Recall that the intersections of the worldlines with some hypersurface, say \(t = 0\), are uniquely specified by the three \(X^A(0,x^i)\) scalars. Each worldline will also have the conserved numbers \(N_\mathrm {n}\) and \(N_\mathrm {s}\) assigned to it. Thus, the values of these numbers can be expressed as functions of the \(X^A(0,x^i)\). But most importantly, the fact that each \(N_{\mathrm {x}}\) is conserved means that this specification must hold for all of spacetime, so that the ratio \(x_\mathrm {s}\) is of the form \(x_\mathrm {s}(x^a) = x_\mathrm {s}(X^A(x^a))\). Consequently, we now have a construction where this ratio identically satisfies Eq. (8.10), and the action principle remains a variational problem in terms of the three \(X^A\) scalars.

The variation of the action follows just like before, except now a constituent index \({\mathrm {x}}\) must be attached to the particle number density current and three-form:

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \epsilon _{dabc} n^d_{\mathrm {x}}. \end{aligned}$$
(8.14)

Once again it is convenient to introduce the momentum form, now defined as

$$\begin{aligned} \mu ^{a b c}_{\mathrm {x}}= \epsilon ^{ d a b c } \mu ^{\mathrm {x}}_d . \end{aligned}$$
(8.15)

Since the \(X^A\) are the same for each \(n^{\mathrm {x}}_{a b c}\), the above discussion indicates that the pull-back construction is now to be based on

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \psi ^A_a \psi ^B_b \psi ^C_c N^{\mathrm {x}}_{A B C} , \end{aligned}$$
(8.16)

where \(N^{\mathrm {x}}_{A B C}\) is completely antisymmetric and a function only of the \(X^A\). After a little thought, it should be obvious that the only thing required here (in addition to the single-component arguments) is to attach an \({\mathrm {x}}\) index to \(n^a\) and n in Eqs. (6.21) and (6.38), respectively.

If we now define the Lagrangian to be

$$\begin{aligned} \varLambda = - \varepsilon \end{aligned}$$
(8.17)

and the generalized pressure \(\varPsi \) as

$$\begin{aligned} \varPsi = \varLambda - \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}_a n^a_{\mathrm {x}}= \varLambda + \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} \mu ^{\mathrm {x}}n_{\mathrm {x}}, \end{aligned}$$
(8.18)

then the first-order variation of \(\varLambda \) is (ignoring a surface term, as usual)

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} \frac{1}{2} \sqrt{- g} \left[ \varPsi g^{a b} + \left( \varPsi - \varLambda \right) u^a u^b \right] \delta g_{a b} \nonumber \\&- \sqrt{- g} \left( \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} f^\mathrm {x}_a\right) \xi ^a + \nabla _a \left( \frac{1}{2} \sqrt{-g} \sum _{\mathrm {x}= \mathrm {n},\mathrm {s}} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} \xi ^d\right) ,\qquad \qquad \end{aligned}$$
(8.19)

where

$$\begin{aligned} f^{\mathrm {x}}_a = 2 n^b_{\mathrm {x}}\omega ^{\mathrm {x}}_{b a} , \end{aligned}$$
(8.20)

and

$$\begin{aligned} \omega ^{\mathrm {x}}_{a b} = \nabla _{[a} \mu ^{\mathrm {x}}_{b]} . \end{aligned}$$
(8.21)

At the end of the day, the equations of motion are

$$\begin{aligned} \sum _{{\mathrm {x}}= \mathrm {n},\mathrm {s}} f^\mathrm {x}_a = 0 , \end{aligned}$$
(8.22)

and

$$\begin{aligned} \nabla _a n^a_{\mathrm {x}}= 0 , \end{aligned}$$
(8.23)

while the stress-energy tensor takes the form

$$\begin{aligned} T^{a b} = \varPsi g^{a b} + (\varPsi - \varLambda ) u^a u^b . \end{aligned}$$
(8.24)

Not surprisingly, these results accord with the expectations from the previous analysis.

8.2 Speed of sound (again)

We have already considered the problem of wave propagation in the case of a single component (barotropic) fluid, see Sect. 7.2. Now we are equipped to revisit this problem in the more complex case of a two-constituent single-fluid—a fluid that is “stratified” by either thermal or composition gradients. As before, the analysis is local—assuming that the speed of sound is a locally defined quantity—and performed using local inertial frame (Minkowski) coordinates \(x^a = (t,x^i)\). The purpose of the analysis is twofold: the main aim is to illuminate how the presence of various constituents impacts on the local dynamics, but we also want to illustrate how the problem works out if we take the variational equations of motion as our starting point. An additional motivation is to develop notation that is flexible enough that we can deal with problems of increasing complexity, ideally without losing sight of the underlying physics.

Focussing on a small spacetime region, we can make the same argument as in Sect. 7.2 that the configuration of the matter with no waves present is locally isotropic, homogeneous, and static. Thus, the background is such that \(n^a_{\mathrm {x}}= [n_{\mathrm {x}},0,0,0]\) and the vorticity \(\omega ^{\mathrm {x}}_{a b}\) vanishes. The general form of the (Eulerian) variation of the force density \(f^{\mathrm {x}}_a\) for each constituent is then

$$\begin{aligned} \delta f_a^{\mathrm {x}}= 2 n^b_\mathrm {x}\partial _{[b} \delta \mu ^{\mathrm {x}}_{a]} . \end{aligned}$$
(8.25)

Similarly, the conservation of the flux \(n^a_{\mathrm {x}}\) gives

$$\begin{aligned} \partial _a \delta n^a_{\mathrm {x}}= 0 . \end{aligned}$$
(8.26)

We are now taking the view that the \(n^a_{\mathrm {x}}\) are the fundamental fluid fields and thus plane-wave propagation means that we have (the covariant analogue of (7.28))

$$\begin{aligned} \delta n^a_\mathrm {x}= A^a_{\mathrm {x}}e^{i k_b x^b} , \end{aligned}$$
(8.27)

where the amplitudes \(A^a_{\mathrm {x}}\) and the wave vector \(k_a\) are constant. Combining Eqs. (8.26) and (8.27) we see that

$$\begin{aligned} k_a \delta n^a_{\mathrm {x}}= 0 , \end{aligned}$$
(8.28)

i.e. the waves are “transverse” in the spacetime sense. It is worth pointing out that this requirement is not in contradiction with the fact that sound waves are longitudinal (in the spatial sense), as established in Sect. 7.2. It is easy to see that (8.28) is exactly what we should expect, if we note that \(\delta n_{\mathrm {x}}^a = \delta n_\mathrm {x}u^a + n_{\mathrm {x}}\delta v^a\) and identify \(k_0 = - k \sigma \) where, recall, \(\sigma \) is the mode speed and k is the magnitude of the spatial part, obtained from \(k^2 = k_j k^j\) (\(k^i = g^{i j} k_j\)).

Moving on to the equations of motion, as given by (8.25), we need the perturbed momentum \(\delta \mu ^{\mathrm {x}}_a\). For future reference, we will work out its general form, and only afterwards assume a static, homogeneous, and isotropic background. However, in order to establish the strategy, it is useful to start by revisiting the barotropic case. Suppose there is only one constituent, with index \({\mathrm {x}}= \mathrm {n}\). The Lagrangian \(\varLambda \) then depends only on \(n^2_{\mathrm {n}}\), and the variation in the chemical potential due to a small disturbance \(\delta n^a_\mathrm {n}\) is

$$\begin{aligned} \delta \mu ^\mathrm {n}_a = \mathcal {B}^{\mathrm {n}}_{a b} \delta n^b_\mathrm {n}, \end{aligned}$$
(8.29)

where

$$\begin{aligned} \mathcal {B}^{\mathrm {n}}_{a b} = \mathcal {B}^{\mathrm {n}} g_{a b} - 2 \frac{\partial \mathcal {B}^{\mathrm {n}}}{\partial n^2_{\mathrm {n}}} n^\mathrm {n}_a n^\mathrm {n}_b . \end{aligned}$$
(8.30)

There are two terms, simply because we need to perturb both \(\mathcal {B}^\mathrm {n}\) and \(n_\mathrm {n}^a\) in (8.13).

The single-component equation of motion is \(\delta f^\mathrm {n}_a = 0\). It is not difficult to show, by using the condition of transverse wave propagation, Eq. (8.28), and contracting with the spatial part of the wave vector \(k^i\) (the time part is trivial because (8.25) is orthogonal to \(n_\mathrm {n}^a\) which in turn is aligned with \(u^a\)), that the equation of motion reduces to

$$\begin{aligned} \left( \mathcal {B}^{\mathrm {n}} + \mathcal {B}^{\mathrm {n}}_{0 0} \frac{k_j k^j}{k^2_0}\right) k_i \delta n^i_\mathrm {n}= 0 . \end{aligned}$$
(8.31)

From this we see that the dispersion relation takes the form

$$\begin{aligned} \sigma ^2 = {k_0^2 \over k_jk^j} = - {\mathcal {B}^{\mathrm {n}}_{0 0}\over \mathcal {B}^\mathrm {n}} = 1 + 2 { n_\mathrm {n}^2 \over \mathcal {B}^\mathrm {n}} \frac{d \mathcal {B}^{\mathrm {n}}}{d n^2_{\mathrm {n}}} = 1 + \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} . \end{aligned}$$
(8.32)

We have used the fact that we are working in a locally flat spacetime, so that \(g_{a b} = \eta _{a b}\). If we have done this right, then we should recover the expression for the speed of sound \(C_s^2\) from before, cf. Eq. (7.34). To see that this is the case, recall that \(\mu _\mathrm {n}= n_\mathrm {n}\mathcal {B}^\mathrm {n}\) and work out the required derivative. That is

$$\begin{aligned} C_s^2 = \sigma ^2 = {n\over \mu } {d \mu \over dn} = {dp \over d\varepsilon }. \end{aligned}$$
(8.33)
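
The derivative manipulation suggested here takes one line to check. For an arbitrary \(\mathcal {B}^\mathrm {n}(n)\) with \(\mu = n \mathcal {B}^\mathrm {n}\), the sketch below (assuming sympy) confirms that \((n/\mu )\, d\mu /dn\) reproduces the combination \(1 + d\ln \mathcal {B}^{\mathrm {n}}/d\ln n_\mathrm {n}\) appearing in Eq. (8.32).

```python
import sympy as sp

n = sp.symbols('n', positive=True)
B = sp.Function('B')(n)                  # the coefficient B^n(n)

mu = n*B                                 # mu_n = n_n B^n
lhs = (n/mu)*sp.diff(mu, n)              # (n/mu) dmu/dn, cf. Eq. (7.34)
rhs = 1 + n*sp.diff(B, n)/B              # 1 + d ln B / d ln n, Eq. (8.32)

print(sp.simplify(lhs - rhs))            # -> 0
```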

In order to ensure that the behaviour of the system is “physical”, we need to consider two conditions:

  1. absolute stability, \(\sigma ^2 \ge 0\), and

  2. causality, \(C^2_s \le 1\).

These conditions provide constraints which can be imposed on, say, parameters in equation of state models, the net effect being absolute limits on the possible forms for the master function \(\varLambda \). As an example, take the result from Eq. (8.32) and impose the two constraints to find that

$$\begin{aligned} 0 \le 1 + \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} \le 1 \quad \implies \quad - 1 \le \frac{d \ln \mathcal {B}^{\mathrm {n}}}{d \ln n_\mathrm {n}} \le 0 . \end{aligned}$$
(8.34)

From the definition of \(\mathcal {B}^\mathrm {n}\), cf. Eq. (8.13), we have two bounds on \(\varLambda \).

Even with the aid of the constraint from Eq. (8.34), the mode frequency solution in Eq. (8.32) is obviously less transparent than the simple statement of the speed of sound as the variation of the pressure with changing density. However, as we will establish, the formalism we are developing readily deals with much more complex situations (such as multiple sound speeds and so-called “two-stream” instabilities). The main reason is that the fluxes enter the formalism on equal footing as four-vectors, whereas starting with energy density typically requires the introduction of an ad-hoc reference frame (e.g., the \(U^a\) from Sect. 5), in order to define what the energy density is, and any independent fluid motion (like heat flow) is then defined as a three-velocity with respect to this frame.

As a further example, let us consider the case where there are two constituents with densities \(n_\mathrm {n}\) and \(n_\mathrm {s}\), two conserved density currents \(n^a_\mathrm {n}\) and \(n^a_\mathrm {s}\), two chemical potential covectors \(\mu ^\mathrm {n}_a\) and \(\mu ^\mathrm {s}_a\), but still only one four-velocity \(u^a\). (We are primarily thinking about matter and entropy, as before, but it could be any two individually conserved components which move together.) The matter Lagrangian \(\varLambda \) may now depend on both \(n^2_{\mathrm {n}}\) and \(n^2_{\mathrm {s}}\) meaning that

$$\begin{aligned} \delta \mu ^{\mathrm {x}}_a = \mathcal {B}^{{\mathrm {x}}}_{a b} \delta n^b_{\mathrm {x}}+ \mathcal{X}^{{\mathrm {x}}{\mathrm {y}}}_{a b} \delta n^b_{\mathrm {y}}, \quad {\mathrm {y}}\ne {\mathrm {x}}, \end{aligned}$$
(8.35)

where we recall that summation is not implied for repeated constituent indices, and we have defined

$$\begin{aligned} \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= - \mathcal {C}_{cc}\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}} u^{\mathrm {x}}_a u^{\mathrm {x}}_b , \end{aligned}$$
(8.36)

(with \(u^{\mathrm {x}}_a = u^{\mathrm {y}}_a = u_a\) in this specific example) where

$$\begin{aligned} \mathcal {C}_{cc}^2 \equiv \frac{1}{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}} \left( 2 n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {B}^{\mathrm {x}}}{\partial n_{\mathrm {y}}^2}\right) ^2 . \end{aligned}$$
(8.37)

The \(\mathcal {B}^{\mathrm {n}}_{a b}\) coefficient is defined as before and \(\mathcal {B}^{\mathrm {s}}_{a b}\) is given by the same expression (Eq. (8.30)) with each \(\mathrm {n}\) replaced by \(\mathrm {s}\). The \(\mathcal {C}_{cc}\) coefficient represents a true multi-constituent effect, which depends on the composition (e.g., the entropy per baryon \(x_\mathrm {s}= n_\mathrm {s}/n_\mathrm {n}\) used in the discussion surrounding Eq. (8.10)).

The fact that \(n^a_\mathrm {s}\) is parallel to \(n^a_\mathrm {n}\) implies that it is only the magnitude of the entropy density current that is independent. One can show that the condition of transverse propagation, as applied to both currents, implies

$$\begin{aligned} \delta n^a_\mathrm {s}= x_\mathrm {s}\delta n^a_\mathrm {n}. \end{aligned}$$
(8.38)

It is worth taking a closer look at this condition. First of all, the time component leads to

$$\begin{aligned} \delta n_\mathrm {s}= x_\mathrm {s}\delta n_\mathrm {n}= \frac{n_\mathrm {s}}{n_\mathrm {n}} \delta n_\mathrm {n}\qquad \Longrightarrow \qquad \delta x_\mathrm {s}= 0 . \end{aligned}$$
(8.39)

That is, the entropy per particle is constant—the perturbations are adiabatic. Meanwhile, it is easy to show that the spatial part of (8.38) is trivial, since the two components move together.

Now, we proceed as in the previous example. Noting that the equation of motion is

$$\begin{aligned} \delta f^\mathrm {n}_a + \delta f^\mathrm {s}_a = 0, \end{aligned}$$
(8.40)

we find

$$\begin{aligned} \left[ \left( \mathcal {B}^{\mathrm {n}} + x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}}\right) \sigma ^2 - \left( \mathcal {B}^{\mathrm {n}} c^2_\mathrm {n}+ x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}} c^2_\mathrm {s}- 2 x_{\mathrm {s}} \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} \right) \right] k_i \delta n^i_\mathrm {n}= 0 , \end{aligned}$$
(8.41)

where, inspired by the result for the speed of sound in the single component case [cf. Eq. (8.32)], we have defined

$$\begin{aligned} c^2_{\mathrm {x}}\equiv 1 + \frac{\partial \ln \mathcal {B}^{{\mathrm {x}}}}{\partial \ln n_{\mathrm {x}}} . \end{aligned}$$
(8.42)

We find that the speed of sound is given by

$$\begin{aligned} C_s^2 = \sigma ^2 = \frac{\mathcal {B}^{\mathrm {n}} c^2_\mathrm {n}+ x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}} c^2_\mathrm {s}- 2 x_\mathrm {s}\mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0}}{\mathcal {B}^{\mathrm {n}} + x^2_\mathrm {s}\mathcal {B}^{\mathrm {s}}} . \end{aligned}$$
(8.43)

As this result looks quite complicated, let us see if we can manipulate it to make it more intuitive. The obvious starting point is to replace the abstract coefficients we have introduced with the underlying thermodynamical quantities, i.e., use \(\mu _\mathrm {n}= n_\mathrm {n}\mathcal {B}^\mathrm {n}= \mu \) and \(\mu _\mathrm {s}= n_\mathrm {s}\mathcal {B}^\mathrm {s}= T\), leading to

$$\begin{aligned} c_\mathrm {n}^2 = {n\over \mu } \left( {\partial \mu \over \partial n} \right) _s \qquad \text{ and } \qquad c_\mathrm {s}^2 = {s\over T} \left( {\partial T \over \partial s} \right) _n . \end{aligned}$$
(8.44)

We also see that

$$\begin{aligned} \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} = - \left( {\partial \mu \over \partial s} \right) _n = - \left( {\partial T \over \partial n} \right) _s , \end{aligned}$$
(8.45)

where the identity follows since we have mixed partial derivatives (both \(\mu \) and T arise as derivatives of \(\varepsilon \)). Given these results, we find that

$$\begin{aligned} C_s^2 = {1 \over p+\varepsilon } \left[ n^2 \left( {\partial \mu \over \partial n} \right) _s + 2sn \left( {\partial T \over \partial n} \right) _s + s^2 \left( {\partial T \over \partial s} \right) _n \right] , \end{aligned}$$
(8.46)

which already looks a little bit more transparent. However, we can also use the fact that \(dp = nd\mu +sdT\) to rewrite this as

$$\begin{aligned} C_s^2 = {1 \over p+\varepsilon } \left[ n \left( {\partial p \over \partial n} \right) _s + s \left( {\partial p \over \partial s} \right) _n \right] . \end{aligned}$$
(8.47)

Finally, let us ask what happens if we work with \(x_\mathrm {s}\) instead of s.

To do this, we need

$$\begin{aligned} dp= & {} \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} dn + \left( {\partial p \over \partial x_\mathrm {s}} \right) _n d x_\mathrm {s}\nonumber \\= & {} \left[ \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} - {s \over n^2} \left( {\partial p \over \partial x_\mathrm {s}} \right) _{n} \right] dn + {1 \over n} \left( {\partial p \over \partial s} \right) _n ds . \end{aligned}$$
(8.48)

From this we see that

$$\begin{aligned} \left( {\partial p \over \partial n} \right) _{x_\mathrm {s}} = \left( {\partial p \over \partial n} \right) _s + {s \over n} \left( {\partial p \over \partial s} \right) _n \end{aligned}$$
(8.49)

and once we combine this with the fact that, when \(x_\mathrm {s}\) is kept constant, we have

$$\begin{aligned} d\varepsilon = {p+\varepsilon \over n} dn , \end{aligned}$$
(8.50)

we get the expected result for the adiabatic sound speed:

$$\begin{aligned} C_s^2 = \left( {\partial p \over \partial \varepsilon } \right) _{x_\mathrm {s}} . \end{aligned}$$
(8.51)
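Since it is easy to lose track of the manipulations leading from Eq. (8.43) to the adiabatic result (8.51), a quick numerical cross-check may be useful. The following Python sketch assumes a toy two-parameter energy density \(\varepsilon (n,s)\) (an arbitrary functional form chosen purely for illustration, not a realistic equation of state) and compares the combination in Eq. (8.47) with a direct finite-difference estimate of \((\partial p/\partial \varepsilon )_{x_\mathrm {s}}\).

```python
import numpy as np

# Toy equation of state eps(n, s); the functional form and constants are
# placeholders chosen only to test the thermodynamic identities.
k1, k2, g1, g2, g3 = 1.0, 0.5, 2.0, 1.5, 1.8

def eps(n, s):
    return k1 * n**g1 + k2 * n**g2 * s**g3

def mu(n, s):    # chemical potential, mu = (d eps/d n)_s
    return k1 * g1 * n**(g1 - 1) + k2 * g2 * n**(g2 - 1) * s**g3

def T(n, s):     # temperature, T = (d eps/d s)_n
    return k2 * g3 * n**g2 * s**(g3 - 1)

def p(n, s):     # pressure from the Euler relation, p = -eps + n mu + s T
    return -eps(n, s) + n * mu(n, s) + s * T(n, s)

n0, s0, h = 1.0, 0.3, 1e-6

# Eq. (8.47): Cs^2 = [ n (dp/dn)_s + s (dp/ds)_n ] / (p + eps)
dp_dn = (p(n0 + h, s0) - p(n0 - h, s0)) / (2 * h)
dp_ds = (p(n0, s0 + h) - p(n0, s0 - h)) / (2 * h)
Cs2_847 = (n0 * dp_dn + s0 * dp_ds) / (p(n0, s0) + eps(n0, s0))

# Eq. (8.51): Cs^2 = (dp/d eps) at fixed x_s = s/n
xs = s0 / n0
dp = p(n0 + h, xs * (n0 + h)) - p(n0 - h, xs * (n0 - h))
de = eps(n0 + h, xs * (n0 + h)) - eps(n0 - h, xs * (n0 - h))
Cs2_851 = dp / de

print(Cs2_847, Cs2_851)   # the two estimates agree to finite-difference accuracy
```

Such a check only probes a particular equation of state, of course, but it is a useful way to catch sign and factor errors in the thermodynamic book-keeping.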

8.3 Multi-component cosmology

The modern description of cosmology draws on ideas from fluid dynamics. In the simplest picture—after averaging up to a suitably large scale—planets, stars and galaxies are treated as collisionless “dust”, represented by the simple stress-energy tensor

$$\begin{aligned} T^{ab} = \varepsilon u^a u^b . \end{aligned}$$
(8.52)

This introduces a natural flow of cosmological time—associated with the proper time linked to \(u^a\)—and the associated fibration of spacetime (Barrow et al. 2007). The focus on the “fluid observer” worldlines means that the model is closely related to our description of fluid dynamics, and it is fairly straightforward to build more complex (read: realistic) models by, for example, adding the cosmological constant to the Einstein equations (or viewing it as a “dark energy” contribution with negative pressure, \(p=-\varepsilon \)) or accounting for a more complicated description of the matter content in the Universe. The matter description relies on ideas we have already introduced. In particular, the cosmological principle states that the Universe is homogeneous and isotropic, suggesting that the relevant matter Lagrangian should be built from scalars. Given the increased quality of cosmological observations, this fundamental principle is now becoming testable, and (perhaps) questionable.

The most pressing issues that arise in cosmology relate to the simple fact that we do not have a good handle on the nature of dark components that appear to dominate the “standard model” (Peter and Uzan 2009). A number of alternative models—including alternatives to Einstein’s relativistic gravity—have been suggested, but few of these are compelling. The treatment of the different matter components, in particular, tends to remain based on the notion of coupled perfect fluids or scalar fields. If we are to understand the bigger picture, we may need to review this aspect, especially if we want to be able to consider issues like heat flow (Modak 1984; Triginer and Pavón 1995; Andersson and Lopez-Monsalvo 2011), dissipative mechanisms (Weinberg 1971; Patel and Koppar 1991; Velten and Schwarz 2011), Bose–Einstein condensation of dark matter (Sikivie and Yang 2009; Harko 2011) and possibly many others. Many issues are similar to ones that arise in more realistic models of neutron star astrophysics.

A particularly interesting aspect, given the focus of this review, may be the suggestion that there could have been phases during which the Universe would have effectively been anisotropic (see Tsagas et al. 2008 for a useful review), with different components evolving “independently” (Comer et al. 2012a, b). For the most part, models considered in the current literature, including initially anisotropic geometries, describe the matter content in terms of either effectively many-component single-fluid models (Gromov et al. 2004), or a single component (Gümrükçüoglu et al. 2007; Pitrou et al. 2008; Kim and Minamitsuji 2010); although an evolution towards isotropy is expected in such settings, as required to end up with a realistic (read: in agreement with observational data) model (Dechant et al. 2009). Having said that, interesting new consequences may be inferred by enhancing an initially vanishingly small non-Gaussian signal (Dey and Paban 2012).

Within this context, it is relevant to ask how distinct fluid flows may lead to anisotropy, with the spacetime metric taking the form of a Bianchi I solution of the Einstein equations. In this case there is a spacelike privileged vector, associated with the relative flow between two matter components. As we will soon establish, such a feature is natural in the multi-fluid context, but it can never arise in the usual multi-constituent single fluid. This point has been considered in some detail in Comer et al. (2012a, 2012b). It has been suggested (Barrow and Tsagas 2007; Adhav et al. 2011; Cataldo et al. 2011) that, since Bianchi universes—seen as averaged inhomogeneous and anisotropic spacetimes—can have effective strong energy condition violating stress-energy tensors, they could be part of a backreaction driven acceleration model.

Yet another reason for studying such cosmological models stems, perhaps surprisingly, from the observations: Large angle anomalies in the Cosmic Microwave Background (CMB) have been observed and discussed for quite some time (Schwarz et al. 2004; Copi et al. 2010; Perivolaropoulos 2011; Ma et al. 2011) and may be related to underlying Bianchi models (Pontzen and Challinor 2007; Pontzen 2009).

9 The “pull-back” formalism for two fluids

Having discussed the single fluid model, and how one accounts for stratification (either thermal or composition gradients), it is time to move on to the problem of modelling multi-fluid systems. We will experience for the first time novel effects due to a relative flow between two interpenetrating fluids, and the fact that there is no longer a single, preferred rest-frame. This kind of formalism is necessary, for example, for the simplest model of a neutron star, since it is generally accepted that the inner crust is permeated by an independent neutron superfluid, and the outer core is thought to contain superfluid neutrons, superconducting protons, and a highly degenerate gas of electrons. Still unknown is the number of independent fluids required for neutron stars that have deconfined quark matter in the deep core (Alford et al. 2000). The model can also be used to describe superfluid Helium and heat-conducting fluids, problems which relate to the incorporation of dissipation (see Sect. 16). We will focus on this example here, as a natural extension of the case considered in the previous section. It should be noted that, even though the particular system we concentrate on consists of only two fluids, it illustrates all new features of a general multi-fluid system. Conceptually, the greatest step is to go from one to two fluids. A generalization to a system with further degrees of freedom is straightforward.

In keeping with the previous section, we will rely on use of constituent indices, which throughout this section will range over \({\mathrm {x}},{\mathrm {y}}= \mathrm {n},\mathrm {s}\). In the example we consider the two fluids represent the particles (\(\mathrm {n}\)) and the entropy (\(\mathrm {s}\)). Once again, the number density four-currents, to be denoted \(n^a_{\mathrm {x}}\), are taken to be separately conserved, meaning that

$$\begin{aligned} \nabla _a n^a_{\mathrm {x}}= 0 . \end{aligned}$$
(9.1)

As before, we use the dual formulation, i.e., introduce the three-forms

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \epsilon _{d a b c } n^d_{\mathrm {x}}, \qquad n^a_{\mathrm {x}}= \frac{1}{3!} \epsilon ^{b c d a} n^{\mathrm {x}}_{b c d}. \end{aligned}$$
(9.2)

Also like before, the conservation rules are equivalent to the individual three-forms being closed (the argument proceeds in exactly the same way); i.e.

$$\begin{aligned} \nabla _{[a} n^{\mathrm {x}}_{b c d]} = 0. \end{aligned}$$
(9.3)

However, we need a formulation whereby such conservation obtains automatically, at least in principle.

We make this happen by introducing the three-dimensional matter space, the difference being that we now need two such spaces. These will be labelled by coordinates \(X^A_\mathrm {x}\), and we recall that \(A,B,C,\mathrm {etc.} = 1,2,3\). The idea is illustrated in Fig. 12, which indicates the important facts that (i) a given point in space can be intersected by each fluid’s worldline and (ii) the individual worldlines are not necessarily parallel at the intersection, i.e., the independent fluids are interpenetrating and can exhibit a relative flow with respect to each other. Although we have not indicated this in Fig. 12 (in order to keep the figure as uncluttered as possible), attached to each worldline of a given constituent will be a fixed number of particles \(N^\mathrm {x}_1\), \(N^\mathrm {x}_2\), etc. (cf. Fig. 10). For the same reason, we have also not labelled (as in Fig. 10) the “pull-backs” (represented by the arrows) from the matter spaces to spacetime.

Fig. 12 The pull-back from a point in the \({\mathrm {x}}^{ th }\)-constituent’s three-dimensional matter space (on the left) to the corresponding “fluid-particle” worldline in spacetime (on the right). The points in matter space are labelled by the coordinates \(\{X^1_{\mathrm {x}},X^2_{\mathrm {x}},X^3_{\mathrm {x}}\}\), and the constituent index \({\mathrm {x}}= \mathrm {n},\mathrm {s}\). There exist as many matter spaces as there are dynamically independent fluids, which for this case means two.

By “pushing forward” each constituent’s three-form onto its respective matter space we can once again construct three-forms that are automatically closed on spacetime, i.e., let

$$\begin{aligned} n^{\mathrm {x}}_{a b c} = \psi _{{\mathrm {x}}a}^A \psi _{{\mathrm {x}}b}^B \psi _{{\mathrm {x}}c}^C N^{\mathrm {x}}_{A B C} , \end{aligned}$$
(9.4)

where

$$\begin{aligned} \psi _{{\mathrm {x}}a}^A = {\partial X_{\mathrm {x}}^A \over \partial x^a } , \end{aligned}$$
(9.5)

and \(N^{\mathrm {x}}_{A B C}\) is completely antisymmetric in its indices and is a function only of the \(X^A_{\mathrm {x}}\). Using the same reasoning as in the single fluid case, the construction produces three-forms that are automatically closed, i.e., they satisfy Eq. (9.3) identically. If we let the scalar fields \(X^A_{\mathrm {x}}\) (as functions on spacetime) be the fundamental variables, they yield a representation for each particle number density current that is automatically conserved. The variations of the three-forms can now be derived by varying them with respect to the \(X^A_{\mathrm {x}}\).
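To make the closure property explicit, one can verify it symbolically. The sketch below works in a local coordinate patch and uses partial derivatives; since the Christoffel symbols cancel in the totally antisymmetrized derivative, this is equivalent to Eq. (9.3). The functional forms of \(X^A_{\mathrm {x}}\) and of \(N^{\mathrm {x}}_{A B C}\) are left completely arbitrary, so nothing beyond the construction (9.4) itself is assumed.

```python
import itertools
import sympy as sp

# Spacetime coordinates and three arbitrary matter-space maps X^A(x^a)
t, x, y, z = coords = sp.symbols('t x y z')
X = [sp.Function(f'X{A}')(t, x, y, z) for A in range(3)]

# N_{ABC} is totally antisymmetric on the three-dimensional matter space,
# so it is an arbitrary function of the X^A times the permutation symbol
f = sp.Function('f')(*X)

# The maps psi^A_a = dX^A/dx^a and the three-form of Eq. (9.4)
psi = [[sp.diff(X[A], xa) for xa in coords] for A in range(3)]

def n3(a, b, c):
    return sum(f * sp.LeviCivita(A, B, C) * psi[A][a] * psi[B][b] * psi[C][c]
               for A, B, C in itertools.product(range(3), repeat=3))

# In four dimensions the closure condition has a single independent
# component, proportional to epsilon^{abcd} partial_a n_{bcd}
closure = sum(sp.LeviCivita(a, b, c, d) * sp.diff(n3(b, c, d), coords[a])
              for a, b, c, d in itertools.permutations(range(4)))

print(sp.simplify(closure))   # 0: the pulled-back three-form is closed
```

The cancellation happens for two reasons: derivatives of the maps \(\psi ^A_{{\mathrm {x}}a}\) are symmetric in their spacetime indices, and the matter-space indices only take three values, so a totally antisymmetric combination of four of them must vanish.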

The Lagrangian displacements on spacetime for each fluid, to be denoted \(\xi ^a_{\mathrm {x}}\), are related to the variations \(\delta X^A_{\mathrm {x}}\) via

$$\begin{aligned} \varDelta _{\mathrm {x}}X^A = \delta X^A_{\mathrm {x}}+\xi ^a_{\mathrm {x}}\partial _a X^A_{\mathrm {x}}= \delta X^A_{\mathrm {x}}+\xi ^a_{\mathrm {x}}\psi ^A_{{\mathrm {x}}a}= 0 . \end{aligned}$$
(9.6)

In general, the various single-fluid equations we have considered are easily extended to the two-fluid case, except that each displacement and four-current will now be associated with a constituent index, using the decomposition

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a_{\mathrm {x}}, \qquad u^{\mathrm {x}}_a u^a_{\mathrm {x}}= - 1 . \end{aligned}$$
(9.7)

Associated with each constituent’s Lagrangian displacement is its own Lagrangian variation. As above, these are naturally defined to be

$$\begin{aligned} \varDelta _{\mathrm {x}}\equiv \delta + \mathcal{L}_{\xi _{\mathrm {x}}}, \end{aligned}$$
(9.8)

so that it follows that

$$\begin{aligned} \varDelta _\mathrm {x}n^{\mathrm {x}}_{a b c} = 0, \end{aligned}$$
(9.9)

as expected for the pull-back construction. Likewise, two-fluid analogues of Eqs. (6.40)–(6.42) exist which take the same form except that the constituent index is attached. However, in contrast to the ordinary fluid case, there are more options to consider. For instance, we could also look at the Lagrangian variation of the first constituent with respect to the second constituent’s flow, i.e., \(\varDelta _\mathrm {s}n_\mathrm {n}\), or the other way around, i.e., \(\varDelta _\mathrm {n}n_\mathrm {s}\). The Newtonian analogues of these Lagrangian displacements were essential to an analysis of instabilities in rotating superfluid neutron stars (Andersson et al. 2004a).

We are now in a position to construct an action principle that yields the equations of motion and the stress-energy tensor. Again, the central quantity is the matter Lagrangian \(\varLambda \), which is now a function of all the different scalars that can be formed from the \(n^a_{\mathrm {x}}\), i.e., the scalars \(n_{\mathrm {x}}\) together with

$$\begin{aligned} n^2_{{\mathrm {x}}{\mathrm {y}}} = n^2_{{\mathrm {y}}{\mathrm {x}}} = - g_{a b} n^a_\mathrm {x}n^b_\mathrm {y}. \end{aligned}$$
(9.10)

In the limit where all the currents are parallel, i.e., the fluids are comoving, \(- \varLambda \) corresponds (as before) to the local thermodynamic energy density. In the action principle, \(\varLambda \) is the Lagrangian density for the fluids.


An unconstrained variation of \(\varLambda \) with respect to the independent vectors \(n^a_{\mathrm {x}}\) and the metric \(g_{a b}\) takes the form

$$\begin{aligned} \delta \varLambda = \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} \mu ^{\mathrm {x}}_a \, \delta n^a_{\mathrm {x}}+ \frac{1}{2} \left( \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}\right) \delta g_{a b}, \end{aligned}$$
(9.11)

where

$$\begin{aligned} \mu ^{\mathrm {x}}_a= & {} \mathcal {B}^{{\mathrm {x}}} n_a^{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}} n_a^{\mathrm {y}}, \end{aligned}$$
(9.12)
$$\begin{aligned} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}= & {} \mathcal {A}^{{\mathrm {y}}{\mathrm {x}}} = - \frac{\partial \varLambda }{\partial n^2_{{\mathrm {x}}{\mathrm {y}}}}, \qquad \mathrm {for\ } {\mathrm {x}}\ne {\mathrm {y}}. \end{aligned}$$
(9.13)

The momentum covectors \(\mu ^{\mathrm {x}}_a\) are each dynamically, and thermodynamically, conjugate to their respective number density currents \(n^a_{\mathrm {x}}\), and their magnitudes are the chemical potentials. Here we note something new: the \(\mathcal {A}^{\mathrm {x}\mathrm {y}}\) coefficient represents the fact that each fluid momentum \(\mu ^{\mathrm {x}}_a\) may, in general, be given by a linear combination of the individual currents \(n^a_{\mathrm {x}}\). That is, the current and momentum for a particular fluid do not have to be parallel. This is known as the entrainment effect. We have chosen to represent it by the letter \(\mathcal {A}\) for historical reasons. When Carter first developed his formalism he opted for this notation, referring to the “anomaly” of having misaligned currents and momenta. It has since been realized that the entrainment is a key feature of most multi-fluid systems and it would, in fact, be anomalous to leave it out!

In the general case, the momentum of one constituent carries along some mass current of the other constituents. The entrainment only vanishes in the special case where \(\varLambda \) is independent of \(n^2_{{\mathrm {x}}{\mathrm {y}}}\) (\({\mathrm {x}}\ne {\mathrm {y}}\)) because then we obviously have \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}} = 0\). Entrainment is an observable effect in laboratory superfluids (Putterman 1974; Tilley and Tilley 1990) (e.g., via flow modifications in superfluid \({}^4{\mathrm {He}}\) and mixtures of superfluid \({}^3{\mathrm {He}}\) and \({}^4{\mathrm {He}}\)). In the case of neutron stars, entrainment—in this case related to the mobility of the superfluid neutrons that permeate the neutron star crust—plays a key role in the discussion of pulsar glitches (Radhakrishnan and Manchester 1969; Reichley and Downs 1969). As we will see later (in Sect. 15), these “anomalous” terms are necessary for causally well-behaved heat conduction in relativistic fluids, and by extension necessary for building well-behaved relativistic equations that incorporate dissipation (see also Andersson and Comer 2010, 2011).
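A short numerical illustration may help make the entrainment effect concrete. In the sketch below the coefficients \(\mathcal {B}^{\mathrm {x}}\) and \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\), as well as the two currents, are arbitrary placeholder values (they do not correspond to any particular equation of state). The point is simply that, as soon as \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\ne 0\), the momentum of Eq. (9.12) picks up a contribution from the other fluid's flow and is no longer parallel to its own current.

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])            # local Minkowski frame

# Neutrons at rest, entropy drifting along x (all numbers are placeholders)
W = 1.0 / np.sqrt(1.0 - 0.3**2)
n_n = 1.0 * np.array([1.0, 0.0, 0.0, 0.0])
n_s = 0.2 * np.array([W, 0.3 * W, 0.0, 0.0])

Bn, Ans = 2.0, 0.4                            # illustrative coefficients

mu_n = Bn * g @ n_n + Ans * g @ n_s           # Eq. (9.12) for the n fluid
print("n_n (lowered):", g @ n_n)              # [-1, 0, 0, 0]
print("mu_n:         ", mu_n)                 # non-zero x-component: misaligned
```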

In terms of the constrained Lagrangian displacements, a variation of \(\varLambda \) now yields

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} \frac{1}{2} \sqrt{- g} \left( \varPsi g^{a b} + \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}\right) \delta g_{a b} - \sqrt{- g} \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} f^{\mathrm {x}}_a \xi ^a_{\mathrm {x}}\nonumber \\&+ \nabla _a \left( \frac{1}{2} \sqrt{-g} \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} \xi ^d_{\mathrm {x}}\right) , \end{aligned}$$
(9.14)

where \(f^{\mathrm {x}}_a\) is as defined in Eq. (8.20) except that the individual velocities are no longer parallel. The generalized pressure \(\varPsi \) is now

$$\begin{aligned} \varPsi = \varLambda - \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^{\mathrm {x}}_a. \end{aligned}$$
(9.15)

At this point we return to the view that \(n^a_\mathrm {n}\) and \(n^a_\mathrm {s}\) are the fundamental variables. Because the \(\xi ^a_{\mathrm {x}}\) are independent variations, the equations of motion consist of the two original conservation conditions from Eq. (6.8), plus two Euler-type equations

$$\begin{aligned} f^{\mathrm {x}}_a = n_{\mathrm {x}}^b \omega ^{\mathrm {x}}_{ba} = 0 , \end{aligned}$$
(9.16)

and of course the Einstein equations (obtained exactly as before by adding in the Einstein–Hilbert term, see Sect. 4.4). We also find that the stress-energy tensor is

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + \sum _{{\mathrm {x}}= \{\mathrm {n},\mathrm {s}\}} n^a_{\mathrm {x}}\mu ^{\mathrm {x}}_b. \end{aligned}$$
(9.17)

When the complete set of field equations is satisfied, it is automatically true that \(\nabla _b T^b{}_a = 0\). One can also verify that \(T_{a b}\) is symmetric. The momentum form \(\mu ^{a b c}_{\mathrm {x}}\) entering the boundary term is the natural extension of Eq. (8.15) to this two-fluid case.
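The symmetry of \(T_{a b}\) is a direct consequence of \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}} = \mathcal {A}^{{\mathrm {y}}{\mathrm {x}}}\) in Eq. (9.12), and it is easy to confirm numerically. As before, the coefficients and currents in the following sketch are placeholders chosen only for illustration.

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])            # local Minkowski frame
n_n = np.array([1.05, 0.3, 0.0, 0.0])         # two non-parallel currents
n_s = np.array([0.55, 0.0, 0.2, 0.0])
Bn, Bs, Ans = 2.0, 1.5, 0.4                   # placeholder coefficients
Lam = -1.0                                    # placeholder value of Lambda

mu_n = Bn * g @ n_n + Ans * g @ n_s           # Eq. (9.12)
mu_s = Bs * g @ n_s + Ans * g @ n_n
Psi = Lam - (n_n @ mu_n + n_s @ mu_s)         # Eq. (9.15)

T_mixed = Psi * np.eye(4) + np.outer(n_n, mu_n) + np.outer(n_s, mu_s)   # T^a_b
T_low = g @ T_mixed                           # lower the first index: T_ab
print(np.allclose(T_low, T_low.T))            # True, because A^xy = A^yx
```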

It must be noted that Eq. (9.16) is significantly different from the multi-constituent version from Eq. (8.22). This is true even if one is solving for a static and spherically symmetric configuration, where the fluid four-velocities would all necessarily be parallel. Simply put, Eq. (9.16) represents two independent equations. If one takes entropy as an independent fluid, then the static and spherically symmetric solutions will exhibit thermal equilibrium (Comer et al. 1999). This explains, for instance, why one must specify an extra condition (e.g., convective stability; Weinberg 1972) to solve for a double-constituent star with only one four-velocity.

10 Waves in multi-fluid systems

Crucial to the understanding of black holes is the effect of spacetime curvature on the light-cone structure, that is, the null vectors that emanate from each spacetime point. Crucial to the propagation of massless fields (and gravitational waves!) is the light-cone structure. In the case of fluids, it is both the speed of light and the speed (and/or speeds) of sound that dictate how waves propagate through the matter. We have already used a local analysis of plane-wave propagation to derive the speed of sound for both the single-fluid case (in Sect. 7.2) and the two-constituent single-fluid case (in Sect. 8.2). We will now repeat the analysis for a general two-fluid system, using the same assumptions as before (see Carter 1989a for a more rigorous derivation). However, we will provide an important extension by allowing a relative flow between the two fluids in the background/equilibrium state. While this extension is straightforward, we will see that the final results are quite astonishing—demonstrating the existence of a two-stream instability.

10.1 Two-fluid case

As a reminder, we first note that the analysis is, in principle, performed in a small region (where the meaning of “small” is dictated by the particular system being studied) and we assume that the configuration of the matter with no waves present is locally isotropic, homogeneous, and static. Thus, for the background, \(n^a_{\mathrm {x}}= [n_{\mathrm {x}},0,0,0]\) and the vorticity \(\omega ^{\mathrm {x}}_{a b}\) vanishes. The linearized fluxes take the plane-wave form given in Eq. (8.27).

The two-fluid problem is qualitatively different from the previous cases, since there are now two independent currents. This impacts on the analysis in two crucial ways: (i) The Lagrangian \(\varLambda \) depends on \(n^2_{\mathrm {n}}\), \(n^2_{\mathrm {s}}\), and \(n^2_{\mathrm {n}\mathrm {s}} = n^2_{\mathrm {s}\mathrm {n}}\) (i.e., entrainment is present), and (ii) the equations of motion, after taking into account the transverse flow condition of Eq. (8.28) for both fluids, are doubled to \(\delta f^\mathrm {n}_a = 0 = \delta f^\mathrm {s}_a\). The key point is that there can be two simultaneous wave propagations, with each distinct mode having its own sound speed.

Another ramification of having two fluids is that the variation \(\delta \mu ^{\mathrm {x}}_a\) has more terms than in the previous, single-fluid analysis. There are individual fluid bulk effects, cross-constituent effects due to coupling between the fluids, and entrainment. We can isolate these various effects by writing \(\delta \mu ^{\mathrm {x}}_a\) in the form

$$\begin{aligned} \delta \mu _a^{\mathrm {x}}= \left( \mathcal {B}^{\mathrm {x}}_{a b}+ {\mathcal {A}}^{\mathrm {x}}_{a b}\right) \delta n^b_{\mathrm {x}}+ \left( \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}+ {\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\right) \delta n^b_{\mathrm {y}}. \end{aligned}$$
(10.1)

The bulk effects are contained in

$$\begin{aligned} \mathcal {B}^{\mathrm {x}}_{a b}= \mathcal {B}^{\mathrm {x}}\left( \perp ^{\mathrm {x}}_{a b} - c^2_{\mathrm {x}}u^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) , \end{aligned}$$
(10.2)

which is just the two-fluid extension of Eq. (8.30) [with \(\mathrm {n}\) replaced by \({\mathrm {x}}\) and using Eq. (8.42)]. The cross-constituent coupling enters via \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) [defined already in Eq. (8.36)]. Finally, entrainment enters through the coefficients \({\mathcal {A}}^{\mathrm {x}}_{a b}\) and \({\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) given by, respectively,

$$\begin{aligned} {\mathcal {A}}^{\mathrm {x}}_{a b}= & {} - \left[ \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} \left( u^{\mathrm {x}}_a u^{\mathrm {y}}_b + u^{\mathrm {x}}_b u^{\mathrm {y}}_a \right) + \frac{n_{\mathrm {y}}}{n_{\mathrm {x}}} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {y}}_b \right] , \end{aligned}$$
(10.3)
$$\begin{aligned} {\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\perp ^{\mathrm {x}}_{a b} \nonumber \\&- \left[ \left( \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}+ \frac{n_{\mathrm {x}}}{n_{\mathrm {y}}} \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}}\right) u^{\mathrm {x}}_a u^{\mathrm {x}}_b + \frac{n_{\mathrm {y}}}{n_{\mathrm {x}}} \mathcal {B}^{\mathrm {y}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {y}}_b + \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} u^{\mathrm {y}}_a u^{\mathrm {x}}_b\right] , \end{aligned}$$
(10.4)

where we have introduced the notation

$$\begin{aligned} \mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} \equiv n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {B}^{\mathrm {x}}}{\partial n_{{\mathrm {x}}{\mathrm {y}}}^2} , \end{aligned}$$
(10.5)

and

$$\begin{aligned} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} \equiv n_{\mathrm {x}}n_{\mathrm {y}}\frac{\partial \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\partial n_{{\mathrm {x}}{\mathrm {y}}}^2} . \end{aligned}$$
(10.6)

The same procedure as in the previous two examples—the single fluid with one and then two constituents—leads to the dispersion relation

$$\begin{aligned}&\left( \mathcal {B}^{\mathrm {n}} \sigma ^2 - \left[ \mathcal {B}^{\mathrm {n}}_{0 0} + \mathcal {A}^{\mathrm {n}\mathrm {n}}_{0 0} \right] \right) \left( \mathcal {B}^{\mathrm {s}} \sigma ^2 - \left[ \mathcal {B}^{\mathrm {s}}_{0 0} + \mathcal {A}^{\mathrm {s}\mathrm {s}}_{0 0} \right] \right) \nonumber \\&\quad - \left( \mathcal {A}^{\mathrm {n}\mathrm {s}} \sigma ^2 - \left[ \mathcal{X}^{\mathrm {n}\mathrm {s}}_{0 0} + \mathcal {A}^{\mathrm {n}\mathrm {s}}_{0 0} \right] \right) ^2 = 0 , \end{aligned}$$
(10.7)

recalling from Eq. (8.32) that \(\sigma ^2 = k^2_0/k_i k^i\). This is a quadratic in \(\sigma ^2\), meaning that there are two sound speeds. This is a natural result of the doubling of fluid degrees of freedom.
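Given numerical values for the various coefficients, extracting the two mode speeds from Eq. (10.7) amounts to solving a quadratic in \(\sigma ^2\). The following sketch does just that; the square brackets in (10.7) are lumped into single constants and all numbers are placeholders rather than values derived from an actual equation of state.

```python
import numpy as np

Bn, Bs, Ans = 2.0, 1.5, 0.3    # B^n, B^s and the entrainment coefficient A^ns
Cn = 0.5                       # stands for [ B^n_00 + A^nn_00 ]
Cs = 0.4                       # stands for [ B^s_00 + A^ss_00 ]
Cns = 0.1                      # stands for [ X^ns_00 + A^ns_00 ]

# Expanding Eq. (10.7) gives  a2 (sigma^2)^2 + a1 sigma^2 + a0 = 0
a2 = Bn * Bs - Ans**2
a1 = -(Bn * Cs + Bs * Cn) + 2.0 * Ans * Cns
a0 = Cn * Cs - Cns**2

sigma2 = np.roots([a2, a1, a0])
print("sigma^2 roots:", sigma2)            # two roots, i.e., two sound speeds
print("mode speeds:  ", np.sqrt(sigma2))   # real provided both roots are positive
```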

To finish this discussion of local mode solutions in the two-fluid problem, it is useful to consider what constraints the simplest case of zero interaction imposes on the equation of state. The dispersion relation becomes simply

$$\begin{aligned} (\sigma ^2 - c_\mathrm {n}^2)(\sigma ^2 - c_\mathrm {s}^2) = 0 , \end{aligned}$$
(10.8)

so the mode speed solutions \(\sigma _\mathrm {n}\) and \(\sigma _\mathrm {s}\) are

$$\begin{aligned} \sigma ^2_\mathrm {n}= c^2_\mathrm {n}= 1+ \frac{\partial \log \mathcal {B}^\mathrm {n}}{\partial \log n_\mathrm {n}} , \quad \sigma ^2_\mathrm {s}= c^2_\mathrm {s}= 1+ \frac{\partial \log \mathcal {B}^\mathrm {s}}{\partial \log n_\mathrm {s}} . \end{aligned}$$
(10.9)

The constraints of absolute stability and causality imply that \(\varLambda \) must be such that

$$\begin{aligned} - 1 \le \frac{\partial \log \mathcal {B}^\mathrm {n}}{\partial \log n} \le 0 , \quad - 1 \le \frac{\partial \log \mathcal {B}^\mathrm {s}}{\partial \log s} \le 0 . \end{aligned}$$
(10.10)

A general analysis which retains both entrainment and cross-constituent coupling has been performed by Samuelsson et al. (2010).

While the sound speed analysis is local, the doubling of the fluid degrees of freedom naturally carries over to the global scale relevant for the analysis of modes of oscillation of a fluid body.


10.2 The two-stream instability

Consider a system having two components between which there can be a relative flow, such as ions and electrons in a plasma, entropy and matter in a superfluid, or even the rotation of a neutron star as viewed from asymptotically flat infinity. If the relative flow reaches a speed where a mode in one of the components looks like it is going one direction with respect to that component, but the opposite direction with respect to the other component, then the mode will have a negative energy and become dynamically unstable. This kind of “two-stream” instability has a long history of investigation in the area of plasma physics (see Farley 1963; Buneman 1963). The Chandrasekhar–Friedman–Schutz (CFS) instability (Chandrasekhar 1970; Friedman and Schutz 1978a, b) (already discussed in Sect. 7.4) develops when a mode in a rotating star appears to be retrograde with respect to the star itself, and yet prograde with respect to an observer at infinity. The possible link between two-stream instability in the superfluid in the inner crust and pulsar glitches is more recent (Andersson et al. 2003, 2004b). Another relevant discussion considers a cosmological model consisting of a relative flow between matter and blackbody radiation (Comer et al. 2012a). Two-stream instability between two relativistic fluids in the linear regime has been examined in general by Samuelsson et al. (2010), and extended to the non-linear regime by Hawke et al. (2013). Finally, a discussion on the relationship between energetic and dynamical instabilities, starting from a Lagrangian for two complex scalar fields, was provided by Haber et al. (2016).

Repeating the key steps from Samuelsson et al. (2010), we start with a system having plane-wave propagation (as before, in a locally flat region of spacetime) on backgrounds such that \(\omega ^{\mathrm {x}}_{a b} = 0\). The various background quantities are considered constant, and there is a relative flow between the fluids. As in the previous sound-speed analyses, we let \(u^a_{\mathrm {x}}\) represent the background four-velocity of the \({\mathrm {x}}\)-fluid. Its total particle flux then takes the form

$$\begin{aligned} n^a_{\mathrm {x}}= n_{\mathrm {x}}u^a_{\mathrm {x}}+ A^a_{\mathrm {x}}\, e^{i k_b x^b} . \end{aligned}$$
(10.11)

Because \(\omega ^{\mathrm {x}}_{a b} = 0\) for the background and there is flux conservation, the analysis still leads to the linearized equations:

$$\begin{aligned} \nabla _a \delta n^a_{\mathrm {x}}= 0 , \quad n_\mathrm {x}^a \nabla _{[a}\delta \mu ^{\mathrm {x}}_{b]} = 0 . \end{aligned}$$
(10.12)

The variation \(\delta \mu ^{\mathrm {x}}_a\) is the same as in Eq. (10.1).

However, the system flow is now such that \(u^a_{\mathrm {x}}\) does not equal \(u^a_{\mathrm {y}}\), the \({\mathrm {y}}\)-fluid four-velocity. There is a non-zero relative velocity of, say, the \({\mathrm {y}}\)-fluid with respect to the \({\mathrm {x}}\)-fluid given by

$$\begin{aligned} \gamma _{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {x}}{\mathrm {y}}} = \perp ^{{\mathrm {x}}a}_b u^b_{\mathrm {y}}, \end{aligned}$$
(10.13)

where \(v_{{\mathrm {x}}{\mathrm {y}}} = v_{{\mathrm {y}}{\mathrm {x}}}\) represents the magnitude of the relative flow,

$$\begin{aligned} \perp ^{{\mathrm {x}}b}_a = \delta _a{}^b + u^{\mathrm {x}}_a u^b_{\mathrm {x}}, \quad \perp ^{{\mathrm {x}}b}_a u^a_{\mathrm {x}}= 0 , \end{aligned}$$
(10.14)

and

$$\begin{aligned} \gamma _{{\mathrm {x}}{\mathrm {y}}} = \gamma _{{\mathrm {y}}{\mathrm {x}}} = - u^c_{\mathrm {x}}u^{\mathrm {y}}_c = \frac{1}{\sqrt{1 - v^2_{{\mathrm {x}}{\mathrm {y}}}}} . \end{aligned}$$
(10.15)

This leads to (adapting (5.9) to the present context)

$$\begin{aligned} u^a_{\mathrm {y}}= \gamma _{{\mathrm {x}}{\mathrm {y}}} \left( u^a_{\mathrm {x}}+ v^a_{{\mathrm {x}}{\mathrm {y}}}\right) . \end{aligned}$$
(10.16)

For convenience, we will work in the material frame associated with the x fluid component, meaning that \(k_a\) and \(A^a_{\mathrm {x}}\) will be decomposed into timelike and spatial pieces as defined locally by \(u^a_{\mathrm {x}}\). For \(k_a\) we write

$$\begin{aligned} k_a = k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}u^{\mathrm {x}}_a + \hat{k}^{\mathrm {x}}_a\right) , \end{aligned}$$
(10.17)

where \(\sigma _{\mathrm {x}}\), \(k_{\mathrm {x}}\), and the unit wave vector \(\hat{k}^{\mathrm {x}}_a\) are obtained from \(k_a\) via

$$\begin{aligned} k_{\mathrm {x}}\sigma _{\mathrm {x}}= - k_a u^a_{\mathrm {x}}, \quad k^a k_a = k^2_{\mathrm {x}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) , \quad \hat{k}^{\mathrm {x}}_a = \frac{1}{k_{\mathrm {x}}} \perp ^b_{{\mathrm {x}}a} k_b. \end{aligned}$$
(10.18)

Similarly, the wave amplitude \(A^a_{\mathrm {x}}\) becomes

$$\begin{aligned} A^a_{\mathrm {x}}= A^{\mathrm {x}}_{||} u^a_{\mathrm {x}}+ A_{{\mathrm {x}}\perp }^a , \end{aligned}$$
(10.19)

where

$$\begin{aligned} A^{\mathrm {x}}_{||} = - u^{\mathrm {x}}_a A^a_{\mathrm {x}}, \quad A_{{\mathrm {x}}\perp }^a = \perp ^a_{{\mathrm {x}}b} A^b_{\mathrm {x}}. \end{aligned}$$
(10.20)

It is necessary to point out that the three quantities \(\sigma _{\mathrm {x}}\), \(k^{\mathrm {x}}_a\), and \(v^a_{{\mathrm {x}}{\mathrm {y}}}\) are determined by an observer moving along with the \({\mathrm {x}}\)-fluid. Of course, we could choose the frame attached to the other fluid. Fortunately, there are well-defined transformations between the two frames, which we determine as follows: The relative flow \(v^a_{{\mathrm {y}}{\mathrm {x}}}\) of the \({\mathrm {x}}^\mathrm{th}\)-fluid with respect to the \({\mathrm {y}}^\mathrm{th}\)-fluid frame is related to \(v^a_{{\mathrm {x}}{\mathrm {y}}}\) via

$$\begin{aligned} v^a_{{\mathrm {y}}{\mathrm {x}}} = - \gamma _{{\mathrm {x}}{\mathrm {y}}} \left( v^2_{{\mathrm {x}}{\mathrm {y}}} u^a_{\mathrm {x}}+ v^a_{{\mathrm {x}}{\mathrm {y}}}\right) , \end{aligned}$$
(10.21)

using the fact that \(v_{{\mathrm {y}}{\mathrm {x}}} = v_{{\mathrm {x}}{\mathrm {y}}}\). Since \(k_a\) is a tensor, we must have

$$\begin{aligned} k_a = k_{\mathrm {y}}\left( \sigma _{\mathrm {y}}u^{\mathrm {y}}_a + \hat{k}^{\mathrm {y}}_a\right) = k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}u^{\mathrm {x}}_a + \hat{k}^{\mathrm {x}}_a\right) . \end{aligned}$$
(10.22)

Noting that

$$\begin{aligned} u^a_{\mathrm {x}}= & {} - v^{- 2}_{{\mathrm {x}}{\mathrm {y}}} \left( v^a_{{\mathrm {x}}{\mathrm {y}}} + \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {y}}{\mathrm {x}}}\right) , \end{aligned}$$
(10.23)
$$\begin{aligned} u^a_{\mathrm {y}}= & {} - v^{- 2}_{{\mathrm {x}}{\mathrm {y}}} \left( v^a_{{\mathrm {y}}{\mathrm {x}}} + \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} v^a_{{\mathrm {x}}{\mathrm {y}}}\right) , \end{aligned}$$
(10.24)

and contracting each with the wave-vector \(k_a\), we obtain the matrix equation

$$\begin{aligned} \left[ \begin{array}{cc} v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}- \cos \theta _{{\mathrm {x}}{\mathrm {y}}} &{} - \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \\ - \gamma ^{- 1}_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {x}}{\mathrm {y}}} &{} v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {y}}- \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \end{array}\right] \left[ \begin{array}{c} k_{\mathrm {x}}\\ k_{\mathrm {y}}\end{array}\right] = \left[ \begin{array}{c} 0 \\ 0 \end{array}\right] . \end{aligned}$$
(10.25)

The non-trivial solution requires that the determinant of the \(2 \times 2\) matrix vanishes; therefore,

$$\begin{aligned} \sigma _{\mathrm {y}}= \cos \theta _{{\mathrm {y}}{\mathrm {x}}} \frac{\sigma _{\mathrm {x}}- v_{{\mathrm {x}}{\mathrm {y}}} \cos \theta _{{\mathrm {x}}{\mathrm {y}}}}{v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}- \cos \theta _{{\mathrm {x}}{\mathrm {y}}}} . \end{aligned}$$
(10.26)

It is not difficult to show that if \(\sigma ^2_{\mathrm {x}}\le 1\) then \(\sigma ^2_{\mathrm {y}}\le 1\), and clearly if \(\sigma _{\mathrm {x}}\) is real then so is \(\sigma _{\mathrm {y}}\).

The equation of flux conservation is the same as (8.28) (except \({\mathrm {x}}\) ranges over two values). Here, it implies for each mode that

$$\begin{aligned} - \sigma _{\mathrm {x}}A^{\mathrm {x}}_{||} + \hat{k}^{\mathrm {x}}_a A_{{\mathrm {x}}\perp }^a = 0 . \end{aligned}$$
(10.27)

The two-fluid Euler equations become

$$\begin{aligned} 0= & {} K^{\mathrm {x}}_{a b} A^b_{\mathrm {x}}+ K^{{\mathrm {x}}{\mathrm {y}}}_{a b} A^b_{\mathrm {y}}, \end{aligned}$$
(10.28)
$$\begin{aligned} 0= & {} K^{\mathrm {y}}_{a b} A^b_{\mathrm {y}}+ K^{{\mathrm {y}}{\mathrm {x}}}_{a b} A^b_{\mathrm {x}}, \end{aligned}$$
(10.29)

where the “dispersion” tensors are

$$\begin{aligned} K^{\mathrm {x}}_{a b}= & {} n^c_{\mathrm {x}}\left( k_{[c} \mathcal {B}^{\mathrm {x}}_{a]b} + k_{[c} \mathcal {A}^{\mathrm {x}}_{a]b} \right) , \end{aligned}$$
(10.30)
$$\begin{aligned} K^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} n^c_{\mathrm {x}}\left( k_{[c} \mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a]b} + k_{[c}\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{a]b}\right) . \end{aligned}$$
(10.31)

Note that \(K^{\mathrm {y}}_{a b}\) and \(K^{{\mathrm {y}}{\mathrm {x}}}_{a b}\) are obtained via the interchange of \({\mathrm {x}}\leftrightarrow {\mathrm {y}}\) in (10.30) and (10.31).

The general solution requires, say, using Eq. (10.29) to determine \(A^a_{\mathrm {y}}\) and then substituting the result into Eq. (10.28). This means we need the four inverses

$$\begin{aligned} \tilde{K}^{a c}_{\mathrm {x}}K^{\mathrm {x}}_{c b} = \delta ^a{}_b , \quad \tilde{K}^{a c}_{{\mathrm {y}}{\mathrm {x}}} K^{{\mathrm {x}}{\mathrm {y}}}_{c b} = \delta ^a{}_b . \end{aligned}$$
(10.32)

With these in hand, we can write

$$\begin{aligned} 0 = \left( \tilde{K}^{a c}_{\mathrm {y}}K^{{\mathrm {y}}{\mathrm {x}}}_{c b} - \tilde{K}^{a c}_{{\mathrm {y}}{\mathrm {x}}} K^{\mathrm {x}}_{c b}\right) A^b_{\mathrm {x}}\equiv \mathcal{M}^a{}_b A^b_{\mathrm {x}}. \end{aligned}$$
(10.33)

Having a non-trivial solution requires that \(k_a\) be such that \(\det \mathcal {M}^a_{\ b} = 0 \). However, the examples which follow will be kept simple enough that the general procedure will not be required. For example, we will focus on the case of aligned flows.

Samuelsson et al. (2010) have shown that the relative flow between the two fluids enters through the inner product \(\hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} \hat{k}^{\mathrm {x}}_a\) (where \(\hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} = v^a_{{\mathrm {x}}{\mathrm {y}}}/v_{{\mathrm {x}}{\mathrm {y}}}\)), and so it is natural to introduce the angle \(\theta _{{\mathrm {x}}{\mathrm {y}}}\) between the two vectors. This means that the inner product becomes

$$\begin{aligned} \hat{v}^a_{{\mathrm {x}}{\mathrm {y}}} \hat{k}^{\mathrm {x}}_a = \cos \theta _{{\mathrm {x}}{\mathrm {y}}} . \end{aligned}$$
(10.34)

Having an aligned flow means, say, setting \(\theta _{{\mathrm {x}}{\mathrm {y}}} = 0\) and \(\theta _{{\mathrm {y}}{\mathrm {x}}} = \pi \). The wave vector takes the form

$$\begin{aligned} k^a = \frac{1}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( k_{\mathrm {x}}u^a_{\mathrm {y}}- k_{\mathrm {y}}u^a_{\mathrm {x}}\right) , \end{aligned}$$
(10.35)

and the flux conservation becomes

$$\begin{aligned} k_{\mathrm {x}}u_a^{\mathrm {y}}A^a_{\mathrm {x}}= k_{\mathrm {y}}u_a^{\mathrm {x}}A^a_{\mathrm {x}}. \end{aligned}$$
(10.36)

This, in turn, implies that the problem is reduced from four equations with four unknowns to a much simpler \(2\times 2\) system. Finally, we note that Eqs. (10.22) and (10.26) imply, respectively,

$$\begin{aligned} \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} = \sqrt{\frac{1- \sigma ^2_{\mathrm {x}}}{1 - \sigma ^2_{\mathrm {y}}}} \end{aligned}$$
(10.37)

and

$$\begin{aligned} \sigma _{\mathrm {y}}= \frac{\sigma _{\mathrm {x}}- v_{{\mathrm {x}}{\mathrm {y}}}}{1 - v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}} . \end{aligned}$$
(10.38)

It will prove useful later to note that this last result implies

$$\begin{aligned} 1 - \sigma ^2_{\mathrm {y}}= \frac{1}{\gamma ^2_{{\mathrm {x}}{\mathrm {y}}}} \frac{1 - \sigma ^2_{\mathrm {x}}}{\left( 1 - v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}\right) ^2} \end{aligned}$$
(10.39)

and therefore

$$\begin{aligned} \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} = \gamma _{{\mathrm {x}}{\mathrm {y}}} \sqrt{\left( 1- v_{{\mathrm {x}}{\mathrm {y}}} \sigma _{\mathrm {x}}\right) ^2} . \end{aligned}$$
(10.40)
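Since the algebra connecting Eqs. (10.37)–(10.40) is easy to get wrong, a quick symbolic check may be reassuring (a minimal sympy sketch; nothing in it is specific to the physics):

```python
import sympy as sp

sigma_x, v = sp.symbols('sigma_x v', real=True)
gamma2 = 1 / (1 - v**2)                          # gamma_xy squared

# Eq. (10.38): aligned-flow transformation of the mode speed
sigma_y = (sigma_x - v) / (1 - v * sigma_x)

# Eq. (10.39): 1 - sigma_y^2 = (1 - sigma_x^2) / [ gamma^2 (1 - v sigma_x)^2 ]
print(sp.simplify((1 - sigma_y**2)
                  - (1 - sigma_x**2) / (gamma2 * (1 - v * sigma_x)**2)))  # 0

# Eq. (10.40) squared: (k_y/k_x)^2 = (1 - sigma_x^2)/(1 - sigma_y^2)
print(sp.simplify((1 - sigma_x**2) / (1 - sigma_y**2)
                  - gamma2 * (1 - v * sigma_x)**2))                       # 0
```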

Another place where we will simplify the analysis is the choice of equation of state; namely, to consider forms with just enough complexity in the \(\mathcal {B}^{\mathrm {x}}_{a b}\), \(\mathcal {A}^{\mathrm {x}}_{a b}\), \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\), and \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{a b}\) coefficients to establish the main feature we are interested in: the two-stream instability. Obviously, any fluid must have non-zero bulk properties; the other two properties of entrainment and cross-constituent coupling depend on the particular features of the fluid system incorporated into the equation of state. We will first consider the case where only bulk features are present and then follow this up by incorporating entrainment.

Let us first set both the entrainment and cross-constituent coupling to zero. This implies \(K^{{\mathrm {x}}{\mathrm {y}}}_{a b} = 0\) and the mode equations are

$$\begin{aligned} 0= & {} K^{\mathrm {x}}_{a b} A^b_{\mathrm {x}}= - \frac{1}{2} \mathcal {B}^{\mathrm {x}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}, \end{aligned}$$
(10.41)
$$\begin{aligned} 0= & {} K^{\mathrm {y}}_{a b} A^b_{\mathrm {y}}= - \frac{1}{2} \mathcal {B}^{\mathrm {y}}n_{\mathrm {y}}k_{\mathrm {y}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + c^2_{\mathrm {y}}\hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {y}}. \end{aligned}$$
(10.42)

We contract each mode equation with \(k_a\) to find

$$\begin{aligned} 0 = \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) A^{\mathrm {x}}_{||} , \quad 0 = \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) A^{\mathrm {y}}_{||} , \end{aligned}$$
(10.43)

and, as the solution reduces to the \(2\times 2\) matrix problem

$$\begin{aligned} \left[ \begin{array}{cc} \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) &{} 0 \\ 0 &{} \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) \end{array}\right] \left[ \begin{array}{c} A^{\mathrm {x}}_{||} \\ A^{\mathrm {y}}_{||} \end{array}\right] = \left[ \begin{array}{c} 0 \\ 0 \end{array}\right] , \end{aligned}$$
(10.44)

it is easy to see that the resulting dispersion relation is

$$\begin{aligned} \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) = 0 . \end{aligned}$$
(10.45)

The modes of this system are the “bare” sound waves with speeds \(c_{{\mathrm {x}}}\) or \(c_{\mathrm {y}}\), as one would have expected. There are no interactions between the two fluids and so there is no sense in which they “see” each other. Generally, we conclude that the existence of a two-stream instability requires more than just a background relative flow. Some coupling agent is required.

With this in mind, we include coupling via entrainment. As we are ignoring the cross-constituent coupling term we still have \(\mathcal {X}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= 0\). The simplest inclusion of entrainment is to set \(\mathcal {B}^{\mathrm {x}}_{, {\mathrm {x}}{\mathrm {y}}} = 0\) and \(\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}_{, {\mathrm {x}}{\mathrm {y}}} = 0\). This means \({\mathcal {A}}^{\mathrm {x}}_{a b}= 0\), \({\mathcal {A}}^{{\mathrm {x}}{\mathrm {y}}}_{a b}= \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}g_{a b}\), and therefore

$$\begin{aligned} K^{\mathrm {x}}_{a b}= & {} - \frac{1}{2} \mathcal {B}^{\mathrm {x}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) , \end{aligned}$$
(10.46)
$$\begin{aligned} K^{{\mathrm {x}}{\mathrm {y}}}_{a b}= & {} - \frac{1}{2} \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}n_{\mathrm {x}}k_{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + \hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) . \end{aligned}$$
(10.47)

The mode equations then become

$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + c^2_{\mathrm {x}}\hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} + \hat{k}^{\mathrm {x}}_a u^{\mathrm {x}}_b\right) A^b_{\mathrm {y}}, \end{aligned}$$
(10.48)
$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {y}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + c^2_{\mathrm {y}}\hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {y}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( \sigma _{\mathrm {y}}\perp ^{\mathrm {y}}_{a b} + \hat{k}^{\mathrm {y}}_a u^{\mathrm {y}}_b\right) A^b_{\mathrm {x}}. \end{aligned}$$
(10.49)

By contracting each with \(k_a\), using Eqs. (10.35) and (10.36), we get

$$\begin{aligned} 0= & {} \frac{1}{k_{\mathrm {x}}}\left\{ \mathcal {B}^{\mathrm {x}}\left( \sigma _{\mathrm {x}}\perp ^{\mathrm {x}}_{a b} k^a + c^2_{\mathrm {x}}k_{\mathrm {x}}u^{\mathrm {x}}_b\right) A^b_{\mathrm {x}}\right. \nonumber \\&\left. + \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left[ \sigma _{\mathrm {x}}k_a + k_{\mathrm {x}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) u^{\mathrm {x}}_a\right] A^a_{\mathrm {y}}\right\} \nonumber \\= & {} \mathcal {B}^{\mathrm {x}}\left[ \frac{\sigma _{\mathrm {x}}}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} - \gamma _{{\mathrm {x}}{\mathrm {y}}} \right) + c^2_{\mathrm {x}}\right] u^{\mathrm {x}}_a A^a_{\mathrm {x}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( 1 - \sigma ^2_{\mathrm {x}}\right) \frac{k_{\mathrm {x}}}{k_{\mathrm {y}}} u^{\mathrm {y}}_a A^a_{\mathrm {y}}, \end{aligned}$$
(10.50)
$$\begin{aligned} 0= & {} \mathcal {B}^{\mathrm {y}}\left[ \frac{\sigma _{\mathrm {y}}}{\gamma _{{\mathrm {x}}{\mathrm {y}}} v_{{\mathrm {x}}{\mathrm {y}}}} \left( \frac{k_{\mathrm {x}}}{k_{\mathrm {y}}} - \gamma _{{\mathrm {x}}{\mathrm {y}}} \right) + c^2_{\mathrm {y}}\right] u^{\mathrm {y}}_a A^a_{\mathrm {y}}+ \mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}\left( 1 - \sigma ^2_{\mathrm {y}}\right) \frac{k_{\mathrm {y}}}{k_{\mathrm {x}}} u^{\mathrm {x}}_a A^a_{\mathrm {x}}. \end{aligned}$$
(10.51)

The dispersion relation now becomes

$$\begin{aligned} 0 = \left( \sigma ^2_{\mathrm {x}}- c^2_{\mathrm {x}}\right) \left( \sigma ^2_{\mathrm {y}}- c^2_{\mathrm {y}}\right) - \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 \left( 1 - \sigma ^2_{\mathrm {x}}\right) \left( 1 - \sigma ^2_{\mathrm {y}}\right) . \end{aligned}$$
(10.52)

This can be rewritten in a form more useful for numerical solutions; namely,

$$\begin{aligned} 0 = \left( x^2 - b^2\right) \left[ \left( x - y\right) ^2 - \left( 1 - c_{\mathrm {y}}^2 y x\right) ^2\right] - a^2 \frac{\left( 1 - c^2_{\mathrm {y}}x^2\right) ^2}{\gamma ^2_{{\mathrm {x}}{\mathrm {y}}}} , \end{aligned}$$
(10.53)

where \(x = \sigma _{\mathrm {x}}/c_{\mathrm {y}}\), \(y = v_{{\mathrm {x}}{\mathrm {y}}}/c_{\mathrm {y}}\), \(b = c_{\mathrm {x}}/c_{\mathrm {y}}\), and

$$\begin{aligned} a^2 = \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{c^2_{\mathrm {y}}\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 . \end{aligned}$$
(10.54)

The immediate thing to note is that the relative speed changes the equation from a quadratic in \(\sigma ^2_{\mathrm {x}}\) to being fully quartic in \(\sigma _{\mathrm {x}}\); thus, complex (unstable) solutions become possible. The question is whether the imaginary contributions can be realized for physical parameters. Recall that this means the system must exhibit absolute stability and causality. Samuelsson et al. (2010) have shown that these are guaranteed when

$$\begin{aligned} 0 \le \left( \frac{\mathcal {A}^{{\mathrm {x}}{\mathrm {y}}}}{\sqrt{\mathcal {B}^{\mathrm {x}}\mathcal {B}^{\mathrm {y}}}}\right) ^2 \le c^2_{\mathrm {x}}c^2_{\mathrm {y}}\quad \Longrightarrow \quad a^2 \le b^2 . \end{aligned}$$
(10.55)

In the Newtonian limit the dispersion relation takes the same mathematical form for entrainment as it does for non-zero cross-constituent coupling; namely,

$$\begin{aligned} \frac{\left( x^2 - b^2\right) }{a^2} \left[ \left( x - y\right) ^2 - 1\right] = 1 . \end{aligned}$$
(10.56)

As this is quartic in x, the exact solutions are known. However, they are quite tedious and their main use is to serve as the basis for numerical evaluations of the modes. A basic algorithm would be to fix a and b, subject to the constraint in Eq. (10.55), and then evaluate the real and imaginary parts of \(\sigma _{\mathrm {x}}\) as functions of y. The end result of this process is to reveal that the instability exists in a “window” of y-values (Andersson et al. 2003, 2004b; Samuelsson et al. 2010). As an illustration, we may consider the example from Andersson et al. (2004b), illustrated in Fig. 13. A more recent study (Andersson and Schmitt 2019), in the framework of relativity, highlights the fact that the system will be prone to an energy instability (closely related to the CFS instability from Sect. 7.4, as it sets in at the point where originally backwards moving modes are dragged forwards by the background flow). As indicated by the left panel of Fig. 13, this energy instability tends to set in before the system suffers the (dynamical) two-stream instability.
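The algorithm sketched above is straightforward to implement. The following Python snippet scans the scaled relative flow y for the Newtonian-limit dispersion relation (10.56), using the parameter values quoted in the caption of Fig. 13, and flags the values of y for which a complex-conjugate pair of roots (and hence a dynamical two-stream instability) appears. The output should roughly reproduce the instability window indicated in the figure.

```python
import numpy as np

a2, b2 = 0.0249, 0.0379     # the a^2 and b^2 values used in Fig. 13

def mode_roots(y):
    # Eq. (10.56), (x^2 - b^2)[(x - y)^2 - 1] = a^2, written as a quartic in x
    coeffs = [1.0,
              -2.0 * y,
              y**2 - 1.0 - b2,
              2.0 * b2 * y,
              -b2 * (y**2 - 1.0) - a2]
    return np.roots(coeffs)

for y in np.linspace(0.0, 2.0, 21):
    growth = np.max(np.abs(mode_roots(y).imag))
    flag = "  <-- two-stream unstable" if growth > 1e-8 else ""
    print(f"y = {y:4.2f}   max |Im x| = {growth:.4f}{flag}")
```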

Fig. 13 An illustration of the two-stream instability, showing the real (left panel) and imaginary (right panel) parts of the four roots of the dispersion relation for the model parameters (\(a^2=0.0249\) and \(b^2=0.0379\)) used in Andersson et al. (2004b). For these parameters the quartic dispersion relation has four real roots for both \(y=0\) and \(y=2\), while it has two real roots and a complex conjugate pair for y in the range \(0.6<y<1.5\). In this range, the two-stream instability is active. (Image reproduced with permission from Andersson et al. 2004b; copyright by RAS.)

Finally, let us take the opportunity to note that the relativistic two-stream instability has also been analyzed in the non-linear regime (Hawke et al. 2013). This first nonlinear numerical simulation of the effect in relativistic multi-species hydrodynamical systems shows that the onset and initial growth of the instability closely match the results of linear perturbation theory. In the later stages of the evolution, however, the linear and nonlinear descriptions agree only qualitatively. The main conclusion is that the instability is not saturated in the nonlinear regime by purely ideal hydrodynamic effects.

11 Numerical simulations: fluid dynamics in a live spacetime

Many astrophysical phenomena involve violent nonlinear matter dynamics. Such systems cannot (meaningfully) be described within perturbation theory. Instead, the modelling requires fully nonlinear—and multi-dimensional, given the lack of symmetry of (say) turbulent flows—simulations, taking into account the live spacetime of General Relativity. The last decades have seen considerable progress in the development of the relevant computational tools, especially for gravitational-wave sources like supernova core collapse (Müller 2016) and neutron star mergers (Baiotti and Rezzolla 2017). The state-of-the-art technology includes the consideration of fairly sophisticated matter models. In the case of supernova modelling, neutrinos are expected to play an important role in triggering the explosion (Janka 2012) and the role of magnetic fields may also be significant (Mösta et al. 2015). Meanwhile, for neutron star mergers, finite temperature effects are central as shock heating ramps up the temperature of the merged object to levels beyond those expected even during core collapse (see, e.g., Bauswein et al. 2010 or Kastaun and Galeazzi 2015). Magnetic fields are expected to have a decisive impact on the post-merger dynamics and are likely to leave an observational signature, e.g., in terms of short gamma-ray bursts (e.g., Kumar and Zhang 2015).

11.1 Spacetime foliation

We have already explored some aspects of the problem (like the thermodynamics and the matter equation of state, see Sect. 2) and we have considered features that arise in models of increasing complexity (in particular when we need to account for the relative flow of distinct fluid components). So far, the discussion has assumed a fibration of spacetime associated with a family of fluid observers. This approach is natural if one is mainly interested in the local fluid dynamics (e.g., wave propagation) and it also leads to the 1+3 formulation often used in cosmology (where “clocks” associated with the fluid observers define the notion of cosmic time), see Barrow et al. (2007) for a relevant discussion. The strategy is, however, not natural for numerical simulations with a live spacetime. Instead, most such work makes use of a 3+1 spacetime foliation (see Baumgarte and Shapiro 2003 for a relevant discussion), where progression towards the “future” is associated with a set of Eulerian observers. Hence, we need to understand how we extend the multifluid model from fibration to foliation.

The standard approach to numerical simulations takes as its starting point a “foliation” of spacetime into a family of spacelike hypersurfaces, \(\varSigma _t\), which arise as level surfaces of a scalar time t (see, e.g., Alcubierre 2008). Given the normal to this surface

$$\begin{aligned} N_a = - \alpha \nabla _ a t , \end{aligned}$$
(11.1)

where the function \(\alpha \) is known as the lapse, we have

$$\begin{aligned} N_a = (-\alpha ,0,0,0) , \end{aligned}$$
(11.2)

and the normalisation \(N_a N^a=-1\) (we are thinking of the normal as associated with an observer moving through spacetime in the usual way) leads to \(\alpha ^2 = -1/g^{tt}\). The sign in (11.1) ensures that time flows into the future. The dual to \(\nabla _a t\) leads to a time vector

$$\begin{aligned} t^a = \alpha N^a + \beta ^a , \end{aligned}$$
(11.3)

where the so-called shift vector \(\beta ^a\) is spatial, in the sense that \(N_a \beta ^a = 0\). It follows that

$$\begin{aligned} N^a = \alpha ^{-1} ( 1,-\beta ^i) , \end{aligned}$$
(11.4)

and the spacetime can be written in the Arnowitt–Deser–Misner (ADM) form (Arnowitt et al. 2008; York 1979):

$$\begin{aligned} ds^2 = - \alpha ^2 dt^2 + \gamma _{ij} \left( dx^i + \beta ^i dt \right) \left( dx^j + \beta ^j dt \right) , \end{aligned}$$
(11.5)

where the (induced) metric on the spacelike hypersurface is

$$\begin{aligned} \gamma _{ab} = g_{ab} + N_a N_b . \end{aligned}$$
(11.6)

Note that \(\gamma ^a_b\) represents the projection orthogonal to \(N_a\) and that \(\gamma _{ab}\) and its inverse can be used to raise and lower indices of purely spatial tensors. For example, we have \(\beta _i = \gamma _{ij} \beta ^j\).

In essence, the lapse \(\alpha \) determines the rate at which proper time advances from one time slice to the next, along the normal \(N_a\), while the vector \(\beta ^i\) determines how the coordinates shift from one spatial slice to the next. This is illustrated in Fig. 14. The two functions encode the coordinate freedom of General Relativity.

Fig. 14 An illustration of the two formulations for the relativistic fluid problem. The fibration approach, which focuses on the worldline associated with a given fluid element (and a four velocity \(\varvec{u}\) with components \(u^a\)), provides a natural description of the microphysics and issues relating to thermodynamics. Meanwhile, a spacetime foliation, based on the use of spatial slices and normal observers (with the coordinate freedom encoded in the lapse \(\alpha \) and the shift vector \(\beta ^i\)), is typically used in numerical simulations. In order to ensure that the local physics is appropriately implemented in simulations, we need to understand the translation between the two descriptions.

Reading off the metric from the line element, we have

$$\begin{aligned} g_{ab} = \left( \begin{array}{cc} -\alpha ^2 + \beta _i \beta ^i &{} \beta _i \\ \beta _i &{} \gamma _{ij} \end{array} \right) , \end{aligned}$$
(11.7)

with inverse

$$\begin{aligned} g^{ab} = \left( \begin{array}{cc} -1/\alpha ^2 &{} \beta ^i/\alpha ^2 \\ \beta ^i/\alpha ^2 &{} \gamma ^{ij}-\beta ^i \beta ^j /\alpha ^2 \end{array} \right) . \end{aligned}$$
(11.8)
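
Since the ADM form is used heavily in what follows, a quick numerical sanity check may be helpful. The following sketch (purely illustrative; the lapse, shift and spatial metric values are arbitrary) assembles \(g_{ab}\) from (11.7), builds \(g^{ab}\) from (11.8) and verifies that the two are indeed inverses, along with the determinant relation \((-g)^{1/2}=\alpha \gamma ^{1/2}\) used later.

```python
import numpy as np

# Arbitrary (illustrative) values for the lapse, shift and spatial metric.
alpha = 1.3
beta_up = np.array([0.2, -0.1, 0.05])          # beta^i
gamma = np.array([[1.2, 0.1, 0.0],
                  [0.1, 1.0, 0.2],
                  [0.0, 0.2, 0.9]])            # gamma_ij (symmetric, positive definite)
gamma_inv = np.linalg.inv(gamma)               # gamma^ij
beta_dn = gamma @ beta_up                      # beta_i = gamma_ij beta^j

# Four-metric g_ab in ADM form, Eq. (11.7).
g = np.zeros((4, 4))
g[0, 0] = -alpha**2 + beta_dn @ beta_up
g[0, 1:] = g[1:, 0] = beta_dn
g[1:, 1:] = gamma

# Inverse four-metric, Eq. (11.8).
g_inv = np.zeros((4, 4))
g_inv[0, 0] = -1.0 / alpha**2
g_inv[0, 1:] = g_inv[1:, 0] = beta_up / alpha**2
g_inv[1:, 1:] = gamma_inv - np.outer(beta_up, beta_up) / alpha**2

# g^{ab} g_{bc} should be the identity, and (-g)^{1/2} = alpha gamma^{1/2}.
print(np.allclose(g_inv @ g, np.eye(4)))
print(np.isclose(np.sqrt(-np.linalg.det(g)), alpha * np.sqrt(np.linalg.det(gamma))))
```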

Having specified the spacetime foliation, we can decompose any tensor into time and space components (adapting the logic from the discussion of the stress-energy tensor in Sect. 5). Suppose, for example, that we have a fluid associated with a four velocity \(u^a\). Then we can introduce the decomposition

$$\begin{aligned} u^a = W (N^a + \hat{v}^a) = {W\over \alpha } \left( t^a - \beta ^a + \alpha \hat{v}^a \right) , \end{aligned}$$
(11.9)

where \(N_a \hat{v}^a =0\) and the Lorentz factor is given by

$$\begin{aligned} W= - N_a u^a = \alpha u^t = (1-\hat{v}^2)^{-1/2} , \end{aligned}$$
(11.10)

where \(\hat{v}^2 = \gamma _{ij} \hat{v}^i \hat{v}^j\) and the last equality follows from \(u^a u_a=-1\), as usual. From this relation it is easy to see that

$$\begin{aligned} \hat{v}^t = 0 , \qquad \hat{v}^i = {u^i\over W} - N^i = {1\over \alpha } \left( {u^i \over u^t} + \beta ^i\right) , \end{aligned}$$
(11.11)

and it follows that

$$\begin{aligned} \hat{v}_t = g_{ta} \hat{v}^a = \beta _i \hat{v}^i , \qquad \hat{v}_i = \gamma _{ia} \hat{v}^a = {\gamma _{ij} \over \alpha } \left( {u^j\over u^t} + \beta ^j \right) . \end{aligned}$$
(11.12)
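
To see how these relations are used in practice, here is a minimal sketch (with arbitrary numerical values) which starts from a primitive spatial velocity \(\hat{v}^i\), computes the Lorentz factor from (11.10), assembles the four-velocity via (11.9) and checks the normalisation together with \(W=\alpha u^t\).

```python
import numpy as np

alpha, beta_up = 1.3, np.array([0.2, -0.1, 0.05])
gamma = np.array([[1.2, 0.1, 0.0],
                  [0.1, 1.0, 0.2],
                  [0.0, 0.2, 0.9]])
beta_dn = gamma @ beta_up

# Assemble g_ab as in Eq. (11.7).
g = np.zeros((4, 4))
g[0, 0] = -alpha**2 + beta_dn @ beta_up
g[0, 1:] = g[1:, 0] = beta_dn
g[1:, 1:] = gamma

# Primitive spatial velocity \hat v^i (arbitrary, chosen so that \hat v^2 < 1)
# and the Lorentz factor from Eq. (11.10).
v_up = np.array([0.3, 0.1, -0.2])
v2 = v_up @ gamma @ v_up
W = 1.0 / np.sqrt(1.0 - v2)

# Normal observer N^a from Eq. (11.4) and the four-velocity from Eq. (11.9).
N_up = np.concatenate(([1.0], -beta_up)) / alpha
u_up = W * (N_up + np.concatenate(([0.0], v_up)))

print(np.isclose(u_up @ g @ u_up, -1.0))       # u^a u_a = -1
print(np.isclose(alpha * u_up[0], W))          # W = alpha u^t, Eq. (11.10)
```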

We also need to consider derivatives. First of all, we introduce a derivative associated with the hypersurface. Thus, we use the (totally) projected derivative

$$\begin{aligned} D_a = \gamma _a^b \nabla _b , \end{aligned}$$
(11.13)

where all free indices should be projected into the surface. This derivative is compatible with the spatial metric (see Sect. 3) in the sense that

$$\begin{aligned} D_a\gamma _{bc} = \gamma _a^d \gamma _b^e \gamma _c^f\nabla _d \gamma _{ef} = 0 , \end{aligned}$$
(11.14)

which means that it acts as a covariant derivative in the surface orthogonal to \(N^a\). The upshot of this is that we can construct a tensor algebra for the three-dimensional spatial slices. In particular, we can introduce a three-dimensional Riemann tensor. This projected Riemann tensor does not contain all the information from its four-dimensional cousin; the missing information is encoded in the extrinsic curvature, \(K_{ab}\). This is a symmetric spatial tensor, such that \(N^a K_{ab}=0\). The extrinsic curvature provides a measure of how the \(\varSigma _t\) surfaces curve relative to spacetime. In practice, we measure how the normal \(N_a\) changes as it is parallel transported along the hypersurface. That is, we define

$$\begin{aligned} K_{ac} = -D_a N_c = - \gamma _a^b \gamma _c^d \nabla _b N_d = - \nabla _a N_c - N_a (N^b\nabla _b N_c) , \end{aligned}$$
(11.15)

where the second term is analogous to the fluid four-acceleration. We also have

$$\begin{aligned} K= K^a_a = g^{ab}K_{ab} = - \gamma ^{ab} D_a N_b = - \nabla _a N^a . \end{aligned}$$
(11.16)

Alternatively, we can use the properties of the Lie derivative to show that

$$\begin{aligned} K_{ij} = - {1 \over 2}\mathcal {L}_N \gamma _{ij} , \end{aligned}$$
(11.17)

but since

$$\begin{aligned} \mathcal {L}_N = {1\over \alpha } ( \mathcal {L}_t - \mathcal {L}_\beta ) = {1\over \alpha } ( \partial _t - \mathcal {L}_\beta ) , \end{aligned}$$
(11.18)

we have

$$\begin{aligned} \partial _t \gamma _{ij} = - 2\alpha K_{ij} + \mathcal {L}_\beta \gamma _{ij} . \end{aligned}$$
(11.19)

From the trace of this expression we get

$$\begin{aligned} \alpha K = - \partial _t \ln \gamma ^{1/2} + D_i \beta ^i , \end{aligned}$$
(11.20)

where \(\gamma = \det (\gamma _{ij})\) is the determinant of the spatial metric and we have used \(\gamma ^{ij} \partial _t \gamma _{ij} = \partial _t \ln \gamma \).

11.2 Perfect fluids

The spacetime foliation provides us with the tools we need to formulate relativistic fluid dynamics in a way suitable for numerical simulations (compatible with the solution of the Einstein field equations for the spacetime metric, which needs to be carried out in parallel; Alcubierre 2008; Baumgarte and Shapiro 2010). However, our immediate focus is on the equations of fluid dynamics (see Font 2008 for more details).

Let us start with the simple case of baryon number conservation. That is, we assume the flux \(n u^a\) is conserved, where n is the baryon number density according to an observer moving along with the fluid. Thus, we have

$$\begin{aligned} \nabla _a (n u^a) = \nabla _a [ Wn (N^a + \hat{v}^a) ]= 0 . \end{aligned}$$
(11.21)

First we note that the particle number density measured by the Eulerian observer is

$$\begin{aligned} \hat{n} =-N_a n u^a = nW , \end{aligned}$$
(11.22)

so we have

$$\begin{aligned} N^a \nabla _a \hat{n} + \nabla _i (\hat{n} \hat{v}^i) = - \hat{n} \nabla _a N^a = \hat{n} K , \end{aligned}$$
(11.23)

(since \(\hat{v}^i\) is spatial). Making use of the Lie derivative and (11.18) this can be written

$$\begin{aligned} N^a \nabla _a \hat{n} = \mathcal {L}_N \hat{n} = {1\over \alpha } ( \partial _t - \mathcal {L}_\beta ) \hat{n} = - \nabla _i (\hat{n} \hat{v}^i) + \hat{n} K , \end{aligned}$$
(11.24)

or

$$\begin{aligned} \partial _t \hat{n} + (\alpha \hat{v}^i - \beta ^i )\nabla _i \hat{n} + \alpha \hat{n} \nabla _i \hat{v}^i = \alpha \hat{n} K . \end{aligned}$$
(11.25)

Finally, since \(\hat{v}^i\) and \(\beta ^i\) are already spatial, we have

$$\begin{aligned} \partial _t \hat{n} + (\alpha \hat{v}^i - \beta ^i )D_i \hat{n} + \alpha \hat{n} D_i \hat{v}^i = \alpha \hat{n} K =- \hat{n} \partial _t \ln \gamma ^{1/2} + \hat{n} D_i \beta ^i , \end{aligned}$$
(11.26)

or

$$\begin{aligned} \partial _t \left( \gamma ^{1/2} \hat{n}\right) + D_i \left[ \gamma ^{1/2}\hat{n} (\alpha \hat{v}^i - \beta ^i )\right] = 0. \end{aligned}$$
(11.27)

This simply represents the advection of the baryons along the flow, as seen by an Eulerian observer. In arriving at this result, we have used the fact that

$$\begin{aligned} \left( -g\right) ^{1/2} = \alpha \gamma ^{1/2} , \end{aligned}$$
(11.28)

so

$$\begin{aligned} \nabla _a (-g)^{1/2} = \nabla _a ( \alpha \gamma ^{1/2}) = 0 . \end{aligned}$$
(11.29)

For future reference, it is also worth noting that

$$\begin{aligned} D_i \gamma ^{1/2} = \partial _i \gamma ^{1/2} - \varGamma ^j_{ji} \gamma ^{1/2} = 0 , \end{aligned}$$
(11.30)

where the Christoffel symbol is the one associated with the covariant derivative in the hypersurface.
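
To illustrate how the conservative form (11.27) is used in practice, the following sketch (purely schematic, and not part of the original discussion) evolves the conserved density \(D = \gamma ^{1/2} \hat{n}\) with a first-order upwind finite-volume step in the simplest possible setting: flat space, unit lapse, vanishing shift and a prescribed uniform velocity.

```python
import numpy as np

# Grid and (fixed) geometry: flat space, alpha = 1, beta^i = 0, gamma = 1, so
# Eq. (11.27) reduces to d_t D + d_x (D v) = 0 with D = gamma^{1/2} n_hat = n_hat.
nx, L = 200, 1.0
dx = L / nx
x = (np.arange(nx) + 0.5) * dx
v = 0.5                                   # prescribed, uniform transport velocity
dt = 0.5 * dx / abs(v)                    # CFL-limited time step

D0 = 1.0 + 0.5 * np.exp(-((x - 0.3) / 0.05) ** 2)   # initial conserved density
D = D0.copy()

def step(D):
    # First-order upwind flux F_{i+1/2} = v D_i for v > 0, periodic boundaries.
    F = D * v
    return D - dt / dx * (F - np.roll(F, 1))

for _ in range(100):
    D = step(D)

# The total baryon number (sum of D dx) is conserved to machine precision.
print(np.isclose(np.sum(D) * dx, np.sum(D0) * dx))
```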


Moving on, the fluid equations of motion follow from \(\nabla _a T^{ab}=0\), where we recall that a perfect fluid is described by the stress-energy tensor

$$\begin{aligned} T^{ab} = (p+\varepsilon ) u^a u^b + p g^{ab} . \end{aligned}$$
(11.32)

Here p and \(\varepsilon \) are the pressure and the energy density, respectively. As discussed in Sect. 2 these quantities are related by the equation of state, which encodes the relevant microphysics. In order to make contact with this discussion, a numerical simulation must allow us to extract these quantities from the evolved variables.

However, a numerical simulation is naturally carried out using quantities measured by the Eulerian observer. That is, we decompose the stress-energy tensor into normal and spatial parts as (again, see the discussion in Sect. 5)

$$\begin{aligned} T^{ab} = \rho N^a N^b + 2 N^{(a} S^{b)} + S^{ab} , \end{aligned}$$
(11.33)

with (noting the conflict with the notation used elsewhere, where \(\rho \) represented the mass density)

$$\begin{aligned} \rho= & {} N_a N_b T^{ab} = \varepsilon W^2 - p \left( 1 - W^2\right) , \end{aligned}$$
(11.34)
$$\begin{aligned} S^i= & {} - \gamma ^i_c N_d T^{cd} = \left( p+\varepsilon \right) W^2 \hat{v}^i , \end{aligned}$$
(11.35)

and

$$\begin{aligned} S^{ij} = \gamma ^i_c \gamma ^j_d T^{cd} = p \gamma ^{ij} + \left( p +\varepsilon \right) W^2 \hat{v}^i \hat{v}^j . \end{aligned}$$
(11.36)

A projection of the equations of motion along \(N_a\) then leads to the energy equation. From

$$\begin{aligned} N^a \nabla _a \rho + \rho \nabla _a N^a + \nabla _ a S^a - N_b N^a \nabla _a S^b - N_b \nabla _a S^{ab} = 0 , \end{aligned}$$
(11.37)

we get

$$\begin{aligned} N^a \nabla _a \rho + \nabla _a S^a= \rho K-S^b N^a\nabla _a N_b - S^{ab}\nabla _a N_b , \end{aligned}$$
(11.38)

where we have used

$$\begin{aligned} N^a\nabla _a N_b = D_b \ln \alpha . \end{aligned}$$
(11.39)

We also have

$$\begin{aligned} {1\over \alpha } \left( \partial _t - \mathcal {L}_\beta \right) \rho + \nabla _a S^a= \rho K-S^b D_b \ln \alpha + S^{ab}K_{ab} , \end{aligned}$$
(11.40)

leading to

$$\begin{aligned} \partial _t \left( \gamma ^{1/2} \rho \right) + D_i \left[ \gamma ^{1/2} \left( \alpha S^i -\rho \beta ^i\right) \right] = \gamma ^{1/2} \left( \alpha S^{ij}K_{ij} -S^i D_i \alpha \right) . \end{aligned}$$
(11.41)

Turning to the momentum equation, which is obtained by a projection orthogonal to \(N_a\), we have

$$\begin{aligned} \rho N^a \nabla _a N^c + \gamma ^c_{\ b}N^a \nabla _a S^b + S^c \nabla _a N^a + S^a \nabla _a N^c + \gamma ^c_{\ b} \nabla _a S^{ab} =0 , \end{aligned}$$
(11.42)

which leads to

$$\begin{aligned} \left( \partial _t - \mathcal {L}_\beta \right) S_i - S^j \left( \partial _t - \mathcal {L}_\beta \right) \gamma _{ij} - \alpha K S_i + \rho D_i \alpha + \alpha \gamma _{ij} D_k S^{kj} = 0 , \end{aligned}$$
(11.43)

where we have used

$$\begin{aligned} N^a \nabla _a S^c = \mathcal {L}_N S^c + S^a \nabla _a N^c = \mathcal {L}_N S^c - S^a K_a^c . \end{aligned}$$
(11.44)

This leads to the final result

$$\begin{aligned} \partial _t (\gamma ^{1/2} S_i) + D_j \left[ \gamma ^{1/2} \left( \alpha S_i^j -S_i \beta ^j \right) \right] = \gamma ^{1/2} \left( S_j D_i \beta ^j - \rho D_i \alpha \right) . \end{aligned}$$
(11.45)

This completes the set of equations we need in order to carry out a perfect fluid simulation. The extension to more general settings follows, at least formally, the same steps.

11.3 Conservative to primitive

We have written down the set of evolution equations we need for a single-component problem. This leaves us with one important issue to resolve. How do we connect the evolution to the underlying microphysics and the equation of state? In order to do this, we have to consider the inversion from the variables used in the evolution to the “primitive” fluid variables associated with the equation of state.

Let us, in the interest of conceptual clarity, focus on the case of a cold barotropic fluid, such that the equation of state provides the energy as a function of the baryon number density \(\varepsilon = \varepsilon (n)\) (see Sect. 2). This then leads to the chemical potential

$$\begin{aligned} \mu = {d\varepsilon \over dn} , \end{aligned}$$
(11.46)

and the pressure p follows from the thermodynamical relation:

$$\begin{aligned} p = n \mu - \varepsilon . \end{aligned}$$
(11.47)

We see that, in order to connect with the thermodynamics we need the evolved number density. We also need to decide which observer measures equation of state quantities. In the single-fluid case this question is relatively easy to answer; we need to express the equation of state in the fluid frame (use the fibration associated with \(u^a\)).
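
As a concrete illustration (a standard polytropic form, used here purely as an example rather than as a realistic neutron-star equation of state), we may take

$$\begin{aligned} \varepsilon (n) = m_\mathrm {b} n + {K \over \varGamma - 1} n^{\varGamma } \quad \Longrightarrow \quad \mu = {d\varepsilon \over dn} = m_\mathrm {b} + {K \varGamma \over \varGamma - 1} n^{\varGamma -1} , \qquad p = n \mu - \varepsilon = K n^{\varGamma } , \end{aligned}$$

where \(m_\mathrm {b}\) is the baryon rest mass and K and \(\varGamma \) are constants. An equation of state of this kind is also assumed in the numerical sketch below.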

In the simple case considered here, the evolved system, (11.27) and (11.45), provides (assuming that \(\gamma ^{1/2}\) is known from the evolution of the Einstein equations)

$$\begin{aligned} \hat{n} = nW = n (1-\hat{v}^2)^{-1/2} , \end{aligned}$$
(11.48)

and

$$\begin{aligned} S^i = (p+\varepsilon ) W^2 \hat{v}^i . \end{aligned}$$
(11.49)

We need to invert these two relations to extract the primitive variables, n and \(\hat{v}^i\). This can be formulated as a one-dimensional root-finding problem. For example, we may start by guessing a value for \(n=\bar{n}\). This then allows us to work out \(\varepsilon \) from the equation of state and p from (11.47). With these variables in hand we can solve

$$\begin{aligned} {S^2 \over (p+ \varepsilon )^2} = W^4 \hat{v}^2 , \quad \text{ with } \quad S^2 = \gamma _{ij}S^i S^j , \end{aligned}$$
(11.50)

for \(\hat{v}^2\). This, in turn, allows us to work out the Lorentz factor W and then \(\hat{v}^i\) follows from (11.49). Finally, we get \(n=\hat{n}/W\) from (11.48). The result can be compared to our initial guess \(\bar{n}\), and the procedure is iterated until the updated value agrees with the guess. This yields a solution consistent with the conserved quantities, and hence the primitive variables.
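
The following sketch (a minimal illustration, assuming the simple polytropic equation of state introduced above; all parameter values and tolerances are arbitrary) implements this iteration: guess n, evaluate \(\varepsilon \) and p, solve (11.50) for \(\hat{v}^2\), update n from (11.48) and repeat until the guess has converged.

```python
import numpy as np

# Illustrative polytropic equation of state (see the example above); m_b = 1 units.
K, Gamma, m_b = 100.0, 2.0, 1.0
eps  = lambda n: m_b * n + K * n**Gamma / (Gamma - 1.0)   # energy density
pres = lambda n: K * n**Gamma                              # p = n mu - eps

def con2prim(n_hat, S2, n_guess=None, tol=1e-12, itmax=100):
    """Recover n and v^2 from the conserved n_hat = n W and S^2 = gamma_ij S^i S^j."""
    n = n_hat if n_guess is None else n_guess
    for _ in range(itmax):
        h = pres(n) + eps(n)                # p + eps for the current guess
        A = S2 / h**2                       # left-hand side of Eq. (11.50)
        # Solve A = W^4 v^2 = v^2 / (1 - v^2)^2 for v^2 (root with v^2 < 1).
        v2 = A if A < 1e-14 else ((2*A + 1) - np.sqrt(4*A + 1)) / (2*A)
        W = 1.0 / np.sqrt(1.0 - v2)
        n_new = n_hat / W                   # Eq. (11.48)
        if abs(n_new - n) < tol * n:
            return n_new, v2
        n = n_new
    raise RuntimeError("conservative-to-primitive inversion did not converge")

# Round-trip test: start from known primitives, build the conserved variables, invert.
n_true, v2_true = 0.02, 0.25
W_true = 1.0 / np.sqrt(1.0 - v2_true)
n_hat = n_true * W_true
S2 = ((pres(n_true) + eps(n_true)) * W_true**2)**2 * v2_true   # |S|^2 from Eq. (11.49)
n_rec, v2_rec = con2prim(n_hat, S2)
print(np.isclose(n_rec, n_true), np.isclose(v2_rec, v2_true))   # True True
```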

Unfortunately, the numerical implementation of this strategy may not be as straightforward as it sounds. For example, the result may be sensitive to the initial guess and the algorithm may not converge. This is particularly true for more complex situations (e.g., multi-parameter equations of state or problems involving magnetic fields; Font 2000; Dionysopoulou et al. 2013). However, our aim here is not to resolve the possible numerical issues. We are only outlining the logic of the approach.

11.4 The state of the art

Without attempting an exhaustive survey of the relevant literature, it is useful to provide comments on the current state of the art along with suggestions for further reading. The area of numerical simulations of general relativistic fluids is developing rapidly, stimulated by the breakthrough discoveries in gravitational-wave astronomy—in particular, the astonishing GW170817 neutron star binary merger event (Abbott et al. 2017c, b), observations of which engaged a large fraction of the global astronomy community.

Focusing on nonlinear simulations with a live spacetime, one may identify (at least) four (more or less) separate bodies of work:

  • First of all, numerical simulations have been used to explore the problem of instabilities in rotating stars and disks. This is a classic problem in applied mathematics/fluid dynamics, where perturbative studies may be used to establish the existence of an instability (for simpler models) but where numerical simulations are required for a higher level of realism and also to investigate the nonlinear evolution of an unstable system (to what extent the nonlinear coupling of different oscillation modes leads to an instability saturating at some level, etcetera). The archetypal problems—basically because they involve instabilities that grow sufficiently rapidly that they can be tracked by (expensive) multi-dimensional simulations—are the bar-mode instability of (rapidly and differentially) rotating stars (Tohline et al. 1985; Williams and Tohline 1987; New et al. 2000; Shibata et al. 2000; Baiotti et al. 2007) and the run-away instability of (thick) accretion disks (Zanotti et al. 2003).

  • A second setting that has been explored since the early days of numerical relativity (Stark and Piran 1985; Piran and Stark 1986) involves the gravitational collapse to form a black hole (Baiotti et al. 2005; Ott et al. 2007, 2011). The typical collapse time-scale is short enough that these simulations can be carried out without extortionate cost, but the problem involves a number of complicating issues relating to the formation of the black-hole horizon. The typical set-up involves initial data representing a stable fluid body from which pressure support is artificially removed to trigger the collapse. The main conclusion drawn from this body of work may be that the gravitational-wave signal from collapse and black-hole formation tends to be dominated by quasinormal mode ringing.

  • Realistic modelling of the core collapse of a star that reaches the endpoint of its main-sequence life is exceedingly complicated (Janka et al. 2007; Morozova et al. 2018). The problem involves complex physics and a vast range of scales that need to be accurately tracked in a simulation. In spite of the challenges, there has been huge progress on understanding the problem in the last two decades. From the fluid dynamics point of view, the main developments involve the implementation of a (more) realistic matter description (based on nuclear physics and accounting for thermal effects; Richers et al. 2017) and progress towards an accurate implementation of neutrinos (Roberts et al. 2016; Andresen et al. 2017; Glas et al. 2019; Endrizzi et al. 2020). The latter is crucial, as the neutrinos are thought to be necessary to trigger the supernova explosion.

  • The final problem setting—attracting a lot of interest at the present time (Baiotti and Rezzolla 2017; Bernuzzi 2020)—involves the inspiral and merger of binary neutron stars. Many of the challenges, regarding the physics, are the same as in the case of core-collapse simulations. The problem involves a vast range of scales, in this case associated not so much with an explosion as with the outflow of matter that is unbound during the merger, undergoes rapid nuclear reactions and gives rise to a kilonova signal (Goriely et al. 2011; Bauswein et al. 2012; Kasen et al. 2015; Radice et al. 2018; Margalit and Metzger 2019). At the same time the hot merger remnant oscillates wildly (Stergioulas et al. 2011; Bernuzzi et al. 2015; Rezzolla and Takami 2016) until it loses enough angular momentum (or cools enough) that it (most likely) collapses to form a black hole. An important additional complication involves the presence of magnetic fields (Palenzuela et al. 2009), hugely relevant as neutron star mergers are expected to be the source of observed short gamma-ray bursts (Rezzolla et al. 2011; Paschalidis et al. 2015). This connection was observationally confirmed by the GW170817 event, but numerical simulations have not yet reached the stage where the detailed engine of these events can be explored (Ciolfi 2020).

12 Relativistic elasticity

Shortly after a neutron star is born, the outer layers freeze to form an elastic crust and the temperature of the high-density core drops below the level where superfluid and superconducting components are expected to be present. The different phases of matter impact on the observations in a number of ways. The crust is important as

  • it anchors the star’s magnetic field (and provides dissipative channels leading to the gradual field evolution; Viganò et al. 2013),

  • there is an immediate connection between observed quasi-periodic oscillations in the tails of magnetar flares (Strohmayer and Watts 2005) (see Watts et al. 2016 for an overview of the relevant literature) and the dynamics of the elastic nuclear lattice. An understanding of the properties of the crust is essential for efforts to match the theory to observed seismology. The idea of associating observed variability with torsional oscillation of the crust was first put forward by Duncan (1998). Relativistic aspects of the problem (particularly relevant for the present discussion) were developed by Samuelsson and Andersson (2007, 2009).

  • the ability of the crust to sustain elastic strain is key to the formation of asymmetries which may lead to detectable gravitational waves from a mature spinning neutron star. Continuous gravitational-wave searches with the LIGO-Virgo network of interferometers are beginning to set interesting upper limits for such signals for a number of known pulsars (Abbott et al. 2017a), in some instances reaching significantly below the expected maximum “mountain” size estimated from state-of-the-art molecular dynamics simulations of the crustal breaking strain (Horowitz and Kadau 2009; Johnson-McDaniel and Owen 2013) (see Baiko and Chugunov (2018) for an alternative estimate, and note that while the model of a lattice of point-like ions may apply to the outer crust, the situation in the inner crust—with a superfluid component and possible pasta phases—is much less clear).

In essence, the elastic properties of the crust are crucial for an understanding of neutron-star phenomenology. In order for such models to reach the required level of realism we must consider the problem in the context of General Relativity. Interestingly, relativistic elasticity turns out to represent a (more or less) natural extension of the variational framework, with the key step involving the structure of matter space.

12.1 The matter space metric

The modern view of elasticity (Carter and Quintana 1972, 1975a, b; Kijowski and Magli 1992, 1997; Beig and Schmidt 2003a, b; Carter et al. 2006a) relies on comparing the actual matter configuration to an unstrained/relaxed reference shape (see Carter and Chachoua 2006; Carter and Samuelsson 2006 for discussions of how the problem changes when an interpenetrating superfluid component is present, as in the inner crust of a neutron star). In order to keep track of the reference state relative to which the strain is measured, we introduce a positive definite and symmetric tensor field, \(k_{a b}\) (Karlovini and Samuelsson 2003). The geometric meaning of this object is quite intuitive; it encodes the (three-)geometry of the solid (as seen by the solid itself). We will mostly cite key results from the extant literature about the properties of \(k_{a b}\); in particular, for the discussion that follows, the Appendix of Andersson et al. (2019) may be the most relevant.

From the point of view of the variational framework, the tensor \(k_{a b}\) is similar to \(n_{a b c}\) in the sense that it is flow-line orthogonal (Carter and Quintana 1972)

$$\begin{aligned} u^a k_{a b} = 0 . \end{aligned}$$
(12.1)

The main properties of \(k_{a b}\) are established by introducing the corresponding matter space object, \(k_{A B} (= k_{B A})\), via the usual map:

$$\begin{aligned} k_{a b} = \psi ^A_a \psi ^B_b k_{A B} . \end{aligned}$$
(12.2)

The tensor \(k_{A B}\) is “fixed” on matter space, in the same sense as \(n_{A B C}\), because it is (assumed to be) a function of its own matter space coordinates \(X^A\) only. The associated volume form is \(n_{A B C}\) (see Andersson et al. 2019). If we introduce

$$\begin{aligned} g^{A B} = \psi ^A_a \psi ^B_b g^{a b} = \psi ^A_a \psi ^B_b \perp ^{a b} , \end{aligned}$$
(12.3)

as before, and use Eqs. (6.5) and (6.10), then we can show that

$$\begin{aligned} n^2 = - g_{a b} n^a n^b = \frac{1}{3!} \det {\left( k_{A B}\right) } \det {\left( g^{A B}\right) } . \end{aligned}$$
(12.4)

Moreover, using the relations (6.13) and (12.2), we can easily establish that the Lagrangian variation of \(k_{a b}\) vanishes. That is, we have

$$\begin{aligned} \delta k_{a b} = - {\mathcal {L}}_\xi k_{a b} \quad \Longrightarrow \quad \varDelta k_{a b} = 0 . \end{aligned}$$
(12.5)

Finally, since \(u^a \psi ^A_a = 0\), and \(k_{A B}\) is a function of \(X^A\), we have

$$\begin{aligned} {\mathcal {L}}_u k_{A B} = u^a \psi ^C_a \frac{\partial k_{A B}}{\partial X^C} = 0 , \end{aligned}$$
(12.6)

and it follows that

$$\begin{aligned} {\mathcal {L}}_u k_{a b}= & {} k_{A B} {\mathcal {L}}_u \left( \psi ^A_a \psi ^B_b \right) \nonumber \\= & {} k_{A B} \left[ u^c \frac{\partial }{\partial x^c} \left( \psi ^A_a \psi ^B_b\right) + \psi ^A_c \psi ^B_b \frac{\partial u^c}{\partial x^a} + \psi ^A_a \psi ^B_c \frac{\partial u^c}{\partial x^b}\right] \nonumber \\= & {} k_{A B} u^c \left[ \frac{\partial ^2 X^A}{\partial x^c \partial x^a} \psi ^B_b + \psi ^A_a \frac{\partial ^2 X^B}{\partial x^c \partial x^b} - \frac{\partial ^2 X^A}{\partial x^a \partial x^c} \psi ^B_b - \psi ^A_a \frac{\partial ^2 X^B}{\partial x^b \partial x^c}\right] = 0 . \end{aligned}$$
(12.7)

Following Karlovini and Samuelsson (2003) we now introduce the matter space tensor \(\eta _{A B}\) to quantify the unsheared state. Its defining characteristic is that it is the inverse to \(g^{A B}\) but only for the relaxed configuration (when the energy density \(\varepsilon = \check{\varepsilon }\), using a check to indicate the reference shape from now on):

$$\begin{aligned} g^{A C} \eta _{C B} = \delta ^A_B , \quad \varepsilon = \check{\varepsilon } . \end{aligned}$$
(12.8)

If we introduce

$$\begin{aligned} \epsilon ^{A B C} = \psi ^A_a \psi ^B_b \psi ^C_c u_d \epsilon ^{d a b c} , \end{aligned}$$
(12.9)

then it follows from (6.10) that

$$\begin{aligned} n_{A B C} = n \epsilon _{A B C} . \end{aligned}$$
(12.10)

In other words,

$$\begin{aligned} \epsilon _{A B C} = \sqrt{\det {\left( \eta _{A B}\right) }} \left[ A \ B \ C\right] . \end{aligned}$$
(12.11)

The tensor \(\eta _{AB}\) is useful because it provides us with a straightforward way to model conformal elastic deformations. Specifically, if f is the conformal factor, we let

$$\begin{aligned} k_{A B} = f \eta _{A B} \quad \Longrightarrow \quad \det {\left( k_{A B}\right) } = f^3 \det {\left( \eta _{A B}\right) } . \end{aligned}$$
(12.12)

But,

$$\begin{aligned} n_{A B C} = \sqrt{ \det {\left( k_{A B}\right) }} \left[ A \ B \ C\right] = n \epsilon _{A B C} = n \sqrt{\det {\left( \eta _{A B}\right) }} \left[ A \ B \ C\right] , \end{aligned}$$
(12.13)

which shows that \(f = n^{2/3}\). This demonstrates that k (a suitably defined 3D determinant of \(k_{a b}\), see Andersson et al. 2019) is such that \(k = n^2\) (Karlovini and Samuelsson 2003), even though \(k_{a b}\) does not itself depend on the number density.


12.2 Elastic variations

Let us now consider the variational derivation of the equations of motion for an elastic system. First of all, the fact that the Lagrangian variation of \(k_{a b}\) vanishes means that \(k_{a b}\), in addition to being a natural quantity for describing the elastic configuration, is useful in the development of Lagrangian perturbation theory.

Letting the Lagrangian \(\varLambda \) depend also on the new tensor (in essence, incorporating the energy associated with elastic strain) we have

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left[ \mu _a \delta n^a + \left( \frac{1}{2}\varLambda g^{a b} + {\partial \varLambda \over \partial g_{ab}} \right) \delta g_{a b} + {\partial \varLambda \over \partial k_{ab} }\delta k_{ab} \right] . \end{aligned}$$
(12.14)

We proceed as in Sect. 6 and replace \(\delta n^a\) with the Lagrangian displacement \(\xi ^a\). In addition, it follows from (12.5) that

$$\begin{aligned} \delta k_{ab} = - \xi ^d \nabla _d k_{ab} - k_{d b} \nabla _a \xi ^d - k_{a d} \nabla _b \xi ^d . \end{aligned}$$
(12.15)

Again ignoring surface terms, we have (as \(k_{ab}\) is symmetric)

$$\begin{aligned} {\partial \varLambda \over \partial k_{ab} }\delta k_{ab} = \xi ^a \left[ 2 \nabla _b \left( {\partial \varLambda \over \partial k_{bd} } k_{a d}\right) - {\partial \varLambda \over \partial k_{bd} }\nabla _a k_{b d} \right] . \end{aligned}$$
(12.16)

Making use of this result, we arrive at

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right) = \sqrt{- g} \left\{ \left[ \frac{1}{2}\left( \varLambda - n^d \mu _d\right) g^{a b} + {\partial \varLambda \over \partial g_{ab}} \right] \delta g_{a b} + \tilde{f}_a \xi ^a \right\} , \end{aligned}$$
(12.17)

where

$$\begin{aligned} \tilde{f}_a = 2 n^b \nabla _{[a}\mu _{b]} + 2 \nabla _b \left( {\partial \varLambda \over \partial k_{bd} } k_{a d} \right) - {\partial \varLambda \over \partial k_{bd} }\nabla _a k_{b d} = 0 . \end{aligned}$$
(12.18)

As in the fluid case, this result provides the equations of motion for the system. However, we need to do a bit of work in order to get the result into a more user-friendly form. To start with, we read off the stress-energy tensor from (12.17):

$$\begin{aligned} T^{ab} = \left( \varLambda - n^d \mu _d\right) g^{a b} + 2 {\partial \varLambda \over \partial g_{ab}} . \end{aligned}$$
(12.19)

The next step involves giving physical meaning to \(k_{ab}\). This involves quantifying the deviation of a given state from the relaxed configuration. This is where the additional matter space tensor \(\eta _{A B}\) comes into play (Karlovini and Samuelsson 2003). This object depends on n, and relates directly to the relaxed state, see (12.8). Its spacetime counterpart is

$$\begin{aligned} \eta _{a b} = \psi ^A_a \psi ^B_b \eta _{A B} , \end{aligned}$$
(12.20)

and we have already seen that

$$\begin{aligned} \eta _{a b} = n^{- 2/3} k_{a b} . \end{aligned}$$
(12.21)

This relation is important, as we have already established that \(k_{ab}\) is a fixed matter space tensor.

Let us now imagine that the system evolves away from the relaxed state. This means that (12.8) no longer holds: \(\eta _{AB}\) retains the value set by the initial state, but \(g^{AB}\) evolves along with the spacetime. This leads to the build up of elastic strain, simply quantified in terms of the strain tensor

$$\begin{aligned} s_{a b} = {1\over 2} ( \perp _{a b} - \eta _{a b}) = {1\over 2} \left( \perp _{a b} - n^{-2/3} k_{a b} \right) . \end{aligned}$$
(12.22)

In the relaxed configuration, we have \(\eta _{ab} = \perp _{ab}\) by construction so it is obvious that \(s_{ab}\) vanishes.

This model is fairly intuitive, but in practice it is more natural to work with scalars formed from \(\eta _{ab}\) (which can be viewed as “invariant”). This helps make the model less abstract. Hence, we introduce the strain scalar \(s^2\) (not to be confused with the entropy density from before) as a suitable combination of the invariants of \(\eta _{ab}\):

$$\begin{aligned} I_1 = \eta ^a_{\ a} = g^{A B} \eta _{A B} , \end{aligned}$$
(12.23)
$$\begin{aligned} I_2 = \eta ^a_{\ b} \eta ^b_{\ a} = g^{A D} g^{B E} \eta _{E A} \eta _{D B} , \end{aligned}$$
(12.24)
$$\begin{aligned} I_3 = \eta ^a_{\ b} \eta ^b_{\ d} \eta ^d_{\ a} = g^{A E} g^{B F} g^{D G} \eta _{E B} \eta _{F D} \eta _{G A} . \end{aligned}$$
(12.25)

However, the number density n can also be seen to be a combination of invariants, since

$$\begin{aligned} k = n^2 = {1\over 3!} \left( I_1^3 - 3 I_1I_2+2I_3 \right) . \end{aligned}$$
(12.26)

Given this, it makes sense to replace one of the \(I_N\) (\(N=1, 2, 3\)) with n, which now becomes one of the required invariants. Then we define \(s^2\) to be a function of two of the other invariants. We can choose different combinations, but we must ensure that \(s^2\) vanishes for the relaxed state. For example, Karlovini and Samuelsson (2003) work with

$$\begin{aligned} s^2 = {1\over 36} \left( I_1^3- I_3-24 \right) . \end{aligned}$$
(12.27)

In the limit \(\eta _{ab} \rightarrow \perp _{ab}\) we have \(I_1 , I_3 \rightarrow 3\) and we see that the combination for \(s^2\) in Eq. (12.27) vanishes.

Next, we assume that the Lagrangian of the system depends on \(s^2\), rather than the tensor \(k_{ab}\). In doing this, we need to keep in mind that Eqs. (12.21) and (12.25) show that the invariants \(I_N\) depend on n (and hence both \(n^a\) and \(g_{ab}\)) as well as \(k_{ab}\).

So far, the description is nonlinear, but in most situations of astrophysical interest it should be sufficient to consider a slightly deformed configuration. In effect, we may focus on a Hookean model, such that

$$\begin{aligned} \varLambda = - \check{\varepsilon }(n) - \check{\mu }(n) s^2 = - \varepsilon , \end{aligned}$$
(12.28)

where \(\check{\mu }\) is the usual shear modulus, associated with a linear stress-strain relation for small deviations away from the relaxed state. (It should not be confused with the chemical potential!) As mentioned earlier, the checks indicate that quantities are calculated for the unstrained state, with the specific understanding that \(s^2=0\), and it should be apparent from (12.28) that we have an expansion in (a supposedly small) \(s^2\).

Since the strain scalar is given in terms of invariants, as in (12.27), it might be tempting to suggest a change of variables such that \(s^2=s^2(I_1,I_3)\). Our final equations of motion will, indeed, reflect this, but it would be premature to make the change at this point. Instead we note that the momentum is now given by

$$\begin{aligned} \mu _a= & {} {\partial \varLambda \over \partial n^a} = {\partial n^2 \over \partial n^a} {\partial \varLambda \over \partial n^2} \nonumber \\= & {} - {1 \over n} {\partial \varLambda \over \partial n} g_{ab}n^b = {1\over n} \left( {d \check{\varepsilon } \over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) g_{ab}n^b , \end{aligned}$$
(12.29)

while

$$\begin{aligned} {\partial \varLambda \over \partial g_{ab} }=- \left( {d\check{\varepsilon } \over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) {\partial n \over \partial g_{ab}} - \check{\mu } {\partial s^2 \over \partial g_{ab}} . \end{aligned}$$
(12.30)

Here we need (note that \(n^a\) is held fixed in the partial derivative)

$$\begin{aligned} {\partial n \over \partial g_{ab}} = - {1\over 2n} n^a n^b , \end{aligned}$$
(12.31)

and it is useful to note that

$$\begin{aligned} {\partial s^2 \over \partial g_{ab}} = - g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} . \end{aligned}$$
(12.32)

Also, when working out this derivative, we need to hold n fixed [as is clear from (12.30)]. At the end of the day, we have for the stress-energy tensor

$$\begin{aligned} T^{ab}= & {} \left[ \varLambda + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) \right] g^{a b} \nonumber \\&+ {1\over n} \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) n^a n^b +2 \check{\mu } g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} \nonumber \\= & {} \varLambda g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 + \check{\mu } {\partial s^2 \over \partial n} \right) \perp ^{ab} + 2 \check{\mu } g^{ad} g^{be} {\partial s^2 \over \partial g^{de}} . \end{aligned}$$
(12.33)

Let us now make the change of variables we hinted at previously. In order to establish the procedure, let us consider a situation where \(s^2\) depends only on \(I_1\). Then we need

$$\begin{aligned}&I_1 = \eta ^a_{\ a} = n^{-2/3} g^{ab} k_{ab} , \end{aligned}$$
(12.34)
$$\begin{aligned}&\left( {\partial s^2 \over \partial n}\right) _1= - {2I_1 \over 3n} { \partial s^2 \over \partial I_1 } , \end{aligned}$$
(12.35)
$$\begin{aligned}&\left( {\partial \varLambda \over \partial k_{ab}}\right) _1 = - \check{\mu } {\partial s^2 \over \partial k_{ab} } = - \check{\mu } n^{-2/3} g^{ab} { \partial s^2 \over \partial I_1 } , \end{aligned}$$
(12.36)

(recall the comment on the partial derivative from before) and

$$\begin{aligned} \left( {\partial s^2 \over \partial g^{de}}\right) _1 = { \partial s^2 \over \partial I_1 }\eta _{de} . \end{aligned}$$
(12.37)

Making use of these results, we readily find

$$\begin{aligned} T^{ab}= & {} -\varepsilon g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) \perp ^{ab} + 2 \check{\mu } { \partial s^2 \over \partial I_1 } \left( \eta ^{ab} - {1\over 3} I_1 \perp ^{ab} \right) \nonumber \\= & {} -\varepsilon g^{ab} + n \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) \perp ^{ab} + 2 \check{\mu } { \partial s^2 \over \partial I_1 } \eta ^{\langle ab \rangle } , \end{aligned}$$
(12.38)

where the \(\langle \ldots \rangle \) brackets indicate the symmetric, trace-free part of a tensor with two free indices. In our case, we have

$$\begin{aligned} \eta _{\langle ab \rangle } = \eta _{(ab)} - { 1 \over 3} \eta ^d_{\ d} \perp _{ab} . \end{aligned}$$
(12.39)

Comparing this result to the standard decomposition of the stress-energy tensor,

$$\begin{aligned} T^{ab} = \varepsilon u^a u^b + \bar{p} \perp ^{ab} + \pi ^{ab}, \qquad \text{ where } \qquad \pi ^a_{\ a} = 0 , \end{aligned}$$
(12.40)

where \(\bar{p}\) is the isotropic pressure (which differs from the fluid pressure, p, as it accounts for the elastic contribution), we see that elasticity introduces an anisotropic contribution

$$\begin{aligned} \pi ^1_{ab} = 2 \check{\mu } {\partial s^2 \over \partial I_1} \eta _{\langle ab \rangle } . \end{aligned}$$
(12.41)

Following the same steps for the other two invariants (see Andersson et al. 2019 for details), \(I_2\) and \(I_3\), we find that

$$\begin{aligned} \pi ^2_{ab} = 4 \check{\mu } {\partial s^2 \over \partial I_2} \eta _{d \langle a} \eta _{b \rangle }^{\ d} , \end{aligned}$$
(12.42)

and

$$\begin{aligned} \pi ^3_{ab} = 6 \check{\mu } {\partial s^2 \over \partial I_3} \eta ^{d e} \eta _{d \langle a} \eta _{b \rangle e} , \end{aligned}$$
(12.43)

respectively. Combining these results with (12.27), we have

$$\begin{aligned} \pi _{ab} = \sum _N \pi ^N_{ab} = {\check{\mu }\over 6} \left[ \left( \eta ^d_{\ d}\right) ^2 \eta _{\langle ab\rangle }- \eta ^{d e} \eta _{d \langle a} \eta _{b\rangle e}\right] , \end{aligned}$$
(12.44)

which agrees with equation (128) from Karlovini and Samuelsson (2003).

Now consider the final stress-energy tensor. Note first of all that, if we consider n and \(s^2\) as the independent variables of the energy functional, then the isotropic pressure should follow from

$$\begin{aligned} \bar{p} = n \left( {\partial \varepsilon \over \partial n} \right) _{s^2} - \varepsilon = \check{p} + \left( \frac{n}{\check{\mu }} {d\check{\mu }\over dn} -1 \right) {\check{\mu }} s^2 , \end{aligned}$$
(12.45)

where

$$\begin{aligned} \check{p} = n {d\check{\varepsilon }\over dn} - \check{\varepsilon }, \end{aligned}$$
(12.46)

is identical to the fluid pressure from before. However, we may also introduce a corresponding momentum, such that

$$\begin{aligned} \bar{\mu }_a = \left( {\partial \varLambda \over \partial n^a} \right) _{s^2} = \left( {d\check{\varepsilon }\over dn} + {d\check{\mu }\over dn} s^2 \right) u_a, \end{aligned}$$
(12.47)

which leads to

$$\begin{aligned} \bar{p} = \varLambda - n^a \bar{\mu }_a = \check{p}+ \left( {n \over \check{\mu }} {d \check{\mu }\over dn} - 1 \right) \check{\mu }s^2 . \end{aligned}$$
(12.48)

Finally, in order to obtain the equations of motion for the system we can either take the divergence of (12.40) or return to (12.18) and make use of our various definitions. The results are the same (as they have to be). After a little bit of work we find that (12.18) leads to

$$\begin{aligned} 2n^b\nabla _{[b}\bar{\mu }_{a]} + \perp _a^d \left( \nabla ^b \pi _{b d} - \check{\mu }\nabla _d s^2\right) = 0 , \end{aligned}$$
(12.49)

where it is worth noting that the combination in the parentheses is automatically flow line orthogonal.

12.3 Lagrangian perturbations of an unstrained medium

Many applications of astrophysical interest—ranging from neutron star oscillations to tidal deformations in binary systems and mountains on spinning neutron stars—are adequately modelled within perturbation theory. As should be clear from the development of the elastic model, this requires the use of a Lagrangian framework. Luckily, we have already done most of the work needed to consider this problem. In particular, we know that

$$\begin{aligned} \varDelta k_{ab} = 0 . \end{aligned}$$
(12.50)

We now make maximal use of this fact.

If we assume that the background configuration is relaxed, i.e. that \(s^2\) vanishes for the configuration we are perturbing with respect to, then the fluid results from Sect. 6 together with (12.50) make the elastic perturbation problem straightforward (although it still involves some algebra).

Consider, first of all, the strain scalar. A few simple steps lead to

$$\begin{aligned} \varDelta s^2 = 0 . \end{aligned}$$
(12.51)

To see this, recall that \(s^2\) is a function of the invariants, \(I_N\). Express these in terms of the number density n, the spacetime metric and \(k_{ab}\). Once this is done, make use of (12.50) and the fact that the background is unstrained, i.e. \(\eta _{ab} = \perp _{ab}\), to see that \(\varDelta I_N=0\), which makes intuitive sense. Since the strain scalar is quadratic, linear perturbations away from a relaxed configuration should vanish. An important implication of this result is that the last term in (12.49) does not contribute to the perturbed equations of motion.

This leads to

$$\begin{aligned} \varDelta \eta _{ab} = {1\over 3} \eta _{ab} \perp ^{de} \varDelta g_{de} , \end{aligned}$$
(12.52)

and

$$\begin{aligned} \varDelta \eta ^{ab} = \left[ - 2 g^{a(e} \eta ^{d)b} + {1\over 3} \eta ^{ab} \perp ^{de} \right] \varDelta g_{de} . \end{aligned}$$
(12.53)

It then follows from (12.22) and (12.44) that

$$\begin{aligned} \varDelta \pi _{ab} = - 2 \check{\mu }\varDelta s_{ab} , \end{aligned}$$
(12.54)

where

$$\begin{aligned} 2 \varDelta s_{ab} = \left( \perp ^e_{\ a} \perp ^d_{\ b} - \frac{1}{3} \perp _{ab} \perp ^{de} \right) \varDelta g_{de} . \end{aligned}$$
(12.55)

It is worth noting that the final result for an isotropic material agrees with, for example, Schumaker and Thorne (1983) where the relevant strain term is simply added to the stress-energy tensor (without detailed justification).

Next, let us consider the perturbed equations of motion. In the case of an unstrained background, it is easy to see that the argument that led to (7.79) still holds. This gives us the perturbation of the first term in (12.49) (after replacing \(\mu _a\rightarrow \bar{\mu }_a\)). Similarly, since \(\pi _{ab}\) vanishes in the background, the Lagrangian variation commutes with the covariant derivative in the second term. Thus, we end up with a perturbation equation of the form

$$\begin{aligned} 2n^a\nabla _{[a} \varDelta \bar{\mu }_{b]} + \nabla ^a \varDelta \pi _{ab} = 0 . \end{aligned}$$
(12.56)

This is the final result, but in order to arrive at an explicit expression for the perturbed momentum, it is useful to note that

$$\begin{aligned} \varDelta \mu _a = - {1 \over 2n} \check{\beta }u_a \perp ^{{b} d} \varDelta g_{{b} d} + \mu \left( \delta _a^{{b}} u^d + { 1 \over 2} u_a u^{{b}} u^d \right) \varDelta g_{{b} d} , \end{aligned}$$
(12.57)

where we have defined the bulk modulus \(\check{\beta }\) as

$$\begin{aligned} \check{\beta }= n {d\check{p}\over dn} = (\check{p}+ \check{\varepsilon }) {d\check{p}\over d \check{\varepsilon }} = (\check{p}+ \check{\varepsilon }) \check{C}^2_s , \end{aligned}$$
(12.58)

where \(\check{C}_s\) is the sound speed in the elastic medium, and we have used the fundamental relation \(\check{p}+ \check{\varepsilon } = n \mu \). It also follows that

$$\begin{aligned} \varDelta p = - {\check{\beta } \over 2} \perp ^{ab} \varDelta g_{ab} . \end{aligned}$$
(12.59)

When we consider perturbations of an elastic medium we need to pay careful attention to the magnitude of the deviation away from the relaxed state. If the perturbation is too large, the material will yield (Horowitz and Kadau 2009). It may fracture or behave in some other fashion that is not appropriately described by the equations of perfect elasticity. We need to quantify the associated breaking strain. In applications involving neutron stars, this is important if we want to consider star quakes in a spinning down pulsar, establish to what extent crust quakes in a magnetar lead to the observed flares (Watts et al. 2016) and whether the crust breaks due to the tidal interaction in an inspiralling binary (Strohmayer and Watts 2005; Penner et al. 2012; Tsang et al. 2012). A commonly used criterion to discuss elastic yield strains in engineering involves the von Mises stress, defined as

$$\begin{aligned} \varTheta _{\mathrm {vM}} = \sqrt{\frac{3}{2} s_{ab}s^{ab}} \end{aligned}$$
(12.60)

When this scalar exceeds some critical value \(\varTheta _{\mathrm {vM}} > \varTheta ^{\mathrm {crit}}_{\mathrm {vM}}\), say, the material no longer behaves elastically. In order to work out the dominant contribution to the von Mises stress in general we need to (at least formally) consider second order perturbation theory (Andersson et al. 2019), but in the simple case of an unstrained background we have

$$\begin{aligned} \varTheta _{\mathrm {vM}} = \sqrt{\frac{3}{2} \varDelta s_{ab} \varDelta s^{ab}} = \sqrt{\frac{3}{8} \perp ^{a\langle c}\perp ^{d\rangle b}\varDelta g_{ab}\varDelta g_{cd}} \end{aligned}$$
(12.61)

This allows us to quantify when a strained crust reaches the point of failure and hence to work out the maximal deformation, but unfortunately it is difficult to model what happens beyond this point. The same is true for terrestrial materials.
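
As a simple illustration of the criterion (the strain components and the threshold are merely indicative), the following sketch evaluates the von Mises scalar for a given trace-free strain tensor, using a flat spatial metric for simplicity, and compares it to a breaking strain of order 0.1, as suggested by the molecular dynamics simulations of Horowitz and Kadau (2009).

```python
import numpy as np

# Illustrative (symmetric, trace-free) strain tensor s_ab in a local orthonormal frame,
# so indices are raised and lowered with the identity.
s = np.array([[ 0.04,  0.01,  0.00],
              [ 0.01, -0.01,  0.02],
              [ 0.00,  0.02, -0.03]])

theta_vM = np.sqrt(1.5 * np.sum(s * s))        # Eq. (12.60)

# Breaking strain of order 0.1 (Horowitz and Kadau 2009), treated here as an
# illustrative threshold only.
theta_crit = 0.1
print(theta_vM, theta_vM > theta_crit)
```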

13 Superfluidity

Low temperature physics continues to be a vibrant area of research, providing interesting and exciting challenges, many of which are associated with the properties of superfluids/superconductors. Basically, matter appears to have two options when the temperature decreases towards absolute zero. According to classical physics one would expect the atoms in a liquid to slow down and come to rest, forming a crystalline structure. It is, however, possible that quantum effects become relevant before the liquid solidifies, leading to the formation of a superfluid condensate (a quantum liquid). This will only happen if the interaction between the atoms is attractive and relatively weak. The archetypal superfluid system is Helium. It is well established that \(^4\)He exhibits superfluidity below \(T=2.17\) K. Above this temperature liquid Helium is accurately described by the Navier-Stokes equations. Below the critical temperature the modelling of superfluid \(^4\)He requires a “two-fluid” description. Two fluid degrees of freedom are required to explain, for example, “clamped” flow through narrow capillaries and the presence of a second sound (associated with heat flow).

Many other low temperature systems are known to exhibit superfluid properties. The different phases of \(^3\)He have been well studied, both theoretically and experimentally, and there is considerable current interest in atomic Bose–Einstein condensates. The relevance of superfluid dynamics reaches beyond systems that are accessible in the laboratory. It is generally expected that neutron stars will contain a number of superfluid phases. This expectation is natural given the extreme core density (reaching several times the nuclear saturation density) and low temperature (compared to the nuclear scale of the Fermi temperatures of the different constituents, about \(10^{12}~\hbox {K}\)) of these stars.

The rapid spin-up and subsequent relaxation associated with radio pulsar glitches provide strong, albeit indirect, evidence for neutron-star superfluidity (Haskell and Sedrakian 2018). The standard model for these events is based on, in the first instance, the pinning of superfluid vortices (e.g., to the crust lattice) which allows a rotational lag to build up between the superfluid and the part of the star that spins down electromagnetically, and secondly the sudden unpinning which transfers angular momentum from one component to the other, leading to the observed spin-change. Recent observations of the youngest known neutron star in the galaxy, the compact object in the Cassiopeia A supernova remnant, with an estimated age of around 330 years, are also relevant in this context. The cooling of this object seems to accord with our understanding of neutron stars with a superfluid component in the core (Page et al. 2011; Shternin et al. 2011). The idea remains somewhat controversial—see, for example, Elshamouty et al. (2013), Posselt et al. (2013), Ho et al. (2015) and Posselt and Pavlov (2018)—but in principle the data can be used to infer the pairing gap for neutron superfluidity in the core, which helps constrain current theory. Similarly, the slow thermal relaxation observed in neutron stars that enter quiescence at the end of an accretion phase requires a superfluid component to be present in the neutron star crust (Wijnands et al. 2017).

Basically, neutron star astrophysics provides ample motivation for us to develop a relativistic description of superfluid systems. At one level this turns out to be straightforward, given the general variational multi-fluid model. However, when we consider the fine print we uncover a number of hard physics questions. In particular, we need to make contact with microphysics calculations that determine the various parameters of the relevant multi-fluid systems. We also need to understand how to incorporate quantized vortices (Barenghi et al. 2001), and the associated mutual friction, in the relativistic context. In order to establish the proper context for the discussion, it makes sense to first discuss the multi-fluid approach to Newtonian superfluids. We do this for the particular case of Helium, the archetypal laboratory two-fluid system.

13.1 Bose–Einstein condensates

In order to understand the key aspects of the connection between the fluid model and the underlying quantum system, it is natural to consider the problem of a single-component Bose–Einstein condensate. In recent years there has been a virtual explosion of interest in such systems. A key reason for this is that atomic condensates lend themselves to precision experiments, allowing researchers to probe the nature of the associated macroscopic quantum behaviour (Pethick and Smith 2008). In addition, from the relativity point of view, the description of Bose–Einstein condensates is relevant as it connects with issues that may play a role in cosmology (Sikivie and Yang 2009; Harko 2011).

On a sufficiently large scale, atomic condensates are accurately represented by a fluid model, similar to that used for superfluid Helium (described below). Consider as an example a uniform Bose gas, in a volume V, with an effective (long-range) interaction energy \(U_0\). The relevant interaction arises in the Born approximation, and is related to the s-wave scattering length a through

$$\begin{aligned} U_0 = {4\pi \hbar ^2 a\over m}, \end{aligned}$$
(13.1)

where m is the atomic mass. This effectively means that the model is appropriate only for dilute gases, where short-range corrections to the interaction can be ignored. In essence, we are focussing on the long-wavelength behaviour. Given the interaction, the energy of a state with N bosons (recalling that we need to multiply by the number of ways that these can be arranged in pairs) is

$$\begin{aligned} E = {N(N-1)\over 2} {U_0\over V} \approx {N^2\over 2} {U_0\over V} = {1\over 2} n^2 V U_0 , \end{aligned}$$
(13.2)

where we have defined the number density \(n=N/V\). From this we see that the chemical potential is

$$\begin{aligned} \mu = {dE\over dN } = {N\over V} U_0 = n U_0 . \end{aligned}$$
(13.3)

Alternatively, we may work with the energy density

$$\begin{aligned} \varepsilon = {E \over V} \qquad \Longrightarrow \qquad \mu = {d\varepsilon \over dn} , \end{aligned}$$
(13.4)

as in Sect. 2. From the usual thermodynamical relation we see that the pressure of the system follows from

$$\begin{aligned} dp = n d\mu . \end{aligned}$$
(13.5)

The main theoretical tool for studying the dynamics of atomic Bose–Einstein condensates is the Gross–Pitaevskii equation. This equation, which takes the form

$$\begin{aligned} - {\hbar ^2 \over 2m} \nabla ^2 \varPsi +V_\mathrm {ext} \varPsi + U_0 |\varPsi |^2\varPsi = i \hbar \partial _t\varPsi , \end{aligned}$$
(13.6)

encodes the dependence of the order parameter \(\varPsi \) (note that this is not the many-body quantum wave-function) on the interaction \(U_0\) and an external potential \(V_\mathrm {ext}\). In laboratory systems the external potential usually represents an optical trap. In an astrophysical setting it can be taken as a proxy for the coupling to the gravitational field.

At low temperatures (such that we can ignore thermal excitations) the order parameter is normalized in such a way that the density of the condensate equals the density of the gas

$$\begin{aligned} |\varPsi |^2 = n. \end{aligned}$$
(13.7)

With this identification, we may consider the simplest problem: the stationary solution to (13.6), representing the ground state of the system. Letting the time dependence be of the form \(\varPsi =\varPsi _0 \exp (-i\mu t/\hbar ) \) we see that a uniform, stationary solution corresponds to

$$\begin{aligned} \mu = n U_0+V_\mathrm {ext} . \end{aligned}$$
(13.8)

Moving on to the time-dependent dynamics, we note that (13.6) describes a complex-valued function \(\varPsi \). In effect, there are two degrees of freedom to consider. Given the connection to n it is useful to consider the magnitude of \(\varPsi \). Multiplying (13.6) with \(\varPsi ^*\) (where the asterisk represents complex conjugation) and subtracting the result from its own complex conjugate, we readily arrive at

$$\begin{aligned} \partial _t |\varPsi |^2 + {\hbar \over 2mi} \nabla _i \left( \varPsi ^* \nabla ^i \varPsi - \varPsi \nabla ^i \varPsi ^* \right) = 0 . \end{aligned}$$
(13.9)

Comparing this result with the continuity equation, we see that the two take the same form provided that we identify (in analogy with the momentum operator in quantum mechanics) the velocity

$$\begin{aligned} v^i = {p^i \over m} = {\hbar \over 2mi} {1\over |\varPsi |^2} \left( \varPsi ^* \nabla ^i \varPsi - \varPsi \nabla ^i \varPsi ^* \right) . \end{aligned}$$
(13.10)

In other words, we have

$$\begin{aligned} \partial _t n + \nabla _i \left( n v^i \right) = 0 . \end{aligned}$$
(13.11)

Having already made use of the magnitude, it makes sense to let the second degree of freedom in the problem be represented by the phase of \(\varPsi \). Letting \(\varPsi = \sqrt{n} \exp (iS)\) we can write the real part of (13.6) as

$$\begin{aligned} -\hbar \partial _t S = \mu + V_\mathrm {ext} + {mv^2 \over 2}- {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} . \end{aligned}$$
(13.12)

Here we have identified the chemical potential as before. We have also used

$$\begin{aligned} {\hbar ^2 \over 2m} (\nabla _i S) (\nabla ^i S) = {mv^2 \over 2} , \end{aligned}$$
(13.13)

which follows from (13.10). Finally, we take the gradient of (13.12) to get

$$\begin{aligned}&m \partial _t v_i + \nabla _i \left[ \mu + V_\mathrm {ext} + {mv^2 \over 2}- {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right] \nonumber \\&\quad = m \left( \partial _t +v^j \nabla _j \right) v_i + \nabla _i \left( \mu + V_\mathrm {ext} \right) \nonumber \\&\qquad +\,m \epsilon _{ijk} v^j \left( \epsilon ^{klm} \nabla _l v_m\right) - \nabla _i \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) = 0 . \end{aligned}$$
(13.14)

By definition, the flow is potential and hence irrotational (at least as long as we ignore quantum vortices, which we consider later), so

$$\begin{aligned} m \left( \partial _t +v^j \nabla _j \right) v_i + \nabla _i \left( \mu + V_\mathrm {ext} \right) - \nabla _i \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) = 0 . \end{aligned}$$
(13.15)

Comparing to the standard fluid result, we see that only the last term differs. Notably, it is also the only term that (explicitly) retains the quantum origins of the model (Planck’s constant!).
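
As a quick numerical check (a schematic one-dimensional example; the density and phase profiles are arbitrary), we can verify that the velocity defined in (13.10) reduces to \((\hbar /m)\nabla _i S\) once we write \(\varPsi = \sqrt{n}\, e^{iS}\), which is the identification underlying (13.13).

```python
import numpy as np

hbar, m = 1.0, 1.0                       # work in units where hbar = m = 1
x = np.linspace(0.0, 10.0, 2001)

# Arbitrary smooth density and phase profiles for the order parameter Psi = sqrt(n) exp(iS).
n = 1.0 + 0.3 * np.exp(-((x - 5.0) / 1.5) ** 2)
S = 0.4 * np.sin(0.5 * x)
Psi = np.sqrt(n) * np.exp(1j * S)

# Velocity from Eq. (13.10) ...
v_gp = (hbar / (2.0 * m * 1j)) * (np.conj(Psi) * np.gradient(Psi, x)
                                  - Psi * np.gradient(np.conj(Psi), x)) / np.abs(Psi) ** 2

# ... compared with (hbar/m) dS/dx, i.e. the gradient of the phase.
v_phase = (hbar / m) * np.gradient(S, x)
print(np.allclose(v_gp.real, v_phase, atol=1e-3))   # True (up to finite-difference error)
```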

So far, we have not made any simplifications. The two Eqs. (13.11) and (13.15) contain the same information as the Gross–Pitaevskii equation (13.6). The equations differ from those for irrotational fluid flow only by the presence of the final term in (13.15). This term, which represents a “quantum pressure”, is, however, irrelevant as long as we focus on the large-scale dynamics. To see this, assume that the order parameter varies on some length-scale L. It then follows that

$$\begin{aligned} \nabla \mu \sim {nU_0 \over L} \qquad \text{ and } \qquad \nabla \left( {\hbar ^2 \over 2m} {1\over \sqrt{n}} \nabla ^2 \sqrt{n} \right) \sim {\hbar ^2 \over mL^3} . \end{aligned}$$
(13.16)

In other words, the quantum pressure can be neglected as long as

$$\begin{aligned} {\hbar ^2 \over mn L^2 U_0} \ll 1 . \end{aligned}$$
(13.17)

In order to give this relation a clearer meaning, we introduce the coherence length \(\xi \), roughly the length-scale on which the kinetic energy balances the pressure. This leads to

$$\begin{aligned} {\hbar ^2 \over 2m \xi ^2 } \approx n U_0 , \end{aligned}$$
(13.18)

and we can neglect the quantum pressure as long as

$$\begin{aligned} \left( {\xi \over L} \right) ^2 \ll 1 . \end{aligned}$$
(13.19)

As long as this condition is satisfied, a low temperature Bose–Einstein condensate is faithfully represented by a fluid model. In the atomic condensate literature this regime is sometimes referred to as the Thomas–Fermi limit. It is worth noting that, even though the above condition implies that the fluid model is appropriate on larger scales, it is fundamentally not the same averaging argument that leads to the notion of a fluid element in the usual discussion. In the case of quantum condensates, the fluid model may in fact be appropriate at much shorter scales since it tends to be the case that the coherence length is vastly smaller than the mean-free path of the various particles that make up a normal “fluid”. This scale enters the quantum problem once we consider finite temperature excitations, being relevant for the second component that then comes into play.
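
To attach numbers to this condition, the following sketch (using representative, purely indicative values for a dilute \(^{87}\)Rb condensate) evaluates \(U_0\) from (13.1), the coherence length from (13.18) and the ratio \((\xi /L)^2\) for a typical trap-scale flow.

```python
import numpy as np

hbar = 1.054571817e-34        # J s
m = 1.44e-25                  # kg, roughly the 87Rb atomic mass
a = 5.3e-9                    # m, s-wave scattering length (~100 Bohr radii, indicative)
n = 1.0e20                    # m^-3, typical condensate density (indicative)
L = 1.0e-4                    # m, assumed length-scale of the large-scale flow

U0 = 4.0 * np.pi * hbar**2 * a / m          # Eq. (13.1)
xi = hbar / np.sqrt(2.0 * m * n * U0)       # coherence length from Eq. (13.18)

print(f"xi = {xi:.2e} m, (xi/L)^2 = {(xi / L)**2:.2e}")
# Typically xi is a fraction of a micron, so (xi/L)^2 << 1 and the fluid description applies.
```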


13.2 Helium: the original two-fluid model

Phenomenologically, the behaviour of superfluid Helium is “easy” to understand if one first considers a system at absolute zero temperature. Then the dynamics is entirely due to the quantum condensate (as in the previous example). There exists a single quantum wavefunction, and the momentum of the flow follows directly from the gradient of its phase. This immediately implies that the flow is irrotational. At finite temperatures, one must also account for thermal excitations (like phonons). A second dynamical degree of freedom arises since the excitation gas may drift relative to the atoms. In the standard two-fluid model, one makes a distinction between a “normal” fluid component (see footnote 23) and a superfluid part. The identification of the associated densities is to a large extent “statistical” as one cannot physically separate the “normal” component from the “superfluid” one. It is important to keep this in mind.

We take as our starting point the Newtonian version of the multi-fluid framework. We consider the simplest conducting system corresponding to a single particle species exhibiting superfluidity. Such systems typically have two degrees of freedom, cf. \(\mathrm {He}^4\) (Putterman 1974; Tilley and Tilley 1990) where the entropy can flow independently of the superfluid Helium atoms. Superfluid \(\mathrm {He}^3\) can also be included in the mixture, in which case there will be a relative flow of the \(\mathrm {He}^3\) isotope with respect to \(\mathrm {He}^4\), and relative flows of each with respect to the entropy (Vollhardt and Wölfle 2002). The model we advocate here distinguishes the atoms from the massless “entropy”—the former will be identified by a constituent index \(\mathrm {n}\), while the latter is represented by \(\mathrm {s}\). As this description is different (in spirit) from the standard two-fluid model for Helium, it is relevant to explain how the two descriptions are related.

First of all, we need to allow for a difference in the two three-velocities

$$\begin{aligned} w_i^{\mathrm {y}\mathrm {x}} = {v}_{i}^{\mathrm {y}} - {v}_{i}^{\mathrm {x}} , \quad {\mathrm {y}}\ne {\mathrm {x}}. \end{aligned}$$
(13.20)

Letting the square of this difference be given by \(w^2\), the equation of state then takes the form \(\mathcal{E} = \mathcal{E}(n_{\mathrm {n}},n_{\mathrm {s}},w^2)\). Hence, we have

$$\begin{aligned} {d} \mathcal{E} = \mu ^{\mathrm {n}} \, {d} n_{\mathrm {n}} + \mu ^{\mathrm {s}} \, {d} n_{\mathrm {s}} + \alpha \, {d} w^2, \end{aligned}$$
(13.21)

where

$$\begin{aligned} \mu ^{\mathrm {n}} = \left. \frac{\partial \mathcal {E}}{\partial n_{\mathrm n}} \right| _{n_{\mathrm {s}},w^2}, \qquad \mu ^{\mathrm {s}} = \left. \frac{\partial \mathcal {E}}{\partial n_{\mathrm s}} \right| _{n_{\mathrm {n}},w^2}, \qquad \alpha = \left. \frac{\partial \mathcal {E}}{\partial w^2} \right| _{n_{\mathrm {n}},n_{\mathrm {s}}}. \end{aligned}$$
(13.22)

The \(\alpha \) coefficient reflects the effect of entrainment on the equation of state. Similarly, entrainment causes the fluid momenta to be modified to

$$\begin{aligned} \frac{p^\mathrm {x}_i}{m^\mathrm {x}} = v_{i}^{\mathrm {x}} + 2 \frac{\alpha }{\rho _\mathrm {x}} w_i^{\mathrm {y}\mathrm {x}}. \end{aligned}$$
(13.23)

The number density of each fluid obeys a continuity equation:

$$\begin{aligned} \frac{\partial n_{\mathrm {x}}}{\partial t} + \nabla _{j} (n_{\mathrm {x}} v_{\mathrm {x}}^{j}) = 0. \end{aligned}$$
(13.24)

Each fluid also satisfies an Euler-type equation, which ensures the conservation of total momentum. This equation can be written

$$\begin{aligned} \left( \frac{\partial }{\partial t} + {v}^{j}_{\mathrm {x}}\nabla _{j} \right) \left[ {v}_{i}^{\mathrm {x}} + \varepsilon _{\mathrm {x}} w_i^{\mathrm {y}\mathrm {x}} \right] + \nabla _{i} (\varPhi + \tilde{\mu }_{\mathrm {x}}) + \varepsilon _{\mathrm {x}} w_j^{\mathrm {y}\mathrm {x}} \nabla _{i} v^{j}_{\mathrm {x}} = 0, \end{aligned}$$
(13.25)

where

$$\begin{aligned} \tilde{\mu }_\mathrm {x}= \frac{\mu ^\mathrm {x}}{m^\mathrm {x}} , \end{aligned}$$
(13.26)

and the entrainment is now included via the coefficients

$$\begin{aligned} \varepsilon _\mathrm {x}= \frac{2 \alpha }{\rho _\mathrm {x}} . \end{aligned}$$
(13.27)

For a detailed discussion of these equations, see Prix (2004) and Andersson and Comer (2006).
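
To see how the quantities in (13.22) and the entrained momenta (13.23) are generated in practice, here is a minimal symbolic sketch (sympy). The energy functional below is a toy model chosen purely for illustration, not an equation of state advocated in the text, and the coefficients c1, c2, c3 are free parameters.

```python
# Toy illustration of (13.21)-(13.23): assume an energy functional
# E(n_n, n_s, w^2), read off mu^n, mu^s and alpha, and form the
# entrainment factor eps_n = 2 alpha / rho_n.
import sympy as sp

nn, ns, w2, m = sp.symbols('n_n n_s w2 m', positive=True)
c1, c2, c3 = sp.symbols('c1 c2 c3', positive=True)

# Illustrative toy functional: bulk terms plus a bilinear entrainment term.
E = c1 * nn**2 + c2 * ns**2 - c3 * nn * ns * w2

mu_n  = sp.diff(E, nn)      # at fixed n_s and w^2
mu_s  = sp.diff(E, ns)      # at fixed n_n and w^2
alpha = sp.diff(E, w2)      # at fixed n_n and n_s

rho_n = m * nn
eps_n = sp.simplify(2 * alpha / rho_n)   # cf. (13.23) and (13.27)

print('mu_n  =', mu_n)      # 2*c1*n_n - c3*n_s*w2
print('mu_s  =', mu_s)      # 2*c2*n_s - c3*n_n*w2
print('alpha =', alpha)     # -c3*n_n*n_s
print('eps_n =', eps_n)     # -2*c3*n_s/m
```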

We have already seen that the entrainment means that each momentum does not have to be parallel to the associated flux. In the case of a two-component system, with a single species of particle flowing with \(n^\mathrm {n}_i = n v^\mathrm {n}_i\) and a massless entropy with flux \(n^\mathrm {s}_i = sv^\mathrm {s}_i\) (i.e., letting \(n_\mathrm {n}=n\) and \(n_\mathrm {s}=s\), where n is the particle number density and s represents the entropy per unit volume), the momentum densities are

$$\begin{aligned} \pi _i^\mathrm {n}= n p_i^\mathrm {n}= mn v_i^\mathrm {n}- 2 \alpha w_i^{\mathrm {n}\mathrm {s}} , \end{aligned}$$
(13.28)

and

$$\begin{aligned} \pi ^\mathrm {s}_i = s p^\mathrm {s}_i = 2 \alpha w_i^{\mathrm {n}\mathrm {s}} . \end{aligned}$$
(13.29)

In order to understand the physical relevance of the entrainment better, let us compare the two-fluid model to the orthodox model used to describe laboratory superfluids. This also clarifies the dynamical role of the thermal excitations in the system.

Expressed in terms of the momentum densities, the two momentum equations can be written, cf. (13.25),

$$\begin{aligned} \partial _t \pi _i^\mathrm {n}+ \nabla _j \left( v_\mathrm {n}^j \pi _i^\mathrm {n}\right) + n \nabla _i \left( \mu _\mathrm {n}- \frac{1}{2} m v_\mathrm {n}^2 \right) + \pi _j^\mathrm {n}\nabla _i v_\mathrm {n}^j = 0 , \end{aligned}$$
(13.30)

and

$$\begin{aligned} \partial _t \pi _i^\mathrm {s}+ \nabla _j \left( v_\mathrm {s}^j \pi _i^\mathrm {s}\right) + s \nabla _i T + \pi _j^\mathrm {s}\nabla _i v_\mathrm {s}^j = 0 , \end{aligned}$$
(13.31)

where we have used the fact that the temperature follows from \(\mu _\mathrm {s}= T\). Let us now assume that we are considering a superfluid system. For low temperatures and velocities the fluid described by (13.30) should be irrotational. In order to impose this constraint we need to appreciate that it is the momentum that is quantized in a rotating superfluid, not the velocity. This means that we require

$$\begin{aligned} \epsilon ^{klm} \nabla _l p^\mathrm {n}_m = 0 . \end{aligned}$$
(13.32)

To see how this affects the equations of motion, we rewrite (13.30) as

$$\begin{aligned} n \partial _t p_i^\mathrm {n}+ n \nabla _i \left[ \mu _\mathrm {n}- \frac{m}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j p_j^\mathrm {n}\right] - n \epsilon _{ijk} v_\mathrm {n}^j (\epsilon ^{klm} \nabla _l p^\mathrm {n}_m) = 0 \end{aligned}$$
(13.33)

Using (13.32) we have

$$\begin{aligned} \partial _t p_i^\mathrm {n}+ \nabla _i \left[ \mu _\mathrm {n}- \frac{m}{2} v_\mathrm {n}^2 + v_\mathrm {n}^j p_j^\mathrm {n}\right] = 0 . \end{aligned}$$
(13.34)

We now have all the expressions we need to make a direct comparison with the standard two-fluid model for Helium.

It is natural to begin by identifying the drift velocity of the quasiparticle excitations in the two models. After all, this is the variable that leads to the “two-fluid” dynamics. Moreover, since it distinguishes the part of the flow that is affected by friction it has a natural physical interpretation. In the standard two-fluid model this velocity, \(v_{\mathrm {N}}^i\), is associated with the “normal fluid” component. In the variational framework, the excitations are directly associated with the entropy of the system, which flows with \(v_\mathrm {s}^i\). These two quantities should be the same, and hence we have

$$\begin{aligned} v_{\mathrm {N}}^i = v_\mathrm {s}^i . \end{aligned}$$
(13.35)

The second fluid component, the “superfluid”, is usually associated with a “velocity” \(v_{\mathrm {S}}^i\). This quantity is directly linked to the gradient of the phase of the superfluid condensate wave function, which means that it is, in fact, a rescaled momentum. Hence, we should identify

$$\begin{aligned} v_{\mathrm {S}}^i = \frac{\pi ^i_\mathrm {n}}{\rho _\mathrm {n}} = \frac{p_\mathrm {n}^i}{m} . \end{aligned}$$
(13.36)

These identifications lead to

$$\begin{aligned} \rho v_{\mathrm {S}}^i = \rho \left[ \left( 1 - \varepsilon \right) v_\mathrm {n}^i + \varepsilon v_{\mathrm {N}}^i \right] , \end{aligned}$$
(13.37)

where \(\varepsilon = 2\alpha /\rho \) and \(\rho \) is the total mass density. We see that the total mass current is

$$\begin{aligned} \rho v_\mathrm {n}^i = \frac{\rho }{1 - \varepsilon } v_{\mathrm {S}}^i - \frac{\varepsilon \rho }{1 - \varepsilon } v_{\mathrm {N}}^i . \end{aligned}$$
(13.38)

If we introduce the superfluid and normal fluid densities,

$$\begin{aligned} \rho _{\mathrm {S}}= \frac{\rho }{1 - \varepsilon } , \qquad \text{ and } \qquad \rho _{\mathrm {N}}= - \frac{\varepsilon \rho }{1 - \varepsilon } , \end{aligned}$$
(13.39)

we arrive at the usual result (Khalatnikov 1965; Putterman 1974)

$$\begin{aligned} \rho v_\mathrm {n}^i = \rho _{\mathrm {S}}v_{\mathrm {S}}^i + \rho _{\mathrm {N}}v_{\mathrm {N}}^i . \end{aligned}$$
(13.40)

Obviously, it is the case that \(\rho = \rho _{\mathrm {S}}+ \rho _{\mathrm {N}}\). This completes the translation between the two formalisms. Comparing the two descriptions, it is clear that the variational approach has identified the natural physical variables—the average drift velocity of the excitations and the total momentum flux. Since the system can be “weighed” the total density \(\rho \) also has a clear interpretation. Moreover, the variational derivation identifies the truly conserved fluxes. In contrast, the standard model uses quantities that only have a statistical meaning. The density \(\rho _{\mathrm {N}}\) is inferred from the mean drift momentum of the excitations. That is, there is no “group” of excitations that can be identified with this density. Since the superfluid density \(\rho _{\mathrm {S}}\) is inferred from \(\rho _{\mathrm {S}}= \rho -\rho _{\mathrm {N}}\), it is a statistical concept, as well. Furthermore, the two velocities, \(v_{\mathrm {N}}^i\) and \(v_{\mathrm {S}}^i\), are not individually associated with a conservation law. From a practical point of view, this is not a problem. The various quantities can be calculated from microscopic theory and the results are known to compare well to experiments. At the end of the day, the two descriptions are (as far as applications are concerned) identical and the preference of one over the other is very much a matter of taste (or convention).

The above results show that the entropy entrainment coefficient follows from the “normal fluid” density according to

$$\begin{aligned} \alpha = - \frac{\rho _{\mathrm {N}}}{2} \left( 1 - \frac{\rho _{\mathrm {N}}}{\rho } \right) ^{-1} . \end{aligned}$$
(13.41)

This shows that the entrainment coefficient diverges as the temperature increases towards the superfluid transition and \(\rho _{\mathrm {N}}\rightarrow \rho \). At first sight, this may seem an unpleasant feature of the model. However, it is simply a manifestation of the fact that the two fluids must lock together as one passes through the phase transition. The model remains non-singular as long as \(v_i^\mathrm {n}\) approaches \(v_i^\mathrm {s}\) sufficiently fast as the critical temperature is approached. More detailed discussions of entrainment and finite temperature superfluids can be found in Andersson et al. (2013), Gusakov and Andersson (2006), Kantor and Gusakov (2011), Gusakov et al. (2009), Gusakov and Haensel (2005), Leinson (2018), Dommes et al. (2020) and Rau and Wasserman (2020).
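
The translation between the two sets of variables is easy to verify numerically. The sketch below takes an arbitrary normal-fluid fraction and two arbitrary velocities, builds \(\alpha \), \(\varepsilon \), \(v_{\mathrm {S}}^i\), \(\rho _{\mathrm {S}}\) and \(\rho _{\mathrm {N}}\) from (13.37), (13.39) and (13.41), and confirms (13.40); all numerical inputs are test values with no physical significance.

```python
# Numerical sanity check of the translation (13.37)-(13.41) between the
# variational variables (v_n, v_N = v_s, alpha) and the orthodox ones
# (rho_S, rho_N, v_S, v_N). All input numbers are arbitrary test values.
import numpy as np

rho   = 145.0          # total mass density (arbitrary units)
xN    = 0.3            # assumed normal-fluid fraction rho_N / rho
rho_N = xN * rho
v_n   = np.array([1.0, -0.5, 2.0])   # particle (atom) velocity
v_N   = np.array([0.2,  0.1, 1.5])   # excitation drift velocity (= v_s)

# Entrainment from (13.41) and epsilon = 2 alpha / rho
alpha = -0.5 * rho_N / (1.0 - rho_N / rho)
eps   = 2.0 * alpha / rho

# "Superfluid velocity" (rescaled momentum), cf. (13.37)
v_S = (1.0 - eps) * v_n + eps * v_N

# Orthodox densities, cf. (13.39)
rho_S  = rho / (1.0 - eps)
rho_Nc = -eps * rho / (1.0 - eps)

print(np.isclose(rho_S + rho_Nc, rho))                      # True
print(np.isclose(rho_Nc, rho_N))                            # True
print(np.allclose(rho * v_n, rho_S * v_S + rho_Nc * v_N))   # True, cf. (13.40)
```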

Having related the main variables, let us consider the form of the equations of motion. We start with the inviscid problem. It is common to work with the total momentum. Thus, we combine (13.30) and (13.31) to get

$$\begin{aligned}&\partial _t \left( \pi _i^\mathrm {n}+ \pi _i^\mathrm {s}\right) + \nabla _l \left( v_\mathrm {n}^l \pi ^\mathrm {n}_i + v_\mathrm {s}^l \pi _i^\mathrm {s}\right) + n \nabla _i \mu _\mathrm {n}+ s \nabla _i T \nonumber \\&\quad - n \nabla _i \left( \frac{1}{2} m v_\mathrm {n}^2 \right) + \pi _l^\mathrm {n}\nabla _i v_\mathrm {n}^l + \pi _l^\mathrm {s}\nabla _i v_\mathrm {s}^l = 0 . \end{aligned}$$
(13.42)

Here we have

$$\begin{aligned} \pi _i^\mathrm {n}+ \pi _i^\mathrm {s}= \rho v_i^\mathrm {n}\equiv j_i \end{aligned}$$
(13.43)

which defines the total momentum density. From the continuity equations (13.24) we see that

$$\begin{aligned} \partial _t \rho + \nabla _i j^i = 0 . \end{aligned}$$
(13.44)

The pressure \(\varPsi \) follows from

$$\begin{aligned} \nabla _i \varPsi = n \nabla _i \mu _\mathrm {n}+ s \nabla _i T - \alpha \nabla _i w_{\mathrm {n}\mathrm {s}}^2 , \end{aligned}$$
(13.45)

and we also need the relation

$$\begin{aligned} v_n^l \pi _i^\mathrm {n}+ v_\mathrm {s}^l \pi _i^\mathrm {s}= v^{\mathrm {S}}_i j^l + v_{\mathrm {N}}^l j^0_i , \end{aligned}$$
(13.46)

where we have defined

$$\begin{aligned} j^0_i = \rho _{\mathrm {N}}(v_i^{\mathrm {N}}- v_i^{\mathrm {S}}) = \pi _i^\mathrm {s}, \end{aligned}$$
(13.47)

and

$$\begin{aligned} \pi _l^\mathrm {n}\nabla _i v_\mathrm {n}^l + \pi _l^\mathrm {s}\nabla _i v_\mathrm {s}^l = n \nabla _i \left( \frac{1}{2} m v_\mathrm {n}^2 \right) - 2 \alpha w_l^{\mathrm {n}\mathrm {s}} \nabla _i w^l _{\mathrm {n}\mathrm {s}} . \end{aligned}$$
(13.48)

Putting all the pieces together we have

$$\begin{aligned} \partial _t j_i + \nabla _l \left( v_i^{\mathrm {S}}j^l + v_{\mathrm {N}}^l j^0_i\right) + \nabla _i \varPsi = 0 . \end{aligned}$$
(13.49)

The second equation of motion follows directly from (13.34);

$$\begin{aligned} \partial _t v_i^{\mathrm {S}}+ \nabla _i \left( \tilde{\mu }_{\mathrm {S}}+ \frac{1}{2} v_{\mathrm {S}}^2 \right) = 0 , \end{aligned}$$
(13.50)

where we have defined

$$\begin{aligned} \tilde{\mu }_{\mathrm {S}}= \frac{1}{m} \mu _\mathrm {n}- \frac{1}{2} \left( v_\mathrm {n}^i - v_{\mathrm {S}}^i\right) ^2 . \end{aligned}$$
(13.51)

The above relations show that our inviscid equations of motion are identical to the standard ones (Khalatnikov 1965; Putterman 1974). The identified relations between the different variables also provide a direct way to translate the quantities in the two descriptions. For example, we can write down a generalized first law, starting from (13.21). The key point is that we have demonstrated how the “normal fluid density” corresponds to the entropy entrainment in the variational model. This clarifies the role of the entropy entrainment; a quantity that arises in a natural way within the variational framework.

13.3 Relativistic models

Neutron star physics provides ample motivation for the need to develop a relativistic description of superfluid systems. As the typical core temperatures (below \(10^8~\mathrm {K}\)) are far below the Fermi temperature of the various constituents (of the order of \(10^{12}~\mathrm {K}\) for baryons; see footnote 24), mature neutron stars are extremely cold on the nuclear temperature scale. This means that—just like ordinary matter at near absolute zero temperature—the matter in the star will most likely freeze to a solid or become superfluid. While the outer parts of the star, the so-called crust, form an elastic lattice, the inner parts of the star are expected to be superfluid. In practice, this means that we must be able to model mixtures of superfluid neutrons and superconducting protons. It is also likely that we need to understand superfluid hyperons and colour superconducting quarks. There are many hard physics questions that need to be considered if we are to make progress in this area. In particular, we need to make contact with microphysics calculations that determine parameters of such multi-fluid systems.

One of the key features of a pure superfluid is that it is irrotational. On a larger scale, bulk rotation is mimicked by the formation of vortices, slim “tornadoes” representing regions where the superfluid degeneracy is broken (Barenghi et al. 2001). In practice, this means that one would often, e.g., when modelling global neutron star oscillations, consider a macroscopic model based on “averaging” over a large number of vortices. The resulting model closely resembles the standard fluid model. Of course, it is important to remember that the vortices are present on the microscopic scale and that they may affect the parameters in the problem. There are also unique effects that are due to the vortices, e.g., the mutual friction that is thought to be the key agent that counteracts relative rotation between the neutrons and protons in a superfluid neutron star core (Mendell 1991b).

For the present discussion, let us focus on the case of superfluid \(\mathrm {He}^4\). We then have two fluids, the superfluid Helium atoms with particle number density \(n_\mathrm {n}\) and the entropy with particle number density \(n_\mathrm {s}\), as before. From the derivation in Sect. 9 we know that the equations of motion can be written

$$\begin{aligned} \nabla _a n_{\mathrm {x}}^a = 0 , \end{aligned}$$
(13.52)

and

$$\begin{aligned} n_{\mathrm {x}}^b \nabla _{[b} \mu ^{\mathrm {x}}_{a]} = 0. \end{aligned}$$
(13.53)

To make contact with other discussions of the superfluid problem (Carter and Khalatnikov 1992, 1994; Carter and Langlois 1995a, 1998), we will use the notation \(s^a= n_\mathrm {s}^a\) and \(\varTheta _a = \mu _a^\mathrm {s}\). Then the equations that govern the motion of the entropy become

$$\begin{aligned} \nabla _a s^a = 0 \qquad \mathrm {and} \qquad s^b \nabla _{[b} \varTheta _{a]} = 0 . \end{aligned}$$
(13.54)

Now, since the superfluid constituent is irrotational we also have

$$\begin{aligned} \nabla _{[a} \mu ^\mathrm {n}_{b]} = 0 . \end{aligned}$$
(13.55)

The particle conservation law for the matter component is, of course, unaffected by this constraint. This shows how easy it is to restrict the multi-fluid equations to the case where one (or several) components are irrotational. It is worth emphasizing that it is the momentum that is quantized, not the velocity. This is an important distinction in situations where entrainment plays a role.

It is instructive to contrast this description with other models, like the potential formulation due to Khalatnikov and Lebedev (1982) and Lebedev and Khalatnikov (1982). We arrive at this alternative formulation in the following way (Carter and Khalatnikov 1994). First of all, we know that the irrotationality condition implies that the particle momentum can be written as a gradient of a scalar potential, \(\varphi \) (say). That is, we have

$$\begin{aligned} V_a = - \frac{\mu ^\mathrm {n}_a}{m} = - \nabla _a \varphi . \end{aligned}$$
(13.56)

Here m is the mass of the Helium atom and \(V_a\) is traditionally (and somewhat confusingly, see the previous section) referred to as the “superfluid velocity”. It really is a rescaled momentum. Next assume that the momentum of the remaining fluid (in this case, the entropy) is written

$$\begin{aligned} \mu ^\mathrm {s}_a = \varTheta _a = \kappa _a + \nabla _a \phi . \end{aligned}$$
(13.57)

Here \(\kappa _a\) is Lie transported along the entropy flow provided that \(s^a \kappa _a = 0\) (assuming that the equation of motion (13.54) is satisfied). This leads to

$$\begin{aligned} s^a \nabla _a \phi = s^a \varTheta _a . \end{aligned}$$
(13.58)

There is now no loss of generality in introducing further scalar potentials \(\beta \) and \(\gamma \) such that \(\kappa _a = \beta \nabla _a \gamma \), where the potentials are constant along the flow-lines as long as

$$\begin{aligned} s^a \nabla _a \beta = s^a \nabla _a \gamma = 0. \end{aligned}$$
(13.59)

Given this, we have

$$\begin{aligned} \varTheta _a = \nabla _a \phi + \beta \nabla _a \gamma . \end{aligned}$$
(13.60)

Finally, comparing to Khalatnikov’s formulation (Khalatnikov and Lebedev 1982; Lebedev and Khalatnikov 1982) we define \(\varTheta _a = - \kappa w_a\) and let \(\phi \rightarrow \kappa \zeta \) and \(\beta \rightarrow \kappa \beta \). Then we arrive at the final equation of motion

$$\begin{aligned} - \frac{\varTheta _a}{\kappa } = w_a = - \nabla _a \zeta - \beta \nabla _a \gamma . \end{aligned}$$
(13.61)

Equations (13.56) and (13.61), together with the standard particle conservation laws, are the key equations of the potential formulation. The content of this description is (obviously) identical to that of the variational picture, and we have now seen how the various quantities can be related.

This example shows how easy it is to specialize the equations that we derived earlier to the case when one (or several) components are irrotational/superfluid.

Another alternative approach, related to the field theory inspired discussion in Sect. 6.4, is based on the notion of broken symmetries. At a very basic level, a model with a broken U(1) symmetry corresponds to the superfluid model described above. In essence, the superfluid flow introduces a preferred direction which breaks the assumption that the model is isotropic. At first sight our equations differ from those used in, for example, Son (2001), Pujol and Davesne (2003) and Zhang (2002), but it is easy to demonstrate that we can reformulate our equations to get those written down for a system with a broken U(1) symmetry. The exercise is of interest since it connects with models that have been used to describe other superfluid systems.

Take as starting point the general two-fluid system. From the discussion in Sect. 9, we know that the momenta are in general related to the fluxes via

$$\begin{aligned} \mu ^{\mathrm {x}}_a = \mathcal{B}^{\mathrm {x}}n_a^{\mathrm {x}}+ \mathcal{A}^{{\mathrm {x}}{\mathrm {y}}} n_a^{\mathrm {y}}. \end{aligned}$$
(13.62)

Suppose that, instead of using the fluxes as our key variables, we consider a “hybrid” formulation based on a mixture of fluxes and momenta. In the case of the particle-entropy system, we may use

$$\begin{aligned} n_a^\mathrm {n}= \frac{1}{\mathcal{B}^\mathrm {n}} \mu _a^\mathrm {n}- \frac{\mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} n_a^\mathrm {s}. \end{aligned}$$
(13.63)

Let us impose irrotationality on the fluid by representing the momentum as the gradient of a scalar potential \(\varphi \). With \(\mu _a^\mathrm {n}= \nabla _a \varphi \) we get

$$\begin{aligned} n_a^\mathrm {n}= \frac{1}{\mathcal{B}^\mathrm {n}} \nabla _a \varphi - \frac{\mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} n_a^\mathrm {s}. \end{aligned}$$
(13.64)

Now take the preferred frame to be that associated with the entropy flow, i.e. introduce the unit four velocity \(u^a\) such that \(n_\mathrm {s}^a = n_\mathrm {s}u^a = s u^a\). Then we have

$$\begin{aligned} n_a^\mathrm {n}= n u_a - V^2 \nabla _a \varphi \end{aligned}$$
(13.65)

where we have defined

$$\begin{aligned} n \equiv - \frac{s \mathcal{A}^{\mathrm {n}\mathrm {s}}}{\mathcal{B}^\mathrm {n}} \qquad \text{ and } \qquad V^2 = - \frac{1}{\mathcal{B}^\mathrm {n}} . \end{aligned}$$
(13.66)

With these definitions, the particle conservation law becomes

$$\begin{aligned} \nabla _a n_\mathrm {n}^a = \nabla _a \left( n u^a - V^2 \nabla ^a \varphi \right) = 0 . \end{aligned}$$
(13.67)

Meanwhile, the chemical potential in the entropy frame follows from

$$\begin{aligned} \mu = - u^a \mu ^\mathrm {n}_a = - u^a \nabla _a \varphi . \end{aligned}$$
(13.68)

One can also show that the stress-energy tensor becomes

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + (\varPsi + \rho ) u^a u_b - V^2 \nabla ^a \varphi \nabla _b \varphi , \end{aligned}$$
(13.69)

where the generalized pressure is given by \(\varPsi \) as usual, and we have introduced

$$\begin{aligned} \varPsi + \rho = \mathcal{B}^\mathrm {s}s^2 + \mathcal{A}^{\mathrm {s}\mathrm {n}} s n . \end{aligned}$$
(13.70)

The equations of motion can now be obtained from \(\nabla _b T^b{}_a = 0\). (Keeping in mind that the equation of motion for \(\mathrm {x}=\mathrm {n}\) is automatically satisfied once we impose irrotationality, as before.) This essentially completes the set of equations written down by, for example, Son (2001) (see also Gusakov and Andersson 2006; Kantor and Gusakov 2011). The argument in favour of this formulation is that it is close to the microphysics calculations, which means that the parameters may be relatively straightforward to obtain. Against the description is the fact that it is a—not very elegant—hybrid where the inherent symmetry amongst the different constituents is lost, and there is also a risk of confusion since one is treating a momentum as if it were a velocity.

In the case when the superfluid rotates, the two-fluid equations apply as long as the rotation is sufficiently fast that one can meaningfully average over the vortex array. In effect, we assume that we can “ignore” the smaller scales associated with, for example, the vortex cores. This may not be possible in all situations, and even if it is, the “effective” parameters on the averaged scale may depend on the more local physics. For example, averaging may be appropriate to describe rotating superfluid neutron stars, but it is easy to construct laboratory systems where averaging is not appropriate. One may also envisage cosmological settings, e.g., involving dark matter condensates (Harko 2011), where averaging is not possible. In such situations we have to pay more careful attention to the forces acting on the vortices and the ensuing motion.


13.4 Vortices and mutual friction

Due to the fundamental quantum nature of superfluid (and for that matter, superconducting) condensates, the rotation of the neutron component in a neutron star core will be carried by localized vortices, each associated with a single quantum of momentum circulation. For simplicity, we will assume that the vortices are locally arranged in a rectilinear array, directed along a unit vector \(\hat{\kappa }^i\), with surface density \(\mathcal {N} \). At the hydrodynamics level, after averaging and in the Newtonian gravity framework, we then have

$$\begin{aligned} \mathcal {W}^i_\mathrm {n}= \frac{1}{m} \epsilon ^{ijk} \nabla _j p^\mathrm {n}_k = \mathcal {N} \kappa ^i , \end{aligned}$$
(13.71)

where we have used \(\kappa ^i = \kappa \hat{\kappa }^i \) with \(\kappa = h / 2m\) the quantum of circulation (the factor of 2 arises from the underlying Cooper pairing, relevant for superfluid neutrons). It is important to note that the quantized “vorticities” refer to the circulation of the canonical momentum \(p^i_\mathrm {n}\) rather than the circulation of velocity. It is the canonical momentum which is related to the gradient of each condensate’s wavefunction phase \( \varphi \), leading to the Onsager–Feynman quantization condition

$$\begin{aligned} \oint p^i_\mathrm {n}dl_i= (\hbar /2) \oint (\nabla ^i \varphi ) dl_i = h/2 . \end{aligned}$$
(13.72)
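
For orientation, the quantum of circulation for Cooper-paired neutrons is a tiny number on macroscopic scales. A one-line evaluation gives roughly \(2\times 10^{-3}~\mathrm {cm}^2\,\mathrm {s}^{-1}\):

```python
# Quantum of circulation kappa = h / (2 m_n) for Cooper-paired neutrons.
h   = 6.62607015e-34      # J s
m_n = 1.67492749804e-27   # kg, neutron mass
kappa = h / (2 * m_n)
print(f"kappa = {kappa:.3e} m^2/s = {kappa * 1e4:.3e} cm^2/s")
```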

The variational analysis has already provided us with a two-fluid model that allows for vorticity (obviously). However, if we want to understand the role of the vortices it is useful to consider the problem from a more intuitive (albeit less general) point of view. To do this we generalize an approach that was originally developed in the context of two-fluid hydrodynamics for superfluid Helium (Hall and Vinen 1956). This provides a conceptually different derivation of the Euler equations, based on the kinematics of a conserved number of vortices. It also requires the input of the forces that determine the motion of a single isolated vortex. Thus, consistency between the two derivations allows us to identify the total conservative force exerted on a single vortex, without any need to study the detailed mesoscopic vortex-fluid interaction—at least as long as the vortices are locally aligned. This will be useful when we consider the vortex mediated friction later.


The starting point of the derivation is the Onsager–Feynman condition (13.71). We also need to use the fact that the vortex number density is conserved, i.e. \(\mathcal {N} \) obeys a continuity equation of the form

$$\begin{aligned} \partial _t \mathcal {N} + \nabla _j \left( \mathcal {N} v_\mathrm {v}^j\right) =0 , \end{aligned}$$
(13.73)

where \(v_\mathrm {v}^i\) is the collective vortex velocity within a typical fluid element—in a sense, this relation defines this averaged vortex velocity. Taking the time derivative of (13.71) we have

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}= -\kappa ^i \nabla _j ( \mathcal {N} v_\mathrm {v}^j ) + \mathcal {N} \partial _t \kappa ^i . \end{aligned}$$
(13.74)

Reshuffling terms and using the identity \(\nabla _i \mathcal {W}^i_\mathrm {n}=0 \) we obtain

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}= \nabla _j \left( \mathcal {W}^j_\mathrm {n}v_\mathrm {v}^i \right) - \nabla _j \left( \mathcal {W}^i_\mathrm {n}v_\mathrm {v}^j \right) + \mathcal {N} \left( \partial _t \kappa ^i + v_\mathrm {v}^j \nabla _j \kappa ^i -\kappa ^j \nabla _j v_\mathrm {v}^i \right) . \end{aligned}$$
(13.75)

The motion of a single vortex can be expressed as the Lie-dragging of the vector \(\kappa ^i \) (which designates the local vortex direction) by the \(v_\mathrm {v}^i \) flow, leading to

$$\begin{aligned} \partial _t \kappa ^i + \mathcal{L}_{v_\mathrm {v}} \kappa ^i=0 . \end{aligned}$$
(13.76)

Then (13.75) reduces to

$$\begin{aligned} \partial _t \mathcal {W}^i_\mathrm {n}+ \epsilon ^{ijk} \nabla _j \left( \epsilon _{klm} \mathcal {W}^l_\mathrm {n}v_\mathrm {v}^m\right) =0 , \end{aligned}$$
(13.77)

which states that the canonical vorticity \( \mathcal {W}^i_\mathrm {n}\) is locally conserved and advected by the \(v_\mathrm {v}^i \) flow. Rewriting the result in terms of the momentum, we have

$$\begin{aligned} \partial _t p^i_\mathrm {n}-\epsilon ^{ijk} v_{\mathrm {v}_j} \epsilon _{klm} \nabla ^l p^m_\mathrm {n}= \nabla ^i \varPsi , \end{aligned}$$
(13.78)

where \(\varPsi \) is a (so far unspecified) scalar potential.
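
The step from (13.74) to (13.75) is pure vector calculus, relying only on \(\nabla _i \mathcal {W}^i_\mathrm {n}=0\). The following symbolic sketch (sympy; all fields are arbitrary functions of space and time) confirms that the difference between the two right-hand sides is exactly \(v_\mathrm {v}^i \nabla _j \mathcal {W}^j_\mathrm {n}\), which vanishes for the divergence-free vorticity.

```python
# Symbolic check of the step from (13.74) to (13.75): with W^i = N kappa^i,
# the difference between the two right-hand sides is v^i div(N kappa),
# which vanishes once div(W) = 0 is imposed.
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
coords = (x, y, z)
f = lambda name: sp.Function(name)(x, y, z, t)

N   = f('N')                                    # vortex surface density
kap = sp.Matrix([f('k1'), f('k2'), f('k3')])    # kappa^i = kappa * khat^i
v   = sp.Matrix([f('v1'), f('v2'), f('v3')])    # averaged vortex velocity
W   = N * kap                                   # W^i_n = N kappa^i, cf. (13.71)

div  = lambda u: sum(sp.diff(u[i], coords[i]) for i in range(3))
grad = lambda s: sp.Matrix([sp.diff(s, c) for c in coords])
adv  = lambda u, w: sp.Matrix([u.dot(grad(w[i])) for i in range(3)])  # (u.grad)w

rhs_74 = -kap * div(N * v) + N * sp.diff(kap, t)
rhs_75 = (sp.Matrix([div(v[i] * W) for i in range(3)])
          - sp.Matrix([div(W[i] * v) for i in range(3)])
          + N * (sp.diff(kap, t) + adv(v, kap) - adv(kap, v)))

print(sp.simplify(rhs_75 - rhs_74 - v * div(W)))   # -> zero vector
```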

Making use of the relative velocity, \( w^i_\mathrm{nv} = v^i_\mathrm {n}- v_\mathrm {v}^i\), we subsequently write (13.78) as

$$\begin{aligned} n_\mathrm {n}\partial _t p^i_\mathrm {n}- \epsilon ^{ijk} n_j^\mathrm {n}\epsilon _{klm} \nabla ^l p^m_\mathrm {n}-n_\mathrm {n}\nabla ^i \varPsi _\mathrm {n}= \mathcal {N} \rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w_k^\mathrm{nv}. \end{aligned}$$
(13.79)

The left-hand side of this equation coincides with the vortex-free Euler equation of motion (13.33) after a suitable identification of the potential \(\varPsi \). The right-hand side appears only in the presence of vortices. We can trace the origin of this contribution back to the Magnus force exerted on a vortex (per unit length) by the associated fluid, given by

$$\begin{aligned} f^i_\mathrm{M} = -\rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w^\mathrm{nv}_k . \end{aligned}$$
(13.80)

Thus, we identify \(-\mathcal {N} f^i_\mathrm{M}\), the right-hand side of (13.79), as the averaged reaction force exerted on a fluid element by the vortex array. In the absence of balancing forces, like dissipative scattering off thermal excitations, the equation of motion for a single vortex leads to \(f^i_\mathrm{M} =0\), implying that the vortices must move along with the \(v_\mathrm {n}^i\) flow. In this case, we retain (13.33) as the appropriate equation of motion.

This situation is, of course, somewhat artificial. In order for the argument to make sense, something must prevent the vortices from moving with the bulk flow. Indeed, in order to describe a real superfluid, either at finite temperatures or co-existing with some other component (as in a neutron star core), we need (at least) two components. The interaction between the vortices and this second component affects the relative vortex flow. This interaction tends to be dissipative. The standard example of this is the so-called mutual friction, which assumes that the Magnus force acting on each vortex is balanced by resistivity with respect to the second component in the system (e.g., the thermal excitations in Helium, represented by \({\mathrm {x}}=\mathrm {p}\) here). That is, we have (Hall and Vinen 1956; Mendell 1991b; Andersson et al. 2006)

$$\begin{aligned} f^i_\mathrm{M} = -\rho _\mathrm {n}\epsilon ^{ijk} \kappa _j w^\mathrm{nv}_k = - \mathcal {R} w^i_\mathrm{vp} \end{aligned}$$
(13.81)

which leads to—after repeated cross products to isolate the vortex velocity—the force acting on the superfluid neutrons (see footnote 25):

$$\begin{aligned} f^\mathrm {n}_i = \rho _\mathrm {n}\mathcal {N} \kappa \left( \mathcal {B}' \epsilon _{ijk} \hat{\kappa }^j w_{\mathrm {n}\mathrm {p}}^k + \mathcal {B} \epsilon _{ijk}\hat{\kappa }^j \epsilon ^{klm}\hat{\kappa }_l w_m^{\mathrm {n}\mathrm {p}} \right) \ \end{aligned}$$
(13.82)

with

$$\begin{aligned} \mathcal {B}' = \mathcal {R} \mathcal {B} = { \mathcal {R}^2 \over 1 + \mathcal {R}^2 } . \end{aligned}$$
(13.83)

The mutual friction has a decisive impact on superfluid dynamics. In particular, it provides one of the main mechanisms for damping (or even preventing) the CFS instability in rotating superfluid neutron stars (Lindblom and Mendell 1995).
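
For practical estimates it may be useful to note that (13.83) implies \(\mathcal {B} = \mathcal {R}/(1+\mathcal {R}^2)\). The small numerical sketch below evaluates the coefficients and the force (13.82) for arbitrary illustrative input values (none of the numbers are results from the text).

```python
# Mutual-friction coefficients from a dimensionless drag ratio R, cf. (13.83),
# and the force density (13.82) acting on the superfluid neutrons.
import numpy as np

R  = 0.05                         # illustrative drag ratio (weak-drag regime)
B  = R / (1.0 + R**2)             # identification implied by (13.83)
Bp = R * B                        # B' = R B = R^2 / (1 + R^2)

rho_n = 1.0e17                    # kg m^-3, illustrative superfluid density
Nv    = 1.0e10                    # m^-2, illustrative vortex surface density
kappa = 1.98e-7                   # m^2 s^-1, quantum of circulation h/(2 m_n)
khat  = np.array([0.0, 0.0, 1.0])           # vortex direction
w_np  = np.array([1.0e-2, 0.0, 0.0])        # relative n-p velocity, m/s

f_n = rho_n * Nv * kappa * (Bp * np.cross(khat, w_np)
                            + B * np.cross(khat, np.cross(khat, w_np)))
print("B  =", B, " B' =", Bp)
print("mutual friction force density f_n =", f_n, "N m^-3")
```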

13.5 The Kalb–Ramond variation

Moving on to the relativistic description of the quantized vortex problem, we have two options. We could “simply” generalize the steps from the Newtonian case. This is helpful, as it assists the intuition. However, it may be more instructive to take an alternative route. Opting for this strategy—with the view that it will allow us to introduce additional aspects—we now set out to derive the fluid results from a different perspective. The ultimate aim is to arrive at an alternative description of the (suitably averaged) dynamics of a collection of quantized vortices.

The new strategy builds on efforts to relate string dynamics to the forces acting on a superfluid vortex (Lund and Regge 1976; Kalb and Ramond 1974; Davis and Shellard 1988, 1989). We start by recalling that the superfluid velocity (technically, the momentum) can be linked to the gradient of a scalar potential \(\varphi \). We identify this velocity as the dual (see footnote 26)

$$\begin{aligned} \tilde{H}_a = \eta \partial _a \varphi = {1\over 3!} \epsilon _{abcd} H^{bcd} , \end{aligned}$$
(13.84)

where \(\eta \) is a constant, and introduce the so-called Kalb–Ramond field (Kalb and Ramond 1974), such that

$$\begin{aligned} H^{abc} = \partial ^{[a} B^{bc]} . \end{aligned}$$
(13.85)

It is now easy to see that the scalar wave equation

$$\begin{aligned} \Box \varphi = 0 , \end{aligned}$$
(13.86)

is automatically satisfied, as long as

$$\begin{aligned} \nabla _a \left( \nabla ^a B^{bc} + \nabla ^c B^{ab} + \nabla ^b B^{ca} \right) = 0 . \end{aligned}$$
(13.87)

In effect, we can shift the focus from \(\varphi \) to \(B^{ab}\), treating this object as an independent variable. The relevant dynamical equations are then automatically solved by expressing this field in terms of a scalar potential. The two descriptions are complementary, as they have to be (Davis and Shellard 1988). However, as we will soon demonstrate, the Kalb–Ramond representation makes the introduction of topological defects (vortices/strings) intuitive.

First, let us return to the fluid problem but shift the attention from the matter flux to the vorticity. Following Carter (1994, 2000) and Carter and Langlois (1995b), we do this by noting that we can ensure that the conservation law (6.8) is automatically satisfied by introducing a two-form \(B_{ab}\) (the Kalb–Ramond field) such that

$$\begin{aligned} n_{abc} = 3 \nabla _{[a}B_{bc]} \end{aligned}$$
(13.88)

That is, we have

$$\begin{aligned} n^a = {1\over 2} \epsilon ^{abcd} \nabla _b B_{cd} \end{aligned}$$
(13.89)

and the flux conservation (6.8) follows as an identity—we no longer need to introduce the three-dimensional matter space.
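
That the flux defined by (13.89) is conserved identically is easy to confirm. The sketch below (sympy, flat space, partial derivatives, ignoring the overall \(\sqrt{-g}\) factor) differentiates \(n^a = \frac{1}{2} \epsilon ^{abcd} \partial _b B_{cd}\) for an arbitrary antisymmetric \(B_{ab}\) and finds \(\partial _a n^a = 0\).

```python
# Flat-space check that n^a = (1/2) eps^{abcd} d_b B_{cd} is identically
# divergence free for any antisymmetric B_{ab}, cf. (13.89).
import sympy as sp
from sympy import LeviCivita

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)

# Arbitrary antisymmetric two-form B_{ab}
Bfun = [[sp.Function(f'B{a}{b}')(*X) for b in range(4)] for a in range(4)]
B = [[(Bfun[a][b] - Bfun[b][a]) / 2 for b in range(4)] for a in range(4)]

# n^a = (1/2) eps^{abcd} d_b B_{cd}
n = [sp.Rational(1, 2) * sum(LeviCivita(a, b, c, d) * sp.diff(B[c][d], X[b])
                             for b in range(4) for c in range(4) for d in range(4))
     for a in range(4)]

div_n = sum(sp.diff(n[a], X[a]) for a in range(4))
print(sp.simplify(div_n))   # -> 0, by antisymmetry of the Levi-Civita symbol
```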

Second, in order to find an action that reproduces the perfect fluid results, we elevate the vorticity \(\omega _{ab}\) to an additional variable. A Legendre transformation—designed in such a way that the stress-energy tensor remains unchanged (Carter and Langlois 1995b)—leads to the Lagrangian

$$\begin{aligned} \bar{\varLambda } = \varLambda - {1\over 4} \epsilon ^{abcd} B_{ab} \omega _{cd} = \varLambda - {1\over 2} \tilde{\omega }^{ab} B_{ab} , \end{aligned}$$
(13.90)

where we have used the dual

$$\begin{aligned} \tilde{\omega }^{ab} = {1\over 2} \epsilon ^{abcd} \omega _{cd} . \end{aligned}$$
(13.91)

Assuming that \(\varLambda =\varLambda (n)\) we get (ignoring the perturbed metric for clarity)

$$\begin{aligned} \delta \bar{\varLambda } = -{1\over 3!} \mu ^{abc} \delta n_{abc} - {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} - {1\over 2} \tilde{\omega }^{ab} \delta B_{ab} , \end{aligned}$$
(13.92)

where we note that, cf. Sect. 6,

$$\begin{aligned} {\partial \varLambda \over \partial n_{abc} } = - {1\over 3!} \mu ^{abc} . \end{aligned}$$
(13.93)

However, we now have

$$\begin{aligned} \delta n_{abc} = 3 \nabla _{[a} \delta B_{bc]} , \end{aligned}$$
(13.94)

which means that

$$\begin{aligned} \delta \bar{\varLambda } ={1\over 2} \left( \nabla _{a} \mu ^{abc} - \tilde{\omega }^{bc} \right) \delta B_{bc}- {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} -{1\over 2} \nabla _a \left( \mu ^{abc} \delta B_{bc}\right) . \end{aligned}$$
(13.95)

Ignoring the surface term (as usual), we see that a variation with respect to \(B_{ab}\) requires

$$\begin{aligned} \tilde{\omega }^{bc} = \nabla _{a} \mu ^{abc} , \end{aligned}$$
(13.96)

which leads back to (6.29). However, with a free variation we would also have \(B_{ab}=0\). That is, we need to constrain the variation of \(\tilde{\omega }^{ab}\) (or rather \(\omega _{ab}\)). Fortunately, the matter space argument comes to the rescue, providing us with the strategy for doing this. The only difference is that we now make use of a two-dimensional space with coordinates \(\chi ^I\) (here, and in the following \(I,J,\ldots \) represent two-dimensional coordinates). We obtain this two-dimensional space either via a map from the original matter space

$$\begin{aligned} \hat{\psi }^I_A = {\partial \chi ^I \over \partial X^A} , \end{aligned}$$
(13.97)

or directly from spacetime, using

$$\begin{aligned} \bar{\psi }^I_a = {\partial \chi ^I \over \partial x^a}. \end{aligned}$$
(13.98)

The two descriptions are consistent since

$$\begin{aligned} \bar{\psi }^I_a =\hat{\psi }^I_A \psi ^A_a = {\partial \chi ^I \over \partial X^A} {\partial X^A \over \partial x^a} = {\partial \chi ^I \over \partial x^a} . \end{aligned}$$
(13.99)

The different coordinates and the maps are illustrated in Fig. 15.

Fig. 15 An illustration of the matter space maps and the coordinates used in the analysis of vortex dynamics and elasticity

The third step involves introducing the four velocity \(u^a\) associated with the motion of the vortices in spacetime, which may be different from the motion of the “fluid” (in turn related to \(n^a\)). In order for the vorticity to be a purely spatial object—orthogonal to the flow—we must have

$$\begin{aligned} u^a\omega _{ab} = 0 . \end{aligned}$$
(13.100)

In addition, we want it to be “fixed” in the (new) matter space, in the sense that

$$\begin{aligned} \mathcal {L}_u \omega _{ab} = 0 . \end{aligned}$$
(13.101)

Since \(\omega _{ab}\) is anti-symmetric, this leads to

$$\begin{aligned} u^c \nabla _{[a}\omega _{bc]} = 0 , \end{aligned}$$
(13.102)

which will be satisfied if

$$\begin{aligned} \nabla _{[a}\omega _{bc]} = \partial _{[a}\omega _{bc]} = 0 . \end{aligned}$$
(13.103)

Adapting the logic that led to the conserved matter flux in Sect. 6, we introduce the matter space tensor \(\omega _{IJ}\) (associated with two-dimensional space orthogonal to the vortex world sheet), such that

$$\begin{aligned} \omega _{ab} = \psi ^A_a \psi ^B_b \omega _{AB} = \bar{\psi }^I_a \bar{\psi }^J_b \omega _{IJ} . \end{aligned}$$
(13.104)

Noting that (13.103) becomes

$$\begin{aligned} \partial _{[a}\omega _{bc]} = \bar{\psi }^I_a \bar{\psi }^J_b \bar{\psi }^K_c \partial _{[I} \omega _{JK]} = 0 , \end{aligned}$$
(13.105)

it follows that the condition holds as long as \(\omega _{IJ}\) only depends on the \(\chi ^I\) coordinates. It should (by now) be a familiar argument.

Next, we introduce Lagrangian perturbations such that

$$\begin{aligned} \varDelta \chi ^I = 0 \longrightarrow \delta \chi ^I = - \mathcal {L}_\xi \chi ^I , \end{aligned}$$
(13.106)

and we have

$$\begin{aligned} \varDelta \omega _{ab} = 0 . \end{aligned}$$
(13.107)

Again leaving out the metric variations, we have

$$\begin{aligned} \delta \tilde{\omega }^{ab} = {1\over 2} \epsilon ^{abcd} \delta \omega _{cd} = - \xi ^c \nabla _c \tilde{\omega }^{ab} - \epsilon ^{abcd} \omega _{ed} \nabla _c \xi ^e , \end{aligned}$$
(13.108)

and, after a little bit of work, the middle term in (13.95) becomes

$$\begin{aligned} - {1\over 2} B_{ab} \delta \tilde{\omega }^{ab} = {3\over 2} \xi ^c \tilde{\omega }^{ab} \nabla _{[c} B_{ab]} + \nabla _c \left( \omega ^{ab} B_{ab} \xi ^c\right) . \end{aligned}$$
(13.109)

Here we have noted that (13.96) implies that

$$\begin{aligned} \nabla _a \tilde{\omega }^{ab} = 0 . \end{aligned}$$
(13.110)

Finally, we see that a variation with respect to \(\xi ^a\) leads to

$$\begin{aligned} {3\over 2} \tilde{\omega }^{ab} \nabla _{[c} B_{ab]} = {1\over 4} \epsilon ^{abde} \omega _{de} n_{cab} = n^d \omega _{dc} = 0 , \end{aligned}$$
(13.111)

and we recover the usual fluid equations. This completes the initial argument. The introduction of the Kalb–Ramond field shifts the focus onto the vorticity, which is associated with a two-dimensional subspace (replacing the usual three-dimensional matter space). The key point is that we arrive at fluid equations without explicitly associating the fluid flux \(n^a\) with the four-velocity \(u^a\).

13.6 String fluids

In order to form a complete picture—including connections with related problems—and develop the tools we need to make progress, it is useful to take a slight detour in the direction of string theory. The key point is that a one-dimensional string moving through spacetime traces out a two-dimensional world sheet. This world sheet is spanned by two vectors, one timelike (here taken to be the four velocity of the string, \(u^a\)) and one spacelike (intuitively, the tangent vector to the string, represented by \(\hat{\kappa }^a\)). These vectors are associated with two-dimensional coordinates (see footnote 27) such that \(x^a = x^a (\phi ^I)\), leading to the tangent surface element

$$\begin{aligned} S^{ab} = \bar{\epsilon }^{IJ} {\partial x^a \over \partial \phi ^I} {\partial x^b \over \partial \phi ^J} , \end{aligned}$$
(13.112)

with \(\bar{\epsilon }^{IJ}\) the (normalized) two-dimensional Levi-Civita tensor (density), representing the measure tensor for the two-dimensional surface orthogonal to the vortex world sheet.

Associated with this world sheet we have a bivector (read: an anti-symmetric tensor of rank 2), to be denoted \(\varSigma ^{ab}\). This object can be expressed in terms of the linearly independent vectors that span the surface; as the bivector spans a surface, it is natural to think of it as a contravariant object. Noting that a simple timelike bivector can be written as the alternating product of a timelike and a spacelike vector (Stachel 1980) (such that its dual will be a simple spacelike bivector) and assuming the normalisation

$$\begin{aligned} \varSigma _{ab} \varSigma ^{ab} = -2 , \end{aligned}$$
(13.113)

we may use

$$\begin{aligned} \varSigma ^{ab} = u^a \hat{\kappa }^b - u^b \hat{\kappa }^a , \end{aligned}$$
(13.114)

such that

$$\begin{aligned} \hat{\kappa }^a = \varSigma ^{ab} u_b . \end{aligned}$$
(13.115)

The projection into the two-dimensional space spanned by \(u^a\) and \(\hat{\kappa }^a\) is then given by

$$\begin{aligned} \varSigma ^{ac} \varSigma _{cb} = \hat{\kappa }^a \hat{\kappa }_b - u^a u_b . \end{aligned}$$
(13.116)

Introducing the dual

$$\begin{aligned} \tilde{\varSigma }_{ab} = {1\over 2} \epsilon _{abcd} \varSigma ^{cd} = \epsilon _{abcd} u^c \hat{\kappa }^d , \end{aligned}$$
(13.117)

we also have the orthogonal projection

$$\begin{aligned} \tilde{\perp }^a_{\ b} = \tilde{\varSigma }^{ac} \tilde{\varSigma }_ {cb} = \delta ^a_b + u^a u_b - \hat{\kappa }^a \hat{\kappa }_b , \end{aligned}$$
(13.118)

and we see that

$$\begin{aligned} \tilde{\varSigma }_{ab} \varSigma ^{bc} = 0 . \end{aligned}$$
(13.119)

In fact, this result follows immediately from the condition that the bivector is simple:

$$\begin{aligned} \varSigma ^{[ab} \varSigma ^{c]d} = 0 \quad \Leftrightarrow \quad \varSigma ^{ab} \varSigma ^{cd} \epsilon _{abce} = 0 . \end{aligned}$$
(13.120)

Finally, the bivector is surface forming, as long as (Stachel 1980)

$$\begin{aligned} \tilde{\varSigma }_{ab} \nabla _c \varSigma ^{bc} = \tilde{\varSigma }_{ab} \partial _c \varSigma ^{bc} = 0 . \end{aligned}$$
(13.121)
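
These algebraic properties are straightforward to check with explicit numbers. The sketch below builds \(\varSigma ^{ab}\) from a boosted \(u^a\) and an orthogonal unit spacelike \(\hat{\kappa }^a\) in flat space (the signature \(-+++\) and the convention \(\epsilon _{0123}=+1\) are assumptions consistent with the normalisations used here) and verifies (13.113), (13.115), (13.116) and (13.119).

```python
# Numerical check of the bivector algebra (13.113)-(13.119) in flat space,
# signature (-,+,+,+). The particular u^a and khat^a below are arbitrary.
import numpy as np
from itertools import permutations

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

vx = 0.3
gam = 1.0 / np.sqrt(1.0 - vx**2)
u = gam * np.array([1.0, vx, 0.0, 0.0])    # timelike, u.u = -1
k = gam * np.array([vx, 1.0, 0.0, 0.0])    # spacelike, k.k = +1, u.k = 0

Sig    = np.outer(u, k) - np.outer(k, u)   # Sigma^{ab}, cf. (13.114)
Sig_dn = eta @ Sig @ eta                   # Sigma_{ab}

# Levi-Civita symbol (sign convention eps_{0123} = +1 assumed)
eps = np.zeros((4, 4, 4, 4))
for p in permutations(range(4)):
    eps[p] = np.linalg.det(np.eye(4)[list(p)])
Sig_dual = 0.5 * np.einsum('abcd,cd->ab', eps, Sig)    # dual two-form, cf. (13.117)

print(np.isclose(np.einsum('ab,ab', Sig_dn, Sig), -2.0))            # (13.113)
print(np.allclose(Sig @ (eta @ u), k))                              # (13.115)
print(np.allclose(Sig @ Sig_dn,
                  np.outer(k, eta @ k) - np.outer(u, eta @ u)))      # (13.116)
print(np.allclose(Sig_dual @ Sig, np.zeros((4, 4))))                 # (13.119)
```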

With this set up, we may take the bivector to be proportional to the surface element. Letting

$$\begin{aligned} \varSigma ^{ab} = \alpha ^{-1/2} S^{ab} , \end{aligned}$$
(13.122)

we have

$$\begin{aligned} \varSigma ^{IJ} = \alpha ^{-1/2} S^{IJ} = \alpha ^{-1/2} \bar{\epsilon }^{IJ} . \end{aligned}$$
(13.123)

Making use of the induced metric (which we also use to raise and lower indices in the two-dimensional subspace)

$$\begin{aligned} \gamma _{IJ} = g_{ab} {\partial x^a \over \partial \phi ^I} {\partial x^b \over \partial \phi ^J} , \end{aligned}$$
(13.124)

we have

$$\begin{aligned} \gamma _{IK} \gamma _{JL} \bar{\epsilon }^{IJ} \bar{\epsilon }^{KL} = - 2 \alpha , \end{aligned}$$
(13.125)

and hence we identify

$$\begin{aligned} \alpha = - \gamma = - \mathrm {det}\ \gamma _{IJ} . \end{aligned}$$
(13.126)

That is, we arrive at

$$\begin{aligned} \varSigma ^{ab} = \sqrt{-\gamma } {\partial x^a \over \partial \phi ^I} {\partial x^b \over \partial \phi ^J} \bar{\epsilon }^{IJ} . \end{aligned}$$
(13.127)

Geometrically, the dual of \(\varSigma ^{ab}\) is a two-form that represents (when integrated) the flux carried by vortices (string) across a surface in spacetime. The variable \(\gamma \) is a measure of this flux.

Let us now assume that the Lagrangian of the system depends on \(\gamma \) (note that \(\gamma \) is formally treated as a variational quantity even though its value is constrained), with

$$\begin{aligned} \gamma = {1\over 2} \varSigma ^{ab} \varSigma _{ab} = {1\over 2} \varSigma ^{IJ} \varSigma _{IJ} = -1 . \end{aligned}$$
(13.128)

Moreover, as we want to compare to a model based on averaging over a set of vortices—treated as a fluid described by a small number of fields (density, velocity, tension etcetera), an idea that applies to a variety of systems from polymers to cosmic strings—it is natural to consider the analogous example of a coarse-grained “string fluid” (Schubring and Vanchurin 2014, 2015; Schubring 2015). In effect, we take \(\sqrt{-g} \varLambda (\gamma )\) to be the matter contribution to the action. Further, if we let \(\varLambda = M \sqrt{-\gamma }\) this leads to the coarse-grained version of the standard Nambu–Goto string action (Letelier 1979; Vilenkin and Shellard 1994), with M the string tension.

For the stress-energy tensor we now need

$$\begin{aligned} \delta \varLambda= & {} {d\varLambda \over d\gamma } \left( {\partial \gamma \over \partial \varSigma ^{ab} } \delta \varSigma ^{ab} + {\partial \gamma \over \partial g_{ab}} \delta g_{ab} \right) \nonumber \\= & {} {d\varLambda \over d\gamma } \left( \varSigma _{ab} \delta \varSigma ^{ab} + \varSigma _c^{\ a} \varSigma ^{cb} \delta g_{ab} \right) , \end{aligned}$$
(13.129)

which leads to

$$\begin{aligned} T^{ab} = \varLambda g^{ab} + 2 {\delta \varLambda \over \delta g_{ab}} = \varLambda g^{ab} + 2 {d\varLambda \over d\gamma } \varSigma _c^{\ a} \varSigma ^{cb} . \end{aligned}$$
(13.130)

From this it follows that the equations of motion are

$$\begin{aligned} \nabla _a T^{ab} = g^{ab} \nabla _a \varLambda + 2 \varSigma _c^{\ a} \varSigma ^{cb} \nabla _a \left( {d\varLambda \over d\gamma } \right) + 2 {d\varLambda \over d\gamma } \nabla _a \left( \varSigma ^a_{\ c} \varSigma ^{cb}\right) = 0 . \end{aligned}$$
(13.131)

However, we have

$$\begin{aligned} \nabla _a \varLambda = {d\varLambda \over d\gamma } \nabla _a \gamma = 0 , \end{aligned}$$
(13.132)

and

$$\begin{aligned} \nabla _a \left( {d\varLambda \over d\gamma } \right) = \left( {d^2\varLambda \over d\gamma ^2} \right) \nabla _a \gamma = 0 , \end{aligned}$$
(13.133)

since \(\gamma =-1\). This means that we have

$$\begin{aligned}&\nabla _a \left( \varSigma ^a_{\ c} \varSigma ^{cb}\right) = \varSigma ^{cb} \nabla _a \varSigma ^a_{\ c} + {1 \over 2} \varSigma _{ca} \left( \nabla ^a \varSigma ^{cb} +\nabla ^c \varSigma ^{ba} + \nabla ^b \varSigma ^{ca} \right) \nonumber \\&\quad = \varSigma ^{cb} \nabla _a \varSigma ^a_{\ c} + 3 \varSigma _{ca} \nabla ^{[a} \varSigma ^{cb]} =0 , \end{aligned}$$
(13.134)

where we have used (13.113). Following Stachel (1980), we contract with \(\varSigma _{db}\) to get

$$\begin{aligned} \varSigma _{db} \varSigma ^{cb} \nabla _a \varSigma ^a_{\ c} + 3 \varSigma _{[ac} \varSigma _{b]d} \nabla ^{[a} \varSigma ^{cb]} =0 , \end{aligned}$$
(13.135)

where the second term vanishes since the bivector is simple, cf. (13.120). Noting also that

$$\begin{aligned} \varSigma _{db} \varSigma ^{cb} \nabla _a \varSigma ^a_{\ c} = 0 \Longrightarrow \varSigma _{dc} \nabla _a \varSigma ^{ac} =0 , \end{aligned}$$
(13.136)

and considering (13.121), we infer the conservation law (Stachel 1980; Schubring and Vanchurin 2015)

$$\begin{aligned} \nabla _a \varSigma ^{ab} = 0 . \end{aligned}$$
(13.137)

Basically, if the contractions of a vector with both the bivector and the dual vanish then the vector must itself be zero. Returning to the equations of motion, we are left with

$$\begin{aligned} \varSigma ^a_{\ c} \nabla _a \varSigma ^{cb} = 0, \end{aligned}$$
(13.138)

or

$$\begin{aligned} \perp ^c_b \left( \hat{\kappa }^a \nabla _a \hat{\kappa }^b - u^a \nabla _a u^b \right) = 0 . \end{aligned}$$
(13.139)

This is the simplest version of the model and it is all we need for now. Still, it is interesting to note extensions like the dissipative case considered in Schubring and Vanchurin (2015) and the discussion of charged cosmic strings in Carter (1989b).

Before we move on, let us establish two useful results. First of all, we have

$$\begin{aligned} \hat{\kappa }^a = \varSigma ^{ab} u_b \Longrightarrow \nabla _a \hat{\kappa }^a + u^a u_b \nabla _a \hat{\kappa }^b = u_b \nabla _a \varSigma ^{ab} = 0 , \end{aligned}$$
(13.140)

by virtue of (13.137). Similarly

$$\begin{aligned} u^a = \varSigma ^{ab} \hat{\kappa }_b \Longrightarrow \nabla _a u^a - \hat{\kappa }^a \hat{\kappa }_b \nabla _a u^b = \hat{\kappa }_b \nabla _a \varSigma ^{ab} = 0 . \end{aligned}$$
(13.141)

These will be required later.

13.7 Vortex dynamics

A natural extension to the fluid model allows \(\varLambda \) to depend on both \(n_{abc}\) and \(\omega _{ab}\) from the outset. Starting from \(\varLambda = \varLambda (n_{abc}, \omega _{ab}, g^{ab})\) we immediately have

$$\begin{aligned} \delta \varLambda = -{1\over 3!} \mu ^{abc} \delta n_{abc} - {1\over 2} \lambda ^{ab} \delta \omega _{ab} + {\delta \varLambda \over \delta g^{ab}} \delta g^{ab} , \end{aligned}$$
(13.142)

where

$$\begin{aligned} \lambda ^{ab} = - 2 {\partial \varLambda \over \partial \omega _{ab} } . \end{aligned}$$
(13.143)

From (13.90) it then follows that (ignoring the metric variation and the surface term, as before)

$$\begin{aligned} \delta \bar{\varLambda } = {1\over 2} \left( \nabla _{c} \mu ^{cab}- \tilde{\omega }^{ab} \right) \delta B_{ab} - {1\over 2} \left( \lambda ^{cd} + {1\over 2} \epsilon ^{abcd} B_{ab} \right) \delta \omega _{cd} , \end{aligned}$$
(13.144)

which leads us back to (13.95) and (13.96). However, we now have an additional term involving \(\delta \omega _{ab}\). Making use of (13.103), this new term can be written

$$\begin{aligned} -{1\over 2} \lambda ^{cd} \delta \omega _{cd} = {1\over 2} \lambda ^{cd}\left( \xi ^a \nabla _a \omega _{cd} + 2 \omega _{ad} \nabla _c \xi ^a \right) = - \xi ^a \omega _{ad} \nabla _c \lambda ^{cd} . \end{aligned}$$
(13.145)

Combining this with the result from the previous section, we see that a variation with respect to the displacement leads to (see Carter 1994, 2000; Carter and Langlois 1995b)

$$\begin{aligned} n^a \omega _{ab} = \omega _{ab} \nabla _c \lambda ^{ca} = - 2 \omega _{ab} \nabla _c \left( {\partial \varLambda \over \partial \omega _{ca} } \right) . \end{aligned}$$
(13.146)

The explicit dependence on the vorticity has led to amended equations of motion. In order to interpret the term on the right-hand side of (13.146) we, first of all, note that we may write (13.146) as

$$\begin{aligned} \left[ n^a + 2 \nabla _c \left( {\partial \varLambda \over \partial \omega _{ca} } \right) \right] \omega _{ab} \equiv \bar{n}^a \omega _{ab} = 0 , \end{aligned}$$
(13.147)

with

$$\begin{aligned} \bar{n}^a = n^a + 2 \nabla _c \left( {\partial \varLambda \over \partial \omega _{ca} } \right) . \end{aligned}$$
(13.148)

This makes the result appear more “familiar”, but it does not really help us understand the contributions to (13.146).

Let us dig deeper. Consider the implications of the two-dimensional matter space we introduced for the vorticity, see Fig. 15. Intuitively, the idea makes sense for a collection of (locally) aligned quantized vortices as one can always introduce a two-dimensional surface orthogonal to the vortex array. Points in this surface are described by the \(\chi ^I\) coordinates. Not surprisingly, we can adapt the logic from the usual matter-space construction to this new setting—although in doing so we focus on the map from the original three-dimensional space to the two-dimensional one. As is evident from (13.105), we also need the map from spacetime to either low-dimensional space. The original fluid derivation involved

$$\begin{aligned} \psi ^A_b \psi ^a_A = \perp ^a_{\ b} , \end{aligned}$$
(13.149)

while the corresponding map to the two-dimensional stage takes the form

$$\begin{aligned} \hat{\psi }^I_B \hat{\psi }^A_I = \delta ^A_B - \hat{\kappa }^A \hat{\kappa }_B , \end{aligned}$$
(13.150)

with a suitable spatial unit vector \(\hat{\kappa }^a\), automatically orthogonal to the four velocity \(u^a\) since

$$\begin{aligned} u^a \hat{\kappa }_a = (u^a \psi ^A_a) \hat{\kappa }^A = 0 . \end{aligned}$$
(13.151)

We will take the new vector \(\hat{\kappa }^a\) to be normal to the area spanned by the \(\chi ^I\) coordinates (and identify it with the spacelike coordinate used to describe the string world sheet). That is, we have

$$\begin{aligned} \hat{\kappa }^A \hat{\psi }^I_A = 0 . \end{aligned}$$
(13.152)

In essence, \(\hat{\kappa }^A\) is aligned with the quantized vortices. It also follows that

$$\begin{aligned} \bar{\psi }^I_a \bar{\psi }^b_I= & {} ( \psi ^A_a \hat{\psi }^I_A) ( \psi ^b_B \hat{\psi }^B_I) = \psi ^A_a \psi ^b_B (\delta _A^B - \hat{\kappa }_A \hat{\kappa }^B )\nonumber \\= & {} \delta _a^b+u_a u^b - \hat{\kappa }_a \hat{\kappa }^b \equiv \tilde{\perp }^b_a . \end{aligned}$$
(13.153)

Turning to the vorticity, it is natural to introduce a vector

$$\begin{aligned} W^A = {1\over 2} \epsilon ^{ABC} \omega _{BC} \longrightarrow \omega _{AB} = \epsilon _{ABC} W^C . \end{aligned}$$
(13.154)

In spacetime, we then have the vorticity vector

$$\begin{aligned} W^a = {1\over 2} \psi ^a_A \epsilon ^{ABC} \omega _{BC} = {1\over 2} \psi ^a_A \psi ^b_B \psi ^c_C \epsilon ^{ABC} \omega _{bc} = {1\over 2} u_d \epsilon ^{dabc} \omega _{bc} , \end{aligned}$$
(13.155)

which is simply related to the dual:

$$\begin{aligned} W^a = u_d \tilde{\omega }^{da} . \end{aligned}$$
(13.156)

We may also work in the two-dimensional space, where it makes sense to let

$$\begin{aligned} \omega _{IJ} = \mathcal {N} \kappa \epsilon _{IJ} \longrightarrow \omega _{AB} = \mathcal {N} \kappa \epsilon _{AB} , \end{aligned}$$
(13.157)

where \(\epsilon _{IJ}\) is the surface measure tensor associated with the vortex world sheet (not to be confused with \(\bar{\epsilon }^{IJ}\) from before), with

$$\begin{aligned} \epsilon _{IJ} \epsilon ^{JK}= & {} \delta _I^K , \end{aligned}$$
(13.158)
$$\begin{aligned} \epsilon _{IJ} \epsilon ^{IJ}= & {} 2 , \end{aligned}$$
(13.159)

and

$$\begin{aligned} \epsilon _{AB} = \hat{\kappa }^C \epsilon _{CAB} . \end{aligned}$$
(13.160)

Letting \(\kappa ^A = \kappa \hat{\kappa }^A\), we now have

$$\begin{aligned} \omega _{AB} = \mathcal {N} \kappa ^C \epsilon _{CAB} , \end{aligned}$$
(13.161)

so

$$\begin{aligned} \kappa ^A \omega _{AB} = 0 . \end{aligned}$$
(13.162)

In fact, we have

$$\begin{aligned} W^A = \mathcal {N} \kappa ^A . \end{aligned}$$
(13.163)

The interpretation of this is intuitive—we have a collection of vortices, each associated with a quantum \(\kappa \) of circulation—with number density (per unit area) \(\mathcal {N} \). It is also worth noting the close resemblance to the various relations for \(n_{ABC}\) from Sect. 6. We also have

$$\begin{aligned} W^2= & {} (\mathcal {N} \kappa )^2 = {1\over 2} \omega _{IJ} \omega ^{IJ} = {1\over 2} \omega _{AB} \omega ^{AB} \nonumber \\= & {} {1\over 2} \omega _{ab} \omega ^{ab} = {1\over 2} g^{ac} g^{bd} \omega _{ab} \omega _{cd} \end{aligned}$$
(13.164)

Finally, the spacetime vorticity takes the (expected) form

$$\begin{aligned} \omega _{ab} = \mathcal {N} u^c \kappa ^d \epsilon _{cdab} . \end{aligned}$$
(13.165)

We also have

$$\begin{aligned} \mathcal {L}_u \kappa _a= & {} \mathcal {L}_u \left( \psi _a^A \kappa _A \right) = \psi _a^A \mathcal {L}_u \kappa _A = \psi ^A_a u^c \partial _c \kappa _A\nonumber \\= & {} \psi ^A_a (u^c \psi ^B_c) {\partial \kappa _A\over \partial X^B } = 0 , \end{aligned}$$
(13.166)
$$\begin{aligned} u^b \nabla _b \mathcal {N}= & {} (u^b \bar{\psi }^I_b) {\partial \mathcal {N} \over \partial \chi ^I} = 0 , \end{aligned}$$
(13.167)

as well as

$$\begin{aligned} \kappa ^a \nabla _a \mathcal {N}= & {} \kappa ^a \bar{\psi }^I_a {\partial \mathcal {N} \over \partial \chi ^I} = \psi _A^a \kappa ^A \psi ^B_a \hat{\psi }^I_B {\partial \mathcal {N} \over \partial \chi ^I} \nonumber \\= & {} \kappa ^A \delta _A^B \hat{\psi }^I_B {\partial \mathcal {N} \over \partial \chi ^I} = \kappa ^A \hat{\psi }^I_A {\partial \mathcal {N} \over \partial \chi ^I} = 0 . \end{aligned}$$
(13.168)

These results are quite intuitive. It is worth noting that

$$\begin{aligned} (u^a u_b - \hat{\kappa }^a \hat{\kappa }_b ) \nabla _a \mathcal {N} = 0 , \end{aligned}$$
(13.169)

and we also need to recall (13.140) and (13.141).

Let us now return to the equations of motion (13.146). If we consider an explicit model where \(\varLambda = \varLambda (n^2 , \mathcal {N} ^2)\), we have

$$\begin{aligned} {\partial \varLambda \over \partial \omega _{ab} } = {\partial \varLambda \over \partial \mathcal {N} ^2} {\partial \mathcal {N} ^2 \over \partial \omega _{ab} }= {\partial \varLambda \over \partial \mathcal {N} ^2} \omega ^{ab} = -{1\over 2} \lambda ^{ab} , \end{aligned}$$
(13.170)

and we arrive at

$$\begin{aligned} n^a\omega _{ab} = - {2 \over \kappa ^2} \omega _{ab} \nabla _c \left( {\partial \varLambda \over \partial \mathcal {N} ^2} \omega ^{ca}\right) = - {1 \over \kappa } \omega _{ab} \nabla _c \left( {\partial \varLambda \over \partial \mathcal {N} } {1\over \mathcal {N} \kappa } \omega ^{ca}\right) . \end{aligned}$$
(13.171)

Making use of (13.165) we then have

$$\begin{aligned}&{1 \over \kappa } \omega _{ab} \nabla _c \left( {\partial \varLambda \over \partial \mathcal {N} } {1\over \mathcal {N} \kappa } \omega ^{ca}\right) \nonumber \\&\quad = - \mathcal {N} \perp ^a_b \left[ \nabla _a \left( {\partial \varLambda \over \partial \mathcal {N} } \right) - {\partial \varLambda \over \partial \mathcal {N} } \left( \hat{\kappa }^c \nabla _c \hat{\kappa }_a - u^c \nabla _c u_a \right) \right] . \end{aligned}$$
(13.172)

Here it is worth noting that \(-\partial \varLambda /\partial \mathcal {N} \) is naturally interpreted as the energy per vortex (assuming that all vortices carry the same circulation and that the averaged energy is simply proportional to the vortex density). It is straightforward to make a connection with the “thin vortex” limit considered by Carter (2000), but we will not do so here.
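
As a minimal illustration (not intended as a realistic model), suppose we add to an ordinary fluid contribution a term linear in the vortex density,

$$\begin{aligned} \varLambda (n^2, \mathcal {N}^2) = \varLambda _0 (n^2) - \mathcal {E}_V \mathcal {N} , \qquad \mathcal {N} = \left( \mathcal {N}^2 \right) ^{1/2} , \end{aligned}$$

where the constant \(\mathcal {E}_V\) is introduced only for this example. Then \(-\partial \varLambda /\partial \mathcal {N} = \mathcal {E}_V\) and the averaged vortex contribution to the energy is \(\mathcal {E}_V \mathcal {N} \), proportional to the number of vortices per unit area, as anticipated.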

Suppose that we also introduce a four-velocity associated with the matter flux, i.e. let

$$\begin{aligned} n^a = n u_\mathrm {n}^a , \end{aligned}$$
(13.173)

such that (as usual)

$$\begin{aligned} u_\mathrm {n}^a = \gamma ( u^a + v^a) , \quad u^a v_a = 0 , \quad \gamma = (1-v^2)^{-1/2} , \end{aligned}$$
(13.174)

We then have

$$\begin{aligned} n^a \omega _{ab} =n \gamma \mathcal {N} v^a \kappa ^d \epsilon _{dab} = n \gamma \mathcal {N} \epsilon _{bac} \kappa ^a v^c , \end{aligned}$$
(13.175)

which represents the Magnus force that acts on a set of vortices moving relative to a superfluid condensate (represented by \(n^a\)), cf. Eq. (13.81). Also recognizing the surface tension associated with the vortex world sheet, we have the final equations of motion

$$\begin{aligned} \underbrace{n \gamma \epsilon _{bac} \kappa ^a v^c}_\mathrm {Magnus\ force} = \perp ^a_b \left[ \nabla _a \left( {\partial \varLambda \over \partial \mathcal {N}} \right) - \underbrace{ {\partial \varLambda \over \partial \mathcal {N}} \hat{\kappa }^c \nabla _c \hat{\kappa }_a + {\partial \varLambda \over \partial \mathcal {N}} u^c \nabla _c u_a }_\mathrm {surface\ tension} \right] . \end{aligned}$$
(13.176)

For completeness, we should also work out the stress-energy tensor for this model. This is fairly straightforward. With \(\varLambda = \varLambda (n^2, \mathcal {N}^2) = \varLambda (n_{abc}, \omega _{ab}, g^{ab})\) we need

$$\begin{aligned} {\partial \varLambda \over \partial \mathcal {N}^2 } \delta \mathcal {N}^2 ={1\over 2 \mathcal {N} \kappa ^2 } {\partial \varLambda \over \partial \mathcal {N}} \left( g^{cd} \omega _{ca} \omega _{db} \delta g^{ab} + \omega ^{bd} \delta \omega _{bd} \right) , \end{aligned}$$
(13.177)

leading to a contribution (using (13.165))

$$\begin{aligned} {\partial \varLambda \over \partial \mathcal {N}^2 } {\delta \mathcal {N}^2 \over \delta g^{ab}} = {1\over 2} \mathcal {N} {\partial \varLambda \over \partial \mathcal {N}} \perp _{ab} . \end{aligned}$$
(13.178)

Combining this with the previous (fluid) result, we have

$$\begin{aligned} T_{ab} = \left( \varLambda - n^c \mu _c \right) g_{ab} + n_a \mu _b - \mathcal {N} {\partial \varLambda \over \partial \mathcal {N}} \perp _{ab} . \end{aligned}$$
(13.179)

A direct calculation verifies that the divergence of this expression leads us back to (13.176).

We can extend the vortex model—following the steps from the Newtonian case—to account for mutual friction (Andersson et al. 2016). We may also consider the implications of the long-range nature of the vortex-vortex interaction, which implies that the vortex lattice has elastic properties (Baym and Chandler 1983; Chandler and Baym 1986; Andersson et al. 2020). In principle, this means that the vortex lattice supports a set of elastic oscillation modes known as Tkachenko modes (Sonin 2014). These were first proposed in the 1960s (Tkachenko 1966a, b), and have been discussed for superfluid helium, superfluid atomic condensates (Anglin and Crescimanno 2002; Fetter 2009) and neutron stars (Ruderman 1970; Noronha and Sedrakian 2008; Haskell 2011). The experimental verification of the idea is, however, quite recent (Coddington et al. 2003).

14 Perspectives on electromagnetism

Magnetic fields are ubiquitous in the Universe—electricity and magnetism are of obvious importance to our everyday existence, and electromagnetism also plays a crucial role in astrophysics. In the context of general relativistic fluid dynamics, we are particularly interested in situations where strong gravity couples to charged flows. A typical example of such a problem would be two magnetized neutron stars crashing together at the end of a slow inspiral driven by the emission of gravitational radiation (Baiotti and Rezzolla 2017). Another interesting problem concerns ultra-relativistic jets associated with active galactic nuclei (and some stellar mass objects, as well), thought to be generated by the spin of the central object (via the so-called Blandford–Znajek mechanism; Blandford and Znajek 1977; MacDonald and Thorne 1982). Neutron stars come into focus as the strongest known magnetic fields (above \(10^{14}\) G) are found in a subclass aptly referred to as magnetars (Thompson and Duncan 1993; Woods and Thompson 2006), systems that also form the largest (and hottest!) known superconductors (Page et al. 2011; Shternin et al. 2011). Magnetic fields are equally relevant on the vastly larger scale of entire galaxies, and are likely to have played a role in the early Universe as well (Ellis 1973; Ellis and van Elst 1999; Barrow et al. 2007). These are just a few—fairly obvious—examples that illustrate why we need to develop an understanding of the interaction between charged fluids (generating and maintaining the electromagnetic field) and relativistic gravity.

14.1 The Lorentz force

We laid the foundation for the covariant description of electromagnetism in Sect. 4.3 (see also Hobson et al. 2006). Starting from a suitable Lagrangian that couples the vector potential \(A_a\) (in the form of the Faraday tensor \(F_{a b}\)) to the four-current \(j^a\), we established that the electromagnetic field is governed by

$$\begin{aligned} \nabla _b F^{a b} = \mu _0 j^a . \end{aligned}$$
(14.1)

Moreover, since \(F_{a b}\) is obtained from a vector potential, \(F_{a b} = 2 \nabla _{[a} A_{b]}\), it will automatically satisfy

$$\begin{aligned} \nabla _{[c} F_{a b]} = 0 . \end{aligned}$$
(14.2)

However, up to this point we had to take the claim that these equations describe electromagnetism on faith. In order for the model to make more intuitive sense, we need to make contact with the standard description in terms of the electric and magnetic fields and Maxwell’s equations.

This exercise is, in principle, straightforward, but at the same time one must tread carefully. In order to be consistent, we need to be mindful of the units of the various quantities involved. Unfortunately, the issue of units is somewhat thorny in electromagnetism. The underlying reason for this is that the theory involves two “coupling constants”, which we will call \(\mu _0\) and \(\epsilon _0\). We have already seen the first of these, and we know that it represents the strength of the coupling between the field and the current. As we will soon see, the second of the two coefficients represents the coupling to the charge density. The two coefficients combine in such a way that \(\mu _0 \epsilon _0 = 1/c^2\), defining the speed of light. However, splitting this “constraint” involves an element of choice, which leads to different (perfectly consistent) sets of units. In fact, in his celebrated textbook Jackson (1975) makes the point that the split between the two constants is, essentially, a matter of convention. In the following, we will opt to work in (what is essentially) SI units, occasionally providing the “translation” to the Gaussian units that are common in astrophysics.
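
As a simple numerical sanity check of this constraint (a sketch in Python, using the SI values of the two constants):

```python
import math

# SI values of the electromagnetic coupling constants
mu_0 = 4.0e-7 * math.pi        # magnetic constant [N A^-2] (pre-2019 exact value)
epsilon_0 = 8.8541878128e-12   # electric constant [F m^-1]

# The combination mu_0 * epsilon_0 = 1/c^2 returns the speed of light
c = 1.0 / math.sqrt(mu_0 * epsilon_0)
print(f"c = {c:.4e} m/s")      # -> roughly 2.9979e8 m/s
```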

Another issue that makes the problem non-trivial arises from the fundamental principle of electromagnetism: varying electric fields generate magnetic fields and vice versa. This implies that the decomposition into electric and magnetic fields must be observer dependent. If two observers move in different ways then they will observe different charge currents and therefore different fields.

According to an observer moving with four-velocity \(U^a\), the Faraday tensor takes the form

$$\begin{aligned} F_{ab} = 2 U_{[a} E_{b]} + \epsilon _{abcd}U^c B^d . \end{aligned}$$
(14.3)

This defines the electric and magnetic fields as

$$\begin{aligned} E_a = - U^b F_{ba} , \end{aligned}$$
(14.4)

and

$$\begin{aligned} B_a = - U^b \left( {1 \over 2} \epsilon _{abcd}F^{cd}\right) . \end{aligned}$$
(14.5)

The physical fields are both orthogonal to \(U^a\), so each has three components, just as in non-relativistic physics.
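
As a concrete flat-space illustration (a hedged sketch; the numerical field values are arbitrary and chosen only for the test), the following Python snippet builds \(F_{ab}\) from given fields using (14.3) and recovers them through (14.4) and (14.5):

```python
import numpy as np
from itertools import permutations

# Minkowski metric, signature (-,+,+,+), and the volume form with eps_{0123} = +1
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eps = np.zeros((4, 4, 4, 4))
for p in permutations(range(4)):
    sign, q = 1, list(p)
    for i in range(4):
        for j in range(i + 1, 4):
            if q[i] > q[j]:
                sign = -sign
    eps[p] = sign

# Observer at rest; E and B are arbitrary spatial vectors orthogonal to U
U_up = np.array([1.0, 0.0, 0.0, 0.0])
U_dn = eta @ U_up
E_dn = np.array([0.0, 0.3, -0.1, 0.2])
B_dn = np.array([0.0, 0.5, 0.4, -0.7])
B_up = eta @ B_dn

# Faraday tensor, Eq. (14.3): F_ab = U_a E_b - U_b E_a + eps_abcd U^c B^d
F_dn = (np.outer(U_dn, E_dn) - np.outer(E_dn, U_dn)
        + np.einsum('abcd,c,d->ab', eps, U_up, B_up))

# Recover the fields: E_a = -U^b F_ba (14.4) and B_a = -(1/2) U^b eps_abcd F^cd (14.5)
E_rec = -np.einsum('b,ba->a', U_up, F_dn)
F_up = eta @ F_dn @ eta
B_rec = -0.5 * np.einsum('b,abcd,cd->a', U_up, eps, F_up)

print(np.allclose(E_rec, E_dn), np.allclose(B_rec, B_dn))   # True True
```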

In the presence of a medium, we also need an expression for the charge current, and it is natural to decompose this in a similar way, in terms of the charge density \(\sigma \) and the (spatial) current \(J^a\) measured by the observer; namely,

$$\begin{aligned} j^a = \sigma U^a + J^a , \qquad \text{ where } \qquad J^a U_a = 0 . \end{aligned}$$
(14.6)

Intuitively, the electromagnetic field couples to the moving fluids through the Lorentz force. It is easy to see how this notion comes about. The overall stress-energy tensor for the system combines a “matter” part with the relevant electromagnetic contribution. The overall divergence has to vanish, as usual. This means that we can define the Lorentz force \(f^a_\mathrm {L}\) as

$$\begin{aligned} \nabla _b T^{b a}_\mathrm {fluid} = - \nabla _b T^{b a}_\mathrm {EM} \equiv f^a_\mathrm {L} . \end{aligned}$$
(14.7)

Making use of the explicit stress-energy tensor for the electromagnetic field from Sect. 4;

$$\begin{aligned} T_{ab}^\mathrm {EM} = {1\over \mu _0} \left[ g^{cd}F_{ac}F_{bd}-{1\over 4}g_{ab}\left( F_{cd}F^{cd}\right) \right] . \end{aligned}$$
(14.8)

we find that

$$\begin{aligned} f^a_\mathrm {L} = j_b F^{ab} . \end{aligned}$$
(14.9)

Alternatively, making use of the decomposition into the electric and magnetic fields, we have

$$\begin{aligned} f^a_\mathrm {L} = \sigma E^a + \epsilon ^{a b c d} J_b U_c B_d + U^a \left( J_b E^b\right) . \end{aligned}$$
(14.10)

This exercise prompts a fundamental question. What exactly is the current \(j^a\)? Intuitively, we know the answer. A net current results from different charged components flowing relative to one another. However, the single-fluid picture that we have considered so far (with a single observer) does not capture this aspect. It only provides the final result, which is the charge current that is required to source the electromagnetic field. In order to understand the physics, we need to consider a system of coupled charged fluids. It is natural to do this by extending the variational approach to account for charged flows. Fortunately, this is straightforward and we will do this shortly. However, before going in this direction, let us convince ourselves that we have (indeed) a formulation that leads back to Maxwell’s equations.

14.2 Maxwell in the fluid frame

As a step towards making contact with applications, it is useful to consider the form of Maxwell’s equations in the fluid frame. That is, we introduce a fibration of spacetime associated with the fluid four velocity \(u^a\) (again, as in the discussion of the stress-energy tensor in Sect. 6). This leads to the formulation that is commonly used to discuss electromagnetism, especially in cosmology (Ellis 1973; Ellis and van Elst 1999; Barrow et al. 2007).

In order to write down Maxwell’s equations it is useful to introduce the general decomposition

$$\begin{aligned} \nabla _a u_b = \sigma _{ab} + \varpi _{ab} - u_a \dot{u}_b + {1 \over 3} \theta \perp _{ab} , \end{aligned}$$
(14.11)

where the co-moving time derivative leads to the four acceleration

$$\begin{aligned} \dot{u}^a = u^b \nabla _b u^a , \end{aligned}$$
(14.12)

(and similarly for other variables in the following). We also have the expansion scalar

$$\begin{aligned} \theta = \nabla _a u^a , \end{aligned}$$
(14.13)

the shear

$$\begin{aligned} \sigma _{ab} = \bar{D}_{\langle a}u_{b\rangle } , \end{aligned}$$
(14.14)

where the angle brackets indicate symmetrization and trace removal (as in (12.39)), and

$$\begin{aligned} \bar{D}_a u_b = \perp _a^{\ c} \perp _{b}^{\ d} \nabla _c u_d , \end{aligned}$$
(14.15)

is the fibration equivalent of the totally projected derivative we already introduced for spacetime foliations. The merit of using this (totally projected) derivative is that the individual terms in (14.11) are perpendicular to \(u^a\). We have also defined the vorticity

$$\begin{aligned} \varpi _{ab} = \bar{D}_{ [a}u_{b]} . \end{aligned}$$
(14.16)

Making use of these quantities, we find that (14.1) and (14.6) (with \(U^a\rightarrow u^a\)) lead to

$$\begin{aligned} \perp ^{ab} \nabla _b e_a = \nabla _a e^a - u_a \dot{e}^a = \mu _0 \sigma + \bar{\epsilon }^{abc} \varpi _{ab} b_c = \mu _0 \sigma + 2W^a b_a, \end{aligned}$$
(14.17)

where we use \(e^a\) and \(b^a\) for the electric and magnetic field in the fluid frame, respectively, in order to avoid confusion later. We have also defined the vector

$$\begin{aligned} W^{a} = {1 \over 2} \bar{\epsilon }^{abc} \varpi _{bc} , \quad \text{ so } \text{ that } \quad \varpi _{ab} = \bar{\epsilon }_{abc}W^c , \quad \text{ and } \quad u^a W_a = 0 , \end{aligned}$$
(14.18)

where

$$\begin{aligned} \bar{\epsilon }_{abc} = u^d \epsilon _{dabc} . \end{aligned}$$
(14.19)

Next we get

$$\begin{aligned} \perp _{ab}\dot{e}^b - \bar{\epsilon }_{abc} \bar{D}^b b^c + \mu _0 J_a = \left( \sigma _{ab} -\varpi _{ab} - {2\over 3} \theta \perp _{ab}\right) e^b + \bar{\epsilon }_{abc}\dot{u}^b b^c . \end{aligned}$$
(14.20)

The second set of equations follows from

$$\begin{aligned} \nabla _{[a}F_{bc]} = 0 , \end{aligned}$$
(14.21)

which leads to

$$\begin{aligned} \perp ^{ab}\nabla _b b_a = \bar{D}_a b^a = - 2W^a e_a, \end{aligned}$$
(14.22)

and

$$\begin{aligned} \perp _{ab}\dot{b}^b + \bar{\epsilon }_{abc}\bar{D}^b e^c = - \bar{\epsilon }_{abc}\dot{u}^b e^c + \left( \sigma _{ab} -\varpi _{ab} - {2\over 3} \theta \perp _{ab}\right) b^b . \end{aligned}$$
(14.23)

It is easy to see that, if we consider an inertial observer (simply ignoring all derivatives of the four velocity), these results reduce to the text-book form of Maxwell’s equations. The complete expressions given here are, however, useful as they highlight the coupling between the electromagnetic field and a given fluid flow (with shear, vorticity and expansion). This also makes the coupling to spacetime apparent (through the presence of the covariant derivative).
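
To see this explicitly, consider an inertial observer in flat space, for which \(\dot{u}^a = 0\), \(\sigma _{ab} = \varpi _{ab} = 0\) and \(\theta = 0\), and the projected derivatives reduce to ordinary spatial derivatives. In familiar three-vector notation (and geometric units with \(c=1\), so that \(\mu _0 \epsilon _0 = 1\)), Eqs. (14.17), (14.20), (14.22) and (14.23) become

$$\begin{aligned} \varvec{\nabla } \cdot \varvec{e} = \mu _0 \sigma , \qquad \partial _t \varvec{e} - \varvec{\nabla } \times \varvec{b} + \mu _0 \varvec{J} = 0 , \qquad \varvec{\nabla } \cdot \varvec{b} = 0 , \qquad \partial _t \varvec{b} + \varvec{\nabla } \times \varvec{e} = 0 , \end{aligned}$$

which is the familiar textbook form.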

In the context of astrophysics, most models involve some version of magnetohydrodynamics. In effect, this involves assuming that the local electric field vanishes, or at least that the electric field contribution to (14.20) can be ignored, e.g., via a low velocity argument involving the characteristic length- and time-scales. In the non-relativistic setting this argument is not particularly controversial, although one may take the view that magnetohydrodynamics is more an assumption than an approximation (Schnack 2009).

Effectively, we assume \(e^a\approx 0\), which then implies that \(\sigma \approx 0\), and (14.20) reduces to

$$\begin{aligned} \mu _0 J_a \approx \bar{\epsilon }_{abc}\bar{D}^b b^c . \end{aligned}$$
(14.24)

Once we have a handle on the magnetic field and the fluid flow, we can work out the charge current. This leads to ideal magnetohydrodynamics. An alternative route to (basically) the same conclusions would be to start from a resistive model. The vanishing of the electric field then follows if the medium is assumed to be a perfect conductor, i.e. when the resistivity vanishes (or equivalently, the conductivity becomes infinite). However, this approach requires some version of Ohm’s law, so we will return to this later.

14.3 Variational approach for coupled charged fluids

The description of electromagnetism is, of course, not complete until we consider the coupling to the fluid medium. This is the point where the variational model comes to the fore. As we will now demonstrate, the advantage of having a well-grounded action principle for coupled fluids and an identification of the true momenta is that it is relatively easy to incorporate electromagnetism into the system. To do this, we extend the standard procedure of introducing a (minimal) gauge coupling between the matter and the Faraday field, already discussed in Sect. 4.3. The only difference is that we now consider multiple charge carriers with identifiable fluxes, \(n_{\mathrm {x}}^a\), and individual charges, \(q^{\mathrm {x}}\). The charge current (density) associated with each flow is

$$\begin{aligned} j^a_{\mathrm {x}}= q^{\mathrm {x}}n^a_{\mathrm {x}}, \end{aligned}$$
(14.25)

and the total current, that sources the electromagnetic field, is simply the sum

$$\begin{aligned} j^a = \sum _{\mathrm {x}}j_{\mathrm {x}}^a . \end{aligned}$$
(14.26)

It is worth noting that the variational derivation in Sect. 4.3 requires that the total current is conserved. This constraint is automatically satisfied if each individual current is conserved, as we have assumed. Hence, we simply change the electromagnetic Lagrangian to

$$\begin{aligned} L_\mathrm {EM} = - \frac{1}{4\mu _0} F_{a b} F^{a b} + A_a \sum _{\mathrm {x}}j^a_{\mathrm {x}}, \end{aligned}$$
(14.27)

and the equations that govern the electromagnetic field remain exactly as before. In addition, the gauge coupling leads to a modified fluid momentum

$$\begin{aligned} \bar{\mu }^{\mathrm {x}}_a = \mu ^{\mathrm {x}}_a + q^{\mathrm {x}}A_a , \end{aligned}$$
(14.28)

which satisfies the equations of motion

$$\begin{aligned} n^b_{\mathrm {x}}\bar{\omega }^{\mathrm {x}}_{b a} = 0 , \end{aligned}$$
(14.29)

where

$$\begin{aligned} \bar{\omega }^{\mathrm {x}}_{a b} = 2 \nabla _{[a} \bar{\mu }^{\mathrm {x}}_{b]} . \end{aligned}$$
(14.30)

Finally, the total stress-energy tensor takes the form

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + \sum _{\mathrm {x}}n^a_{\mathrm {x}}\mu ^{\mathrm {x}}_b - {1\over \mu _0} \left[ F^{c a}F_{c b} - {1\over 4} \delta ^a{}_b \left( F_{c d}F^{c d}\right) \right] , \end{aligned}$$
(14.31)

simply representing the sum of the fluid and the electromagnetic contributions.


As an alternative, we may consider writing the momentum equation (14.29) as a force-balance relation. Moving the electromagnetic contribution to the right-hand side, we get

$$\begin{aligned} n^b_{\mathrm {x}}\omega ^{\mathrm {x}}_{b a} = n_{\mathrm {x}}^b q^{\mathrm {x}}F_{a b} = j_{\mathrm {x}}^b F_{a b} \equiv f^{\mathrm {x}}_a . \end{aligned}$$
(14.33)

Making contact with the previous section, we have

$$\begin{aligned} f_\mathrm {L}^a = \sum _{\mathrm {x}}f_{\mathrm {x}}^a . \end{aligned}$$
(14.34)

It is also worth considering the four-current in more detail, working out the current and charge density inferred by the fluid observer from above, moving with four-velocity \(u^a\). We can then express the various fluxes as

$$\begin{aligned} n_{\mathrm {x}}^a = n_{\mathrm {x}}\gamma _{\mathrm {x}}\left( u^a + v_{\mathrm {x}}^a\right) , \end{aligned}$$
(14.35)

where

$$\begin{aligned} \gamma _{\mathrm {x}}= \left( 1 - v_{\mathrm {x}}^2 \right) ^{-1/2} , \quad \text{ and } \quad v^{\mathrm {x}}_a u^a = 0 . \end{aligned}$$
(14.36)

It follows that the charge density \(\sigma \) used in the previous section takes the form;

$$\begin{aligned} \sigma = \sum _{\mathrm {x}}n_{\mathrm {x}}q^{\mathrm {x}}\gamma _{\mathrm {x}}\approx \sum _{\mathrm {x}}n_{\mathrm {x}}q^{\mathrm {x}}\end{aligned}$$
(14.37)

in the low-velocity limit. Meanwhile, the spatial components of the current are given by

$$\begin{aligned} j^i = \sum _{\mathrm {x}}j_{\mathrm {x}}^i = \sum _{\mathrm {x}}n_{\mathrm {x}}q^{\mathrm {x}}\gamma _{\mathrm {x}}v_{\mathrm {x}}^i \approx \sum _{\mathrm {x}}n_{\mathrm {x}}q^{\mathrm {x}}v_{\mathrm {x}}^i = J^i . \end{aligned}$$
(14.38)

For two-fluid systems, our analysis readily reproduces the results for electron-positron plasmas (Koide 2008, 2009; Kandus and Tsagas 2008). Moreover, the charged multi-fluid system can be extended to account for “non-ideal” effects like resistivity and particle reactions (i.e. non-conserved flows). In essence, if we want to account for resistivity, we need to add a phenomenological “force” term to (14.29). This additional term should describe the dissipative interaction between the two components, and the standard intuition (Schnack 2009; Bellan 2006) tells us that it should be linear in the relative velocity between the two components. We then see from (14.29) that the required force must be orthogonal to each respective flux (Andersson 2012) (note that this condition must be relaxed if we want to allow for particle creation/destruction).

Developments in this direction are (particularly) important for realistic neutron-star modelling. The most advanced step in this direction (Andersson et al. 2017b) considers a four-component system composed of neutrons (n), protons (p), electrons (e) and entropy (s). The relative flow of the protons and electrons leads to the charge current that couples the material motion to electromagnetism. The entropy flow is key if we want to account for the redistribution of heat, which we need to track if we want to consider (say) the cooling of a young neutron star. Finally, the neutrons need to be singled out, not just because they make up the bulk of the star but also because, as the star matures, they become superfluid and (at least partially) decouple from the other components. In order to explore the evolution and dynamics of maturing neutron stars, one has to allow for the relative flows of these four components.

14.4 The foliation equations

We have seen how—once we introduce a fluid observer—the relativistic formulation for electromagnetism leads back to the familiar-looking set of Maxwell’s equations. Let us now connect the description with the foliation approach from Sect. 11, as required if we want to carry out nonlinear simulations. For clarity, let us assume that we work with the electric and magnetic fields \(E^a\) and \(B^a\), now measured by an Eulerian observer (defined by the spacetime foliation, as usual). We then have

$$\begin{aligned} F_{ab} = 2 N_{[a} E_{b]} + \epsilon _{abcd}N^c B^d = 2 N_{[a} E_{b]} +\epsilon _{abd} B^d , \end{aligned}$$
(14.39)

where we have introduced

$$\begin{aligned} \epsilon _{abd} = \epsilon _{cabd}N^c . \end{aligned}$$
(14.40)

That is, the electric and magnetic fields measured in the Eulerian frame are

$$\begin{aligned} E_a = - N^b F_{ba} , \end{aligned}$$
(14.41)

and

$$\begin{aligned} B_a = - N^b \left( {1 \over 2} \epsilon _{abcd}F^{cd}\right) = {1\over 2} \epsilon _{acd} F^{cd}. \end{aligned}$$
(14.42)

Both fields are manifestly orthogonal to \(N^a\) so each has three components, as expected.

It is instructive to relate the fields to those associated with the fluid frame. To do this, we first of all need to recall that

$$\begin{aligned} u^a = W(N^a + \hat{v}^a) , \end{aligned}$$
(14.43)

(where it is worth noting that we use hats to indicate fluid quantities observed in the frame associated with \(N^a\), as in Sect. 11), with W the relevant Lorentz factor. This means that we have

$$\begin{aligned} e_a= & {} - u^b F_{ba} = - W(N^b + \hat{v}^b) F_{ba} \nonumber \\= & {} W\left[ E_a +N_a (\hat{v}^b E_b)\right] - W \hat{v}^b \epsilon _{bad} B^d \nonumber \\= & {} W \left[ E_a + N_a (\hat{v}^b E_b) + \epsilon _{abc}\hat{v}^b B^c \right] , \end{aligned}$$
(14.44)

and

$$\begin{aligned} b_a= & {} - u^b \left( {1 \over 2} \epsilon _{abcd} F^{cd}\right) = - W(N^b +\hat{v}^b) \left( {1 \over 2} \epsilon _{abcd} F^{cd}\right) \nonumber \\= & {} W \left[ B_a +N_a ( \hat{v}^b B_b) - \epsilon _{abc} \hat{v}^b E^c \right] . \end{aligned}$$
(14.45)

It is evident from this expression that, in general, the electric field inferred by the local observer has a component parallel to \(N^a\)

$$\begin{aligned} e^\parallel = - e^a N_a = W \left( \hat{v}^b E_b\right) , \end{aligned}$$
(14.46)

as well as an orthogonal piece

$$\begin{aligned} e_a^\perp = W \left( E_a + \epsilon _{abc} \hat{v}^b B^c \right) . \end{aligned}$$
(14.47)

This is important. Let us assume that the observer can be chosen in such a way that the perpendicular component vanishes—the assumption that leads to ideal magnetohydrodynamics. That is, let

$$\begin{aligned} e_a^\perp = 0 \quad \Longrightarrow \quad E_a + \epsilon _{abc} \hat{v}^b B^c= 0 \end{aligned}$$
(14.48)

It is easy to see that this also means that \(e^\parallel =0\): inserting (14.48) into (14.46) we get \(e^\parallel = - W \epsilon _{abc} \hat{v}^a \hat{v}^b B^c = 0\), by antisymmetry. Hence, we actually have \(e^a=0\); the electric field vanishes according to the “fluid” observer. We need to keep this result in mind later.

Turning to the matter equations, rather than working with the divergence of the total stress-energy tensor for the system we can isolate the electromagnetic contribution. The right-hand sides of the matter equations then have additional terms, which follow from the Lorentz force

$$\begin{aligned} f_\mathrm {L}^a = j_b F^{ab} = N^a (\hat{J}^b E_b) + ( \hat{\sigma } E^a + \epsilon ^{abc} \hat{J}_b B_c ) , \end{aligned}$$
(14.49)

where we have used the charge current

$$\begin{aligned} j^a = \hat{\sigma } N^a + \hat{J}^a . \end{aligned}$$
(14.50)

From this we see that we need to add, first of all, a term

$$\begin{aligned} \alpha \gamma ^{1/2} ( \hat{J}^i E_i) , \end{aligned}$$
(14.51)

to the right-hand side of (11.41), representing the electromagnetic contribution to the energy flow and including the Joule heating. Secondly, we need a term

$$\begin{aligned} \alpha \gamma ^{1/2} ( \hat{\sigma } E^i + \epsilon ^{ijk} \hat{J}_j B_k ) , \end{aligned}$$
(14.52)

on the right-hand side of (11.45), representing the (spatial) Lorentz force.

Finally, we need to add the foliation version of Maxwell’s equations to the evolution system. First of all, Eq. (14.1) leads to

$$\begin{aligned} \gamma ^{ab} \nabla _b E_a = \mu _0 \hat{\sigma } + \epsilon ^{abc}\left( \nabla _a N_b\right) B_c , \end{aligned}$$
(14.53)

or

$$\begin{aligned} \gamma ^{b}_a\nabla _b E^a - \mu _0 \hat{\sigma } = - \epsilon ^{abc}K_{ab} B_c = 0 , \end{aligned}$$
(14.54)

since \(K_{ab}\) is symmetric. That is, using the projected derivative \(D_a\) from Sect. 11 (not to be confused with \(\bar{D}_a\) from above), we have

$$\begin{aligned} D_i E^i = \mu _0 \hat{\sigma } . \end{aligned}$$
(14.55)

We also get

$$\begin{aligned}&\gamma _{ab} N^c\nabla _c E^b - \epsilon _{abc}\nabla ^b B^c + \mu _0 \hat{J}_a \nonumber \\&\quad = E^b \nabla _b N_a - E_a \nabla _b N^b + \epsilon _{abc}( N^d \nabla _d {N}^b) B^c \nonumber \\&\quad = - E^b K_{ba} + E_a K + \epsilon _{abc}( N^d \nabla _d {N}^b) B^c , \end{aligned}$$
(14.56)

and we end up with

$$\begin{aligned} \left( \partial _t - \mathcal {L}_\beta \right) E^i - \epsilon ^{ijk} D_j (\alpha B_k) + \alpha \mu _0 J^i = \alpha K E^i . \end{aligned}$$
(14.57)

The second pair of Maxwell equations follow from Eq. (14.2), which leads to

$$\begin{aligned} \gamma ^{ab}\nabla _b B_a = - \epsilon ^{abc} E_a \nabla _b N_c , \end{aligned}$$
(14.58)

or

$$\begin{aligned} \gamma ^{b}_a\nabla _b B^a = \epsilon ^{abc} E_a K_{bc} = 0 , \end{aligned}$$
(14.59)

so we have

$$\begin{aligned} D_i B^i = 0 . \end{aligned}$$
(14.60)

Finally,

$$\begin{aligned}&\gamma _{ab}N^c\nabla _c {B}^b + \epsilon _{abc}\nabla ^b E^c \nonumber \\&\quad = - \epsilon _{abc}(N^d \nabla _d N^b) E^c + B^b \nabla _b N_a - B_a \nabla _b N^b \nonumber \\&\quad = - \epsilon _{abc}(N^d \nabla _d N^b) E^c - B^b K_{ba} + B_a K , \end{aligned}$$
(14.61)

leads to

$$\begin{aligned} \left( \partial _t - \mathcal {L}_\beta \right) B^i + \epsilon ^{ijk} D_j (\alpha E_k) = \alpha K B^i . \end{aligned}$$
(14.62)

The four Maxwell equations can be written in different forms, depending on what is convenient. For example, in order to formulate a system suitable for numerical simulations it may be necessary to replace the covariant derivatives with partials, making the connection coefficients explicit (Dionysopoulou et al. 2013; Andersson et al. 2017c). However, such a reformulation does not add (much) to our understanding so we will settle for the equations in the present form.

14.5 Electron dynamics and Ohm’s law

So far we have not explored the multi-fluid aspects of the problem. These inevitably enter if we try to add features like resistivity. Then we have to consider the “friction” between the separate flows. From the multi-fluid point of view, we need to keep track of additional number densities. When these fluxes are conserved, we have

$$\begin{aligned} \nabla _a n_{\mathrm {x}}^a = 0 \ \Longrightarrow \ \left( \partial _t - \mathcal {L}_\beta \right) \left( \gamma ^{1/2} \hat{n}_{\mathrm {x}}\right) + D_i \left[ \gamma ^{1/2} \hat{n}_{\mathrm {x}}\left( \alpha \hat{v}_{\mathrm {x}}^i - \beta ^i \right) \right] = 0. \end{aligned}$$
(14.63)

It is fairly straightforward (if a bit messy) to write down the complete set of charged multi-fluid equations, representing a generic plasma setting. However, if we want to arrive at a set of equations representing “magnetohydrodynamics” we need to reduce the problem to (effectively) a single fluid degree of freedom. A natural step in this direction involves assuming that the relative flow between the different components in the system is modest enough that it can be represented as a linear drift. The idea is simple. Take the fluid frame (represented by \(u^a\)) to be associated with the baryons and let another component flow relative to it (with four velocity \(u_{\mathrm {x}}^a\)). In general, we then have

$$\begin{aligned} u_{\mathrm {x}}^a = \gamma _{\mathrm {x}}\left( u^a + v_{\mathrm {x}}^a \right) , \qquad u_a v_{\mathrm {x}}^a = 0 , \end{aligned}$$
(14.64)

where (as usual)

$$\begin{aligned} \gamma _{\mathrm {x}}= \left( 1 - v_{\mathrm {x}}^2 \right) ^{-1/2} . \end{aligned}$$
(14.65)

At this level—for each component that exhibits a relative flow (\(v_{\mathrm {x}}^a\ne 0\))—we need to keep track of the individual Lorentz factor (relative to the chosen observer), \(\gamma _{\mathrm {x}}\). To avoid this, we assume that the relative drift is slow enough that we can linearize the relations. In effect, we assume that \(\gamma _{\mathrm {x}}\approx 1\). This is an essential part of the “single fluid reduction” as we no longer need to keep track of the individual Lorentz factors. Moreover, it helps make contact with the thermodynamics and the equation of state.

To illustrate this point, note that the fluid observer measures each chemical potential as (introducing tildes to avoid confusion with the discussion in Sect. 2)

$$\begin{aligned} \tilde{\mu }_{\mathrm {x}}= - u^a \mu ^{\mathrm {x}}_a . \end{aligned}$$
(14.66)

If we ignore entrainment, then

$$\begin{aligned} \mu ^{\mathrm {x}}_a = \mu _{\mathrm {x}}u^{\mathrm {x}}_a \end{aligned}$$
(14.67)

and it follows that

$$\begin{aligned} \tilde{\mu }_{\mathrm {x}}= - \mu _{\mathrm {x}}( u^a u^{\mathrm {x}}_a ) . \end{aligned}$$
(14.68)

Within the linear drift model, where \(-u^a u^{\mathrm {x}}_a = \gamma _{\mathrm {x}}\approx 1\), it follows immediately that \(\tilde{\mu }_{\mathrm {x}}\approx \mu _{\mathrm {x}}\). Similarly, if we define the measured number density as

$$\begin{aligned} \tilde{n}_{\mathrm {x}}= - u_a n_{\mathrm {x}}^a , \end{aligned}$$
(14.69)

then we also have \(\tilde{n}_{\mathrm {x}}\approx n_{\mathrm {x}}\). In essence, different fluid observers agree on both number densities and chemical potentials (Andersson et al. 2017b). This is crucial as it means that there is no ambiguity in the concept of chemical equilibrium. For the outer core of a neutron star (for example) we need to consider the Urca reactions, so chemical equilibrium corresponds to

$$\begin{aligned} \beta = \mu _\mathrm {n}- \mu _\mathrm {p}- \mu _\mathrm{e}= - u^a \left( \mu ^\mathrm {n}_a - \mu ^\mathrm {p}_a - \mu ^\mathrm{e}_a \right) = 0 . \end{aligned}$$
(14.70)

As long as this condition is satisfied, we can consistently ignore reactions and assume that the different particle species are conserved. The situation would be much less clear if we allowed for a nonlinear drift. Different observers would measure different number densities/chemical potentials and determining the frame with which one should associate the thermodynamics becomes an issue.

Assuming that the linear drift argument holds on the evolution scale (as we have to in order to arrive at an effective one-fluid description) and translating to the point of view of an Eulerian observer, it makes sense to assume that the difference between the two (three-) velocities \(\hat{v}_{\mathrm {x}}^a\) and \(\hat{v}^a\) is small, as well. Linearizing in the Eulerian velocity difference, we then have

$$\begin{aligned} W_{\mathrm {x}}= (1- \hat{v}_{\mathrm {x}}^2)^{-1/2} \approx W \left[ 1 + W^2 \hat{v}_a (\hat{v}_{\mathrm {x}}^a - \hat{v}^a ) \right] . \end{aligned}$$
(14.71)

Combining this with

$$\begin{aligned} u_{\mathrm {x}}^a = W_{\mathrm {x}}\left( N^a + \hat{v}_{\mathrm {x}}^a \right) \approx W \left( N^a + \hat{v}^a \right) +v_{\mathrm {x}}^a , \end{aligned}$$
(14.72)

we find that

$$\begin{aligned} v_{\mathrm {x}}^a \approx W\left[ \delta ^a_b + W^2 \hat{v}_b ( N^a +\hat{v}^a)\right] (\hat{v}_{\mathrm {x}}^b - \hat{v}^b ) , \end{aligned}$$
(14.73)

This shows that the linearization argument is consistent.

In the present case, where the focus is on charged flows of electrons and protons (say), we now have

$$\begin{aligned} \hat{\sigma }= & {} e (\hat{n}_\mathrm {p}- \hat{n}_\mathrm{e}) = e ( W_\mathrm {p}n_\mathrm {p}- W_\mathrm{e}n_\mathrm{e}) \nonumber \\= & {} e W \left[ \left( n_\mathrm {p}- n_\mathrm{e}\right) - W^2 n_\mathrm{e}\hat{v}_a (\hat{v}_\mathrm{e}^a - \hat{v}^a ) \right] , \end{aligned}$$
(14.74)

and

$$\begin{aligned} \hat{J}^a= & {} e ( \hat{n}_\mathrm {p}\hat{v}^a - \hat{n}_\mathrm{e}\hat{v}_\mathrm{e}^a) = e W \left[ n_\mathrm {p}\hat{v}^a - n_\mathrm{e}\hat{v}_\mathrm{e}^a - W^2 n_\mathrm{e}\hat{v}_b (\hat{v}_\mathrm{e}^b - \hat{v}^b ) \hat{v}_\mathrm{e}^a \right] \nonumber \\= & {} eW (n_\mathrm {p}- n_\mathrm{e}) \hat{v}^ a - eW n_\mathrm{e}(\hat{v}_\mathrm{e}^a-\hat{v}^a ) - eW^3 n_\mathrm{e}\hat{v}_b (\hat{v}_\mathrm{e}^b - \hat{v}^b) (\hat{v}_\mathrm{e}^a - \hat{v}^a + \hat{v}^a) \nonumber \\\approx & {} \hat{\sigma } \hat{v}^ a - eW n_\mathrm{e}(\hat{v}_\mathrm{e}^a-\hat{v}^a ) . \end{aligned}$$
(14.75)

That is,

$$\begin{aligned} \hat{v}_\mathrm{e}^a-\hat{v}^a \approx { 1\over e W n_\mathrm{e}} \left[ \hat{\sigma } \hat{v}^ a - \hat{J}^a \right] , \end{aligned}$$
(14.76)

where we have used the fact that the linear drift assumption leads to

$$\begin{aligned} \hat{n}_\mathrm{e}= n_\mathrm{e}W_\mathrm{e}\approx n_\mathrm{e}W \left[ 1 + W^2 \hat{v}_a \left( \hat{v}_\mathrm{e}^a-\hat{v}^a \right) \right] \approx n_\mathrm{e}W \left[ 1 - {\hat{\sigma } \over e n_\mathrm{e}} \right] . \end{aligned}$$
(14.77)
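
Collecting these relations in a small Python sketch (flat-space three-vectors, \(e\) set to unity, purely illustrative), the linearized charge density and current of (14.74) and (14.75) read:

```python
import numpy as np

def charge_and_current(n_p, n_e, v, v_e, W, e=1.0):
    """Charge density and current to linear order in the drift,
    following (14.74) and (14.75); v and v_e are Eulerian 3-velocities."""
    dv = v_e - v                                      # small relative drift
    sigma = e * W * ((n_p - n_e) - W**2 * n_e * np.dot(v, dv))
    J = sigma * v - e * W * n_e * dv                  # keeping only linear terms in dv
    return sigma, J
```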

The momentum equation for a general component is

$$\begin{aligned}&\left[ \partial _t + (\alpha \hat{v}_{\mathrm {x}}^j - \beta ^j ) D_j \right] S^{\mathrm {x}}_i + S^{\mathrm {x}}_j D_i \left( \alpha \hat{v}_{\mathrm {x}}^j - \beta ^j \right) \nonumber \\&\quad + D_i \left[ \alpha \left( \hat{\mu }_{\mathrm {x}}- \hat{v}_{\mathrm {x}}^j S^{\mathrm {x}}_j\right) \right] = {\alpha \over \hat{n}_{\mathrm {x}}} \mathcal {F}^{\mathrm {x}}_i , \end{aligned}$$
(14.78)

where

$$\begin{aligned} \mathcal {F}^{\mathrm {x}}_i = e_{\mathrm {x}}\hat{n}_{\mathrm {x}}\left( E_i + \epsilon _{ijk} \hat{v}_{\mathrm {x}}^j B^k \right) + \gamma ^a_i R^{\mathrm {x}}_a , \end{aligned}$$
(14.79)

with the last term representing resistivity (implementing the model outlined by Andersson et al. 2017a).

Noting that (in the absence of entrainment) we have

$$\begin{aligned} S_{\mathrm {x}}^i = \hat{\mu }_{\mathrm {x}}v_{\mathrm {x}}^i , \end{aligned}$$
(14.80)

and recalling (11.31)—that the fluid velocity is \(V_{\mathrm {x}}^i = \alpha \hat{v}_{\mathrm {x}}^i - \beta ^i\)—we see that (14.78) can be concisely written;

$$\begin{aligned} \left( \partial _t + \mathcal {L}_{V_{\mathrm {x}}}\right) S^{\mathrm {x}}_i + D_i \left( {\alpha \hat{\mu }_{\mathrm {x}}\over W_{\mathrm {x}}^2} \right) = {\alpha \over \hat{n}_{\mathrm {x}}} \mathcal {F}^{\mathrm {x}}_i , \end{aligned}$$
(14.81)

noting that the result relies on the linear drift assumption. In essence, we keep only linear terms in velocity differences in a frame determined by the global time coordinate. This means that

$$\begin{aligned} V_{\mathrm {x}}^a = V^a + \alpha ( \hat{v}_{\mathrm {x}}^a - \hat{v}^a ) . \end{aligned}$$
(14.82)

In the particular case of the electrons we then have

$$\begin{aligned}&\left[ \partial _t + (\alpha \hat{v}_\mathrm{e}^j - \beta ^j ) D_j \right] S^\mathrm{e}_i + S^\mathrm{e}_j D_i \left( \alpha \hat{v}_\mathrm{e}^j - \beta ^j \right) \nonumber \\&\quad + D_i \left[ \alpha \left( \hat{\mu }_\mathrm{e}- \hat{v}_\mathrm{e}^j S^\mathrm{e}_j\right) \right] = {\alpha \over \hat{n}_\mathrm{e}} \mathcal {F}^\mathrm{e}_i , \end{aligned}$$
(14.83)

where

$$\begin{aligned} S_\mathrm{e}^i = \hat{\mu }_\mathrm{e}\hat{v}_\mathrm{e}^i = \mu _\mathrm{e}W_\mathrm{e}\left[ \hat{v}^i + {1\over e n_\mathrm{e}W} \left( \hat{\sigma } \hat{v}^i - \hat{J}^i \right) \right] . \end{aligned}$$
(14.84)

Finally, we need an expression for the resistivity. From Andersson et al. (2017a, 2017b, 2017c) we have the general result (neglecting nuclear reactions, as we have assumed that the fluid remains in chemical equilibrium)

$$\begin{aligned} \gamma ^a_c R^{\mathrm {x}}_a = \gamma ^a_c \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \mathcal {R}^{{\mathrm {x}}{\mathrm {y}}} \left( \delta ^b_a + v_{\mathrm {x}}^b u_a \right) w^{{\mathrm {y}}{\mathrm {x}}}_b , \end{aligned}$$
(14.85)

where the velocities are with respect to the fluid. In the linear drift model, these are related to the Eulerian velocities through (14.73). Thus, we arrive at

$$\begin{aligned} \gamma ^a_c R^{\mathrm {x}}_a = \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \mathcal {R}^{{\mathrm {x}}{\mathrm {y}}} W \left( \delta ^b_a + W^2 \hat{v}^b \hat{v}_a \right) \left( \hat{v}^{\mathrm {y}}_b - \hat{v}^{\mathrm {x}}_b\right) . \end{aligned}$$
(14.86)

In the two-component case we are considering, this reduces to the intuitive relation

$$\begin{aligned} \gamma ^b_a R^ \mathrm {e}_b = \mathcal {R} W \left( \delta ^b_a +W^2 \hat{v}^b \hat{v}_a \right) \left( \hat{v}_b - \hat{v}^\mathrm{e}_b\right) = { \mathcal {R} \over e n_\mathrm{e}} \hat{J}_a \end{aligned}$$
(14.87)

It is worth noting that there are no \(\hat{\sigma }\) terms in the final expression.

Resistivity is usually implemented at the level of some version of Ohm’s law, typically viewed as a closure condition added to the magnetohydrodynamics relation (14.48). In the multi-fluid model, the required relation follows from the electron momentum equation (Andersson et al. 2017c). As a first step, let us assume that we can ignore the electron inertia. Then it follows from (14.81) that

$$\begin{aligned} \mathcal {F}^\mathrm{e}_i \approx - e n_\mathrm{e}W_\mathrm{e}\left( E_i + \epsilon _{ijk} \hat{v}_\mathrm{e}^j B^k \right) + { \mathcal {R} \over e n_\mathrm{e}} \hat{J}_i \approx { n_\mathrm{e}W_\mathrm{e}\over \alpha } D_i \left( {\alpha \mu _\mathrm{e}\over W_\mathrm{e}} \right) \end{aligned}$$
(14.88)

That is, we have

$$\begin{aligned} E_i + \epsilon _{ijk} \hat{v}_\mathrm{e}^j B^k + { 1 \over \alpha } D_i \left( {\alpha \mu _\mathrm{e}\over W_\mathrm{e}} \right) = { \mathcal {R} \over e n_\mathrm{e}^2 W_\mathrm{e}} \hat{J}_i \approx { \mathcal {R} \over e n_\mathrm{e}^2 W} \hat{J}_i \equiv \eta \hat{J}_i \end{aligned}$$
(14.89)

which defines the scalar resistivity coefficient \(\eta \). It is reassuring to note that (14.89) is consistent with the text-book result for non-relativistic two-fluid systems, e.g., Eq. (2.75) in Bellan (2006) or Mestel (1999), once we set \(\alpha = W_\mathrm{e}= W \rightarrow 1\) at the same time as we assume that \(\hat{\sigma } \rightarrow 0\).

Ignoring the chemical gradient term, we have

$$\begin{aligned} E_i + \epsilon _{ijk} \hat{v}^j B^k + {1\over e n_\mathrm{e}W} \epsilon _{ijk} \left( \hat{\sigma } \hat{v}^j - \hat{J}^j\right) B^k = \eta \hat{J}_i . \end{aligned}$$
(14.90)

Also neglecting (without particular justification at this point) the Hall term, we are left with

$$\begin{aligned} E_i + \epsilon _{ijk} \hat{v}^j B^k = \eta \hat{J}_i . \end{aligned}$$
(14.91)

Through a hierarchy of approximations and simplifications we have moved from a model that retains the properties of a charged two-component plasma to a simple expression for Ohm’s law.
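
To make the hierarchy concrete, the following Python sketch (flat-space three-vectors and orthonormal components; the function names and numbers are purely illustrative) evaluates the electric field from the two-fluid relation (14.90), from the resistive form (14.91) and from the ideal closure (14.48):

```python
import numpy as np

def e_field_two_fluid(v, B, J, sigma, n_e, W, eta, e=1.0):
    """Electric field from (14.90), rearranged as
    E = eta J - v x B - (sigma v - J) x B / (e n_e W)."""
    corr = -np.cross(sigma * v - J, B) / (e * n_e * W)   # Hall term plus charge-density correction
    return eta * J - np.cross(v, B) + corr

def e_field_resistive(v, B, J, eta):
    """Keep only the resistive correction, Eq. (14.91): E = eta J - v x B."""
    return eta * J - np.cross(v, B)

def e_field_ideal(v, B):
    """Ideal magnetohydrodynamics, Eq. (14.48): E = -v x B."""
    return -np.cross(v, B)

# illustrative numbers only
v = np.array([0.01, 0.0, 0.0])
B = np.array([0.0, 0.0, 1.0])
J = np.array([0.0, 0.1, 0.0])
print(e_field_two_fluid(v, B, J, sigma=0.0, n_e=1.0, W=1.0, eta=0.05))
print(e_field_resistive(v, B, J, eta=0.05), e_field_ideal(v, B))
```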

The sequence of arguments leading to (14.91) provides insight into the applicability of “ideal” magnetohydrodynamics, which corresponds to the assumption that the local electric field vanishes

$$\begin{aligned} e^a\approx 0 \quad \Longrightarrow \quad E_i + \epsilon _{ijk} \hat{v}^j B^k =0 . \end{aligned}$$
(14.92)

The usual argument for this is that the medium is a perfect conductor, i.e. \(\mathcal{R}\rightarrow 0\) (\(\eta \rightarrow 0\)). However, this limit only affects the resistive term in (14.89). We still have to argue that the remaining terms are unimportant. This is less straightforward.

It is instructive to compare the final result to the standard argument from the literature (Bekenstein and Oron 1978; Watanabe and Yokoyama 2006; Palenzuela et al. 2009; Takamoto and Inoue 2011), which starts from magnetohydrodynamics and arrives at Ohm’s law by taking the current to be proportional to the Lorentz force acting on a particle in the fluid frame. Assuming

$$\begin{aligned} \perp _a^b j_b = \bar{\eta } F_{ab} u^b , \end{aligned}$$
(14.93)

and recalling that

$$\begin{aligned} u^a = W (N^a + \hat{v}^a) , \end{aligned}$$
(14.94)

we have

$$\begin{aligned} j_a= & {} \hat{\sigma } N_a + \hat{J}_a = \bar{\eta } W (N^b +\hat{v}^b) \left( N_{a} E_{b} - N_b E_a + \epsilon _{abc} B^c \right) \nonumber \\= & {} \bar{\eta } W \left[ N_{a} (\hat{v}^b E_b) + E_a + \epsilon _{abc} \hat{v}^b B^c \right] . \end{aligned}$$
(14.95)

Project along \(N^a\) to get

$$\begin{aligned} \hat{\sigma } + W^2 (\hat{v}_i\hat{J}^i - \hat{\sigma }) = \bar{\eta } W (\hat{v}^i E_i) , \end{aligned}$$
(14.96)

while the orthogonal projection leads to

$$\begin{aligned} \hat{J}_a - W^2 \hat{v}_a (\hat{\sigma } - \hat{v}_i\hat{J}^i) = \bar{\eta } W \left( E_a + \epsilon _{abc} \hat{v}^b B^c \right) . \end{aligned}$$
(14.97)

It follows that

$$\begin{aligned} \hat{v}^i \hat{J}_i - W^2 \hat{v}^2 (\hat{\sigma } - \hat{v}_i\hat{J}^i)= \bar{\eta } W (\hat{v}^i E_i) , \end{aligned}$$
(14.98)

and we finally arrive at

$$\begin{aligned} E_i + \epsilon _{ijk} \hat{v}^j B^k = {1\over \bar{\eta } W} \left[ \hat{J}_i - W^2 (\hat{\sigma } - \hat{v}_l\hat{J}^l) \hat{v}_i \right] . \end{aligned}$$
(14.99)

This version of Ohm’s law—notably identical to (14.91) once we identify \(\eta = 1/\bar{\eta } W\)—has been implemented in recent numerical simulations, see for example Eq. (22) in Palenzuela et al. (2009). The comparison provides a nice “sanity check” of the logic, but the multi-fluid derivation clearly provides a better understanding of the physics. Moreover, it allows us to extend the model to account for additional aspects (should we want to do so). In fact, if we were to retain the time variation of the charge current we would add in most of the relevant plasma features (the only restriction being that we assumed a linear drift fairly early on in the developments).

14.6 Tetrad formulation

The general formalism we have outlined is fully nonlinear and includes the coupling to the dynamical spacetime. In essence, it is geared towards numerical simulations of violent phenomena in full General Relativity. However, there are relevant problems where the dynamical role of spacetime is less crucial (or, perhaps, not at all relevant). A typical such problem would be the slow evolution of the magnetic field in a neutron star interior (Viganò et al. 2013). Assuming that we may take the spacetime as fixed, it can be useful to make the curved spacetime problem look “as close to flat” as possible. This typically involves using tetrads. As relevant parts of the literature draw on this strategy, it is useful to introduce the main ideas and steps here. We do this by adapting our magnetic field results to a fixed, slowly rotating spacetime. That is, we make contact with the Hartle-Thorne slow-rotation expansion (Hartle and Thorne 1968), keeping only first order terms in the rotation, for simplicity. The metric is then given by

$$\begin{aligned} ds^2 = - e^{2\nu } dt^2 - 2 \omega r^2 \sin ^2 \theta d\phi dt + e^{2\lambda } dr^2 + r^2 d\theta ^2 + r^2 \sin ^2\theta d\phi ^2 , \end{aligned}$$
(14.100)

where the rotational frame-dragging \(\omega \) is a solution to

$$\begin{aligned} {1\over r^3} {d\over dr} \left[ r^4 e^{-(\nu +\lambda )} {d\bar{\omega } \over dr}\right] + 4 {d\over dr}\left[ e^{-(\nu +\lambda )}\right] \bar{\omega } = 0 , \end{aligned}$$
(14.101)

with

$$\begin{aligned} \bar{\omega } = \varOmega - \omega . \end{aligned}$$
(14.102)

The solution external to a uniformly rotating body is

$$\begin{aligned} \bar{\omega }_\mathrm {ext} = \varOmega - {2J \over r^3} , \end{aligned}$$
(14.103)

where \(\varOmega \) is the rotation frequency of the star (as viewed by an asymptotic observer) and J is the angular momentum.
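
It is straightforward to check the exterior solution. Outside the star we have \(\nu + \lambda = 0\) (as for the Schwarzschild solution), so \(e^{-(\nu + \lambda )} = 1\) and the second term in (14.101) drops out. A small sympy sketch (for illustration only) confirms that (14.103) then solves the equation:

```python
import sympy as sp

r, J, Omega = sp.symbols('r J Omega', positive=True)

# Exterior: nu + lambda = 0, so exp(-(nu + lambda)) = 1 and its radial derivative
# vanishes; Eq. (14.101) reduces to (1/r^3) d/dr[ r^4 d(omega_bar)/dr ] = 0
omega_bar = Omega - 2 * J / r**3          # the exterior solution (14.103)
lhs = sp.diff(r**4 * sp.diff(omega_bar, r), r) / r**3
print(sp.simplify(lhs))                   # -> 0
```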

Comparing the slow-rotation line element to the 3+1 form from Eq. (11.5) we identify the lapse

$$\begin{aligned} \alpha = e^\nu , \end{aligned}$$
(14.104)

the shift vector

$$\begin{aligned} \beta ^i = -\omega \delta ^i_\phi , \end{aligned}$$
(14.105)

and the spatial metric

$$\begin{aligned} \gamma _{ij} = \left( \begin{array}{ccc} e^{2\lambda } &{} 0 &{} 0 \\ 0 &{} r^2 &{} 0 \\ 0 &{} 0 &{} r^2 \sin ^2\theta \end{array}\right) . \end{aligned}$$
(14.106)
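
These identifications follow directly: writing the 3+1 line element in its standard form,

$$\begin{aligned} ds^2 = - \left( \alpha ^2 - \beta _i \beta ^i \right) dt^2 + 2 \beta _i \, dt \, dx^i + \gamma _{ij} dx^i dx^j , \end{aligned}$$

the cross term gives \(\beta _\phi = \gamma _{\phi \phi } \beta ^\phi = - \omega r^2 \sin ^2 \theta \), so \(\beta ^\phi = - \omega \), while the \(dt^2\) term gives \(\alpha ^2 = e^{2\nu } + \omega ^2 r^2 \sin ^2 \theta \approx e^{2\nu }\), consistent with keeping only first-order terms in the rotation.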

The fact that \(\gamma _{ij}\) is diagonal simplifies much of the following discussion. We also see that

$$\begin{aligned} \gamma ^{1/2} = e^\lambda r^2 \sin \theta . \end{aligned}$$
(14.107)

Next, it is worth noting that

$$\begin{aligned} \alpha K = - \partial _t \ln \gamma ^{1/2} + D_i \beta ^i = 0 , \end{aligned}$$
(14.108)

since the spacetime is stationary and axisymmetric. We also have

$$\begin{aligned} \mathcal {L}_\beta \gamma ^{1/2} = \partial _i \left( \gamma ^{1/2} \beta ^i\right) = 0 , \end{aligned}$$
(14.109)

since the spacetime is axisymmetric. This means that

$$\begin{aligned} \left( \partial _t -\mathcal {L}_\beta \right) \gamma ^{1/2} = 0 , \end{aligned}$$
(14.110)

a result which will be used in the following.

Up to this point, we have expressed all tensor relations in terms of components in a given coordinate basis. However, when the focus is on measurements carried out by a given observer it may be helpful to work in a local inertial frame, using an orthonormal basis associated with a local tetrad (this is the ZAMO frame introduced by Bardeen et al. 1972; see also Thorne and MacDonald 1982). This means that we (first of all) translate the equations into an orthonormal tetrad—changing the basis in such a way that the metric appears flat. A simple way to do this is to rewrite the line element in terms of a new basis such that (using hats to denote quantities in the new orthonormal basis)

$$\begin{aligned} ds^2 = \eta _{\hat{a} \hat{b}} dx^{\hat{a}} dx^{\hat{b}} = \eta _{\hat{a} \hat{b}} \omega ^{\hat{a}}_c \omega ^{\hat{b}}_d dx^c dx^d , \end{aligned}$$
(14.111)

where \(\eta _{\hat{a} \hat{b}} = \mathrm {diag}(-1,1,1,1)\). Comparing to the slow-rotation metric, we see that we have

$$\begin{aligned} \omega ^{\hat{0}}_a= & {} e^{\nu } ( 1,0,0,0), \end{aligned}$$
(14.112)
$$\begin{aligned} \omega ^{\hat{1}}_a= & {} e^{\lambda } ( 0,1,0,0) , \end{aligned}$$
(14.113)
$$\begin{aligned} \omega ^{\hat{2}}_a= & {} r( 0,0,1,0) , \end{aligned}$$
(14.114)
$$\begin{aligned} \omega ^{\hat{3}}_a= & {} r\sin \theta ( -\omega ,0,0,1) . \end{aligned}$$
(14.115)

If we define the inverse through

$$\begin{aligned} e _{\hat{c}}^a \omega ^{\hat{c}}_b = \delta ^a_b , \end{aligned}$$
(14.116)

it also follows that

$$\begin{aligned} e _{\hat{0}}^a= & {} e^{-\nu } ( 1,0,0,\omega ) , \end{aligned}$$
(14.117)
$$\begin{aligned} e _{\hat{1}}^a= & {} e^{-\lambda } ( 0,1,0,0) , \end{aligned}$$
(14.118)
$$\begin{aligned} e _{\hat{2}}^a= & {} {1\over r} ( 0,0,1,0) , \end{aligned}$$
(14.119)
$$\begin{aligned} e _{\hat{3}}^a= & {} {1\over r\sin \theta } ( 0,0,0,1) . \end{aligned}$$
(14.120)

The \(e_{\hat{a}}^b\) are usually referred to as the tetrad components.
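
A quick symbolic check of the duality relation (14.116), and of the fact that the one-forms reproduce the slow-rotation metric up to the neglected \(\mathcal {O}(\omega ^2)\) term in \(g_{tt}\), might look as follows (a sympy sketch, for illustration only):

```python
import sympy as sp

r, th, nu, lam, om = sp.symbols('r theta nu lambda omega', positive=True)
eN, eL, s = sp.exp(nu), sp.exp(lam), sp.sin(th)

# One-forms omega^{hat a}_b, Eqs. (14.112)-(14.115); rows carry the hatted index
w = sp.Matrix([[eN,       0,  0, 0],
               [0,        eL, 0, 0],
               [0,        0,  r, 0],
               [-om*r*s,  0,  0, r*s]])

# Tetrad vectors e_{hat a}^b, Eqs. (14.117)-(14.120)
e = sp.Matrix([[1/eN, 0,    0,   om/eN],
               [0,    1/eL, 0,   0],
               [0,    0,    1/r, 0],
               [0,    0,    0,   1/(r*s)]])

# Duality (14.116): e_{hat c}^a omega^{hat c}_b = delta^a_b
print((e.T * w).applyfunc(sp.simplify))      # identity matrix

# Metric reconstruction: eta_{hat a hat b} omega^{hat a}_c omega^{hat b}_d
eta = sp.diag(-1, 1, 1, 1)
g = w.T * eta * w
print(sp.simplify(g[0, 3]))                  # -> -omega*r**2*sin(theta)**2
print(sp.simplify(g[0, 0] + eN**2))          # ->  omega**2*r**2*sin(theta)**2 (second order)
```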

We now have the tools we need to transform quantities from the coordinate basis to the orthonormal one. For instance;

$$\begin{aligned} B_{\hat{a}} = e_{\hat{a}}^b B_b , \end{aligned}$$
(14.121)

and

$$\begin{aligned} B^{\hat{a}} = \omega ^{\hat{a}}_b B^b . \end{aligned}$$
(14.122)

An advantage of working in the orthonormal tetrad is that we can exchange co- and contravariant quantities without “penalty” (as the associated three-metric is flat). A disadvantage is that we have to be careful with derivatives. Before we consider this issue, let us provide an example of why it is natural to work with the tetrad components of the various spatial objects. Let us take the Faraday tensor as an example. First of all, according to an observer rotating with \(\omega \) we have the coordinate basis result (see, for instance, Rezzolla et al. 2001)

$$\begin{aligned} F_{ab} = \left( \begin{array}{cccc} 0 &{} - e^\nu E_r - \omega e^\lambda r^2 \sin \theta B^\theta &{} - e^\nu E_\theta + \omega e^\lambda r^2 \sin \theta B^r &{} - e^\nu E_\phi \\ e^\nu E_r + \omega e^\lambda r^2 \sin \theta B^\theta &{} 0 &{} e^\lambda r^2 \sin \theta B^\phi &{} - e^\lambda r^2 \sin \theta B^\theta \\ e^\nu E_\theta - \omega e^\lambda r^2 \sin \theta B^r &{} - e^\lambda r^2 \sin \theta B^\phi &{} 0 &{} e^\lambda r^2 \sin \theta B^r \\ e^\nu E_\phi &{} e^\lambda r^2 \sin \theta B^\theta &{} - e^\lambda r^2 \sin \theta B^r &{} 0 \end{array}\right) . \end{aligned}$$
(14.123)

Projecting the tensor onto the tetrad, and expressing the result in terms of the tetrad components of the fields, we get

$$\begin{aligned} F_{\hat{c} \hat{d}} = e_{\hat{c}}^a e _{\hat{d}}^b F_{ab} = \left( \begin{array}{cccc} 0 &{} - E^{\hat{r}} &{} - E^{\hat{\theta }} &{} - E^{\hat{\phi }} \\ E^{\hat{r}} &{} 0 &{} B^{\hat{\phi }} &{} - B^{\hat{\theta }} \\ E^{\hat{\theta }} &{} - B^{\hat{\phi }}&{} 0 &{} B^{\hat{r}} \\ E^{\hat{\phi }} &{} B^{\hat{\theta }}&{} - B^{\hat{r}} &{} 0 \end{array}\right) . \end{aligned}$$
(14.124)

We recognize this as the usual flat-space form of the Faraday tensor, emphasizing that this is the natural description for a local observer.

As we move on to explore dynamics, we have to consider derivatives. For scalar quantities, this is relatively straightforward. For example, from (14.117) we see that

$$\begin{aligned} \mathbf {e}_{\hat{0}} = \partial _\tau = e_{\hat{0}}^a \mathbf {e}_a = e^{-\nu } \left( \mathbf {e}_t + \omega \mathbf {e}_\phi \right) = e^{-\nu } \left( \partial _t + \omega \partial _\phi \right) , \end{aligned}$$
(14.125)

allows us to introduce a natural time-derivative associated with the rotating frame. In fact, for a scalar n, we have

$$\begin{aligned} (\partial _t - \mathcal {L}_\beta ) (\gamma ^{1/2} n)= & {} \gamma ^{1/2} (\partial _t - \mathcal {L}_\beta ) n \nonumber \\= & {} \gamma ^{1/2} (\partial _t - \beta ^j \nabla _j ) n = \gamma ^{1/2} (\partial _t - \beta ^j \partial _j ) n \nonumber \\ \!= & {} \! \gamma ^{1/2} (\partial _t \!+\! \omega \partial _\phi ) n \!=\! \gamma ^{1/2} e^\nu \partial _\tau n \!=\! \gamma ^{1/2} \partial _\tau \left( e^\nu n\right) .\qquad \end{aligned}$$
(14.126)

However, this is more of an aside because, for vector quantities, this is not the appropriate time derivative. In order to understand the distinction, we need to reinstate the basis vectors (and forms). Using bold vectors (adding tildes for basis one-forms) we have the three-vector

$$\begin{aligned} \varvec{B} = B^b \mathbf {e}_b = e _{\hat{c}}^a \omega ^{\hat{c}}_b B^b \mathbf {e}_a = B^{\hat{c}} \mathbf {e}_{\hat{c}} . \end{aligned}$$
(14.127)

If we want to make a connection with (more or less) text-book vector calculus, we need to understand derivatives of vectors in the ZAMO frame. First of all, we note that the (spatial) metric \(\gamma _{ij}\) is diagonal (in fact, in 3D we can always find coordinates that lead to a diagonal metric) with scale factors \(h_a\) given by (we are not summing over repeated indices for the rest of this section!);

$$\begin{aligned} \mathbf {e}_{\hat{a}} = {1\over h_a } {\partial \over \partial x^a} = {1\over h_a } \mathbf {e}_a = e_{\hat{a}}^a \mathbf {e}_a \Longrightarrow e_{\hat{a}}^a = {1\over h_a } . \end{aligned}$$
(14.128)

Comparing (for later convenience) to the three-metric we see that

$$\begin{aligned} \gamma _{ac} = h_c^2 \delta _{ac} . \end{aligned}$$
(14.129)

Let us now define

$$\begin{aligned} \varvec{\nabla } =\sum _a \tilde{\mathbf {e}}^a D_a , \end{aligned}$$
(14.130)

such that the directional derivative is given by

$$\begin{aligned} D_a = \mathbf {e}_{a} \cdot \varvec{\nabla } , \end{aligned}$$
(14.131)

and we have

$$\begin{aligned} \varvec{\nabla } \omega= & {} \sum _a \tilde{\mathbf {e}}^a D_a \omega = \sum _a \tilde{\mathbf {e}}^a \partial _a \omega = \sum _{a,b,c} e _{\hat{c}}^a \omega ^{\hat{c}}_b \tilde{\mathbf {e}}^b \partial _a \omega \nonumber \\= & {} \sum _{a,c} \tilde{\mathbf {e}}^{\hat{c}} \left( e^a_{\hat{c}} \partial _a \omega \right) = \sum _a \tilde{\mathbf {e}}^{\hat{a}} \left( {1\over h_a } \partial _a \omega \right) . \end{aligned}$$
(14.132)

We see that we can express the components of the gradient in either frame, but in the orthonormal case we need to keep track of the scale factors. We obviously knew this already, but we can now make the connection explicit.

Turning to vectors, we have (the usual covariant derivative)

$$\begin{aligned} \varvec{\nabla } \varvec{A}= & {} \sum _{a,b}\tilde{\mathbf {e}}^a D_a ( A^b \mathbf {e}_b) = \sum _{a,b}\left[ (\partial _a A^b ) \tilde{\mathbf {e}}^a \mathbf {e}_b + A^b \tilde{\mathbf {e}}^a D_a \mathbf {e}_b \right] \nonumber \\= & {} \sum _{a,b,c} \left[ (\partial _a A^b ) \tilde{\mathbf {e}}^a \mathbf {e}_b + A^b \tilde{\mathbf {e}}^a \varGamma _{ba}^c \mathbf {e}_c \right] \equiv \sum _{a,b} (D_a A^b ) \tilde{\mathbf {e}}^a \mathbf {e}_b , \end{aligned}$$
(14.133)

where \( \varGamma _{ba}^c\) is the connection associated with \(\gamma _{ij}\). We also have

$$\begin{aligned} \varvec{\nabla } \varvec{A}= & {} \sum _{a,b}\tilde{\mathbf {e}}^a D_a ( A^{\hat{b}} \mathbf {e}_{\hat{b}}) = \sum _{a,b}\left[ \partial _a \left( { A^{\hat{b}} \over h_b} \right) \tilde{\mathbf {e}}^a \mathbf {e}_b + {A^{\hat{b}} \over h_b} \tilde{\mathbf {e}}^a D_a \mathbf {e}_b \right] \nonumber \\= & {} \sum _{a,b} D_a \left( {A^{\hat{b}} \over h_b}\right) \tilde{\mathbf {e}}^a \mathbf {e}_b= \sum _{a,b} {h_b \over h_a} D_a \left( {A^{\hat{b}} \over h_b}\right) \tilde{\mathbf {e}}^{\hat{a}} \mathbf {e}_{\hat{b}} , \end{aligned}$$
(14.134)

and it follows that

$$\begin{aligned} D_{\hat{a}} A^{\hat{c} }= {h_c \over h_a} D_a \left( {A^{\hat{c}} \over h_c} \right) = \sum _b {h_c \over h_a} \left[ \partial _a \left( {A^{\hat{c}} \over h_c} \right) + \varGamma _{ba}^c \left( {A^{\hat{b}} \over h_b} \right) \right] . \end{aligned}$$
(14.135)

Now, as \(\gamma _{ij}\) is diagonal in the particular case we are considering (and likely in any problem one may be interested in), we have

$$\begin{aligned} \varGamma ^c_{ab}= & {} \sum _d \gamma ^{cd} \left[ ( h_d \partial _b h_d) \delta _{ad}+ ( h_d \partial _a h_d) \delta _{bd} -(h_b \partial _d h_b) \delta _{ab}\right] \nonumber \\= & {} {1\over h_c^2} \left[ ( h_c \partial _b h_c) \delta ^c_a+ ( h_c \partial _a h_c) \delta ^c_b - \sum _d (h_b \partial _d h_b) \delta ^{cd} \delta _{ab}\right] . \end{aligned}$$
(14.136)

Using this in (14.135) we arrive at

$$\begin{aligned} D_{\hat{a}} A^{\hat{c} }= & {} {h_c \over h_a} {\partial \over \partial x^a} \left( { A^{\hat{c}} \over h_c} \right) \nonumber \\&\qquad + \sum _{b,d} { 1 \over h_a h_c } \left[ (h_c \partial _b h_c )\delta ^c_{a}+ (h_c \partial _a h_c) \delta ^c_{b} - (h_b \partial _d h_b)\delta ^{cd} \delta _{ab}\right] { A^{\hat{b}} \over h_b} \nonumber \\&\quad = {1\over h_a} \partial _{a} A^{\hat{c}} - \sum _d \delta ^{cd} {1 \over h_a h_c} (\partial _d h_a) A^{\hat{a}}+ \sum _{b} { 1 \over h_a h_b } ( \partial _b h_c) \delta _{a}^c A^{\hat{b}}.\nonumber \\ \end{aligned}$$
(14.137)

For the divergence we then need

$$\begin{aligned} \varvec{\nabla } \cdot \varvec{B}\equiv & {} \sum _a D_{ a} B^{ a } = \sum _a D_{\hat{a}} B^{\hat{a} } \nonumber \\= & {} \sum _a {1\over h_a} \partial _{a} B^{\hat{a}} - \sum _a {1 \over h_a^2} \partial _a h_a B^{\hat{a}}+ \sum _{a,b} { 1 \over h_a h_b } \partial _b h_a B^{\hat{b}} \nonumber \\= & {} \sum _a {1\over h_a} \partial _{a} B^{\hat{a}} - {1 \over h_1^2} \partial _1 h_1 B^{\hat{1}} - {1 \over h_2^2} \partial _2 h_2 B^{\hat{2}} - {1 \over h_3^2} \partial _3 h_3 B^{\hat{3}} \nonumber \\&+ \sum _{a} \left[ { 1 \over h_a h_1 } \partial _1 h_a B^{\hat{1}} + { 1 \over h_a h_2 } \partial _2 h_a B^{\hat{2}}+ { 1 \over h_a h_3 } \partial _3 h_a B^{\hat{3}} \right] \nonumber \\= & {} {1\over h_1 h_2 h_3} \left[ {\partial \over \partial x^1} (h_2 h_3 B^{\hat{1}}) + {\partial \over \partial x^2} (h_1 h_3 B^{\hat{2}}) + {\partial \over \partial x^3} (h_1 h_2 B^{\hat{3}}) \right] ,\nonumber \\ \end{aligned}$$
(14.138)

which is the textbook result.

Similarly, it is straightforward to use (14.135) to show that we have the standard result for the curl:

$$\begin{aligned} \varvec{\nabla } \times \varvec{B}= & {} \sum _{a,b,c} \mathbf {e}_{\hat{a}}( \epsilon ^{\hat{a} \hat{b} \hat{c}} \nabla _{\hat{b}} B_{\hat{c}}) \nonumber \\= & {} \sum _{a,b,c} \mathbf {e}_{\hat{a}} \omega ^{\hat{a}}_b ( \epsilon ^{bcd} \partial _c B_d ) = { 1 \over h_1 h_2 h_3} \left| \begin{array}{ccc} h_1 \mathbf {e}_{\hat{1}}&{} h_2 \mathbf {e}_{\hat{2}} &{} h_3 \mathbf {e}_{\hat{3}} \\ \partial _r &{} \partial _\theta &{} \partial _\phi \\ h_1 B_{\hat{1}} &{} h_2 B_{\hat{2}} &{} h_3 B_{\hat{3}} \end{array} \right| . \end{aligned}$$
(14.139)

Finally, we need time derivatives

$$\begin{aligned} \sum _a \mathbf {e}_a \partial _t B^a = \partial _t \varvec{B} , \end{aligned}$$
(14.140)

and

$$\begin{aligned} \sum _a \mathbf {e}_a ( \mathcal {L}_\beta B^a )= & {} \sum _{a,b} \mathbf {e}_a \left( \beta ^b \partial _b B^a - B^b \partial _b \beta ^a \right) \nonumber \\= & {} \mathbf {e}_a \left( \beta ^b D_b B^a - B^b D _b \beta ^a \right) = (\varvec{\beta } \cdot \varvec{\nabla }) \varvec{B} - (\varvec{B} \cdot \varvec{\nabla }) \varvec{\beta },\nonumber \\ \end{aligned}$$
(14.141)

where

$$\begin{aligned} \varvec{\beta } = - \omega \sum _a \delta ^a_\phi \mathbf {e}_a = - \omega \sum _a \delta ^{\hat{a}}_\phi \mathbf {e}_{\hat{a}} = - \omega \varvec{n}_\phi . \end{aligned}$$
(14.142)

Thus, we see that

$$\begin{aligned} \sum _a \mathbf {e}_a ( \partial _t B^a - \mathcal {L} _\beta B^a ) = \partial _t \varvec{B} - (\varvec{\beta } \cdot \varvec{\nabla }) \varvec{B} + (\varvec{B} \cdot \varvec{\nabla }) \varvec{\beta } \end{aligned}$$
(14.143)

This is all we need if we want to write the various coordinate-basis Maxwell equations in terms of three-vectors. As a start, consider (14.60). It is easy to see that the scale factors associated with the spherical coordinates are \(h_1 = e^\lambda \), \(h_2 =r\) and \(h_3=r \sin \theta \), and it follows immediately that

$$\begin{aligned} \varvec{\nabla } \cdot \varvec{B} = 0 . \end{aligned}$$
(14.144)
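Written out, this is just (14.138) with the scale factors above (a direct substitution; recall that \(\lambda \) depends only on r):

$$\begin{aligned} \varvec{\nabla } \cdot \varvec{B} = {1 \over e^\lambda r^2 \sin \theta } \left[ \partial _r \left( r^2 \sin \theta \, B^{\hat{r}} \right) + \partial _\theta \left( e^\lambda r \sin \theta \, B^{\hat{\theta }} \right) + \partial _\phi \left( e^\lambda r \, B^{\hat{\phi }} \right) \right] = 0 , \end{aligned}$$

which reduces to the flat-space expression up to the factor \(e^{-\lambda }\) multiplying the radial derivative (the \(e^\lambda \) factors cancel in the angular terms).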

Continuing in the spirit of making the equation look as close to the flat-space case as possible, we introduce the charge density as \(\hat{\sigma } = J^{\hat{t}}\). Then (14.55) is

$$\begin{aligned} \varvec{\nabla } \cdot \varvec{E} = 4 \pi \hat{\sigma } . \end{aligned}$$
(14.145)

The time-dependent equations are a little bit messier, partly because the redshift factor \(e^\nu \) needs to be accounted for (see Thorne and MacDonald 1982 for discussion). Thus, we can write (14.62) as

$$\begin{aligned} \partial _t \varvec{B} - (\varvec{\beta } \cdot \varvec{\nabla }) \varvec{B} + (\varvec{B} \cdot \varvec{\nabla }) \varvec{\beta } + \varvec{\nabla } \times (e^\nu \varvec{E}) = 0 . \end{aligned}$$
(14.146)

Similarly, once we define

$$\begin{aligned} \varvec{J} = \sum _a J^{\hat{a}} \mathbf {e} _{\hat{a}} , \end{aligned}$$
(14.147)

Equation (14.57) becomes

$$\begin{aligned} \partial _t \varvec{E} - (\varvec{\beta } \cdot \varvec{\nabla }) \varvec{E} + (\varvec{E} \cdot \varvec{\nabla }) \varvec{\beta } - \varvec{\nabla } \times \left( e^\nu \varvec{B} \right) = - 4\pi e^\nu \varvec{J} . \end{aligned}$$
(14.148)

The different relations agree (as they have to) with Eqs. (20)–(23) from Khanna and Camenzind (1996).

14.7 A brief status report of magnetic field models

Problems in astrophysics and cosmology involving magnetic fields are of obvious interest due to the (essentially) direct link to observation. Most objects of interest for astronomy tend to be endowed with magnetic fields and the large scale fields may have an impact on cosmology, as well. Quite naturally, this means that the literature on the subject is vast and varied. We will not be able to give the different issues the attention they deserve, but it nevertheless makes sense to list some of the main issues that (may) require a fully relativistic description of non-ideal magnetohydrodynamics. Of most obvious relevance are problems involving not only electromagnetism but the live spacetime of General Relativity. Key gravitational-wave sources immediately come to mind, like core-collapse supernovae (Takiwaki and Kotake 2011) and compact binary mergers (Chawla et al. 2010; Rezzolla et al. 2011; Ruiz et al. 2016, 2019, 2020). Both cases involve strong gravity, a significant thermal component and magnetic fields. Going beyond ideal magnetohydrodynamics in these simulations is, however, challenging both from a technical point of view and in view of the computational cost. This obviously does not mean that we should not set our aim high—indeed, there have been several efforts in this direction (Watanabe and Yokoyama 2006; Palenzuela et al. 2009; Takamoto and Inoue 2011; Dionysopoulou et al. 2013)—but it is probably fair to say that this is work in progress. The step to a full plasma description and actual multi-fluid simulations (Zenitani et al. 2009) is also unlikely to be taken any time soon.

The seemingly more innocuous problem of isolated compact stars also comes with unresolved issues. These range from the dynamics of the star’s magnetosphere and the pulsar emission mechanism to the formation and evolution of the star’s interior magnetic field. In the case of the magnetosphere, the main focus has been on force-free models (see Pétri 2019 for a connection to the recent literature), but recent arguments (Li et al. 2012) point to the need to account for resistivity. In the case of the formation and evolution of a compact star’s global magnetic field, we need a better understanding of dynamo effects that may come into operation (see Thompson and Duncan 1993 and also Brandenburg and Subramanian 2005 for a recent review) and we also need to understand the coupled evolution of the star’s spin, temperature and magnetic field (Viganò et al. 2013). There are difficult issues to resolve, especially since it is becoming clear that the typical stationary and axisymmetric magnetic field models one would intuitively use as a starting point for the discussion tend to be unstable (Lander and Jones 2012).

In fact, it is clear that we need to develop the theory further. Typical issues that need to be addressed involve (i) the dynamics of the model, e.g., causality and stability of wave propagation and relation to issues like pulsar emission or the launch of outflows and jets, (ii) transitions between spatial regions where different simplifying assumptions are valid, such as a region in the magnetosphere where the fluid model applies and a low density region where the description breaks down and one would have to fall back on a kinetic theory description (Marklund et al. 2003; Meier 2004; Gedalin 1996), the transition from the magnetosphere to the interior field at the star’s surface or, indeed, accreting systems where an ion-electron plasma describes the inflowing matter while regions in the magnetosphere may still be appropriately modelled as a pair-plasma, (iii) the role of more complex physics, like the superconductor that is expected to be present in the star’s core (Glampedakis et al. 2011) or regions where the assumption that the medium is electromagnetically “passive” does not apply, possibly in the pasta region near the crust-core transition (Pons et al. 2013).

Another problem of key astrophysical interest concerns the launch of large-scale jet emission—either associated with core collapse or neutron star mergers—required to explain observed gamma-ray bursts (Rezzolla et al. 2011). The difficulties here remain technical and conceptual, with one of the main issues being the need to resolve the dynamics of the central engine (e.g., associated with the magnetorotational instability; Balbus and Hawley 1991; Hawley and Balbus 1991; Kiuchi et al. 2018) while at the same time representing the large scale behaviour of the jet emission. One of the key challenges involves marrying the nonlinear dynamics of the strong-gravity central region with the evolution in the distant weak field region (where one may get away with treating spacetime as a fixed background, the typical assumption for jet simulations; Uzdensky and MacFadyen 2007; Krolik and Hawley 2010; Xie et al. 2018).

15 The problem with heat

The fact that relativistic fluid dynamics is a mature field of study does not mean that there are no unresolved issues. In fact, there are quite a few. Some continue to be in focus and others are swept under the rug (perhaps to be rediscovered, cause confusion and then duly ignored again...) One of the main issues that continue to cause concern arises as soon as we consider dissipative systems. It is clear from the outset that we are facing a difficult problem. For example, the familiar Fourier theory for heat conduction—which requires the introduction of thermal conductivity associated with the mobility of entropy carriers—leads to instantaneous propagation of thermal signals (the heat equation is parabolic). The fact that this non-causality is built into the description is unattractive already in the context of the classic Navier–Stokes equations. Intuitively, one would expect heat to propagate at roughly the mean molecular speed in the system. For a relativistic description non-causal behavior would be totally unacceptable. Any acceptable formulation of the problem must circumvent this. In principle, we know what we have to do. There is a deep connection between causality, stability, and hyperbolicity of a dissipative model (Hiscock and Lindblom 1983), so we need to make sure that we develop a fully hyperbolic formalism. The issue has been a main motivating factor behind the development of extended irreversible thermodynamics (Jou et al. 1993; Müller and Ruggeri 1993), a model which introduces additional dynamical fields in order to retain hyperbolicity and causality.
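To make the issue concrete, it may help to recall the (schematic, Newtonian) version of the problem. With \(C_v\) the heat capacity per unit volume, energy conservation \(C_v \partial _t T + \nabla _i q^i = 0\) combined with Fourier's law gives a parabolic equation, whereas adding a relaxation term (anticipating the Cattaneo-type law we arrive at in Sect. 15.3) gives a hyperbolic (telegraph) equation with a finite signal speed:

$$\begin{aligned} q_i = - \kappa \nabla _i T \ \Longrightarrow \ C_v \partial _t T = \kappa \nabla ^2 T , \qquad \tau \partial _t q_i + q_i = - \kappa \nabla _i T \ \Longrightarrow \ \tau C_v \partial _t^2 T + C_v \partial _t T = \kappa \nabla ^2 T , \end{aligned}$$

with thermal signals now propagating (and being damped) at the characteristic speed \((\kappa / C_v \tau )^{1/2}\).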

From a formal point of view the debate has (at least to some extent) been settled since the late 1970s. The key contribution was the work of Israel and Stewart, who developed a model analogous to Grad’s 14-moment theory, taking as its starting point relativistic kinetic theory (Stewart 1977; Israel and Stewart 1979a, b). This so-called “second order” theory, which extends the pioneering “first order” work of Eckart (1940) and Landau and Lifshitz (1959), has been used in a number of different settings, including the highly relativistic plasmas generated in colliders like RHIC at Brookhaven and the LHC at CERN (Elze et al. 2001; Muronga 2004). However, despite the obvious successes of the second-order model, there are still dissenting views in the literature, see for example García-Colín and Sandoval-Villalbazo (2006), García-Perciante et al. (2009b). Particular objections concern the complexity of the formulation and the many additional “dissipation coefficients” required to complete it. This is, however, a feature that is shared by all models within the extended thermodynamics framework (Jou et al. 1993).

The simplest relevant problem involves heat flow, a problem with several interesting aspects which also connects with fundamental physics questions, in particular in the context of nonlinear phenomena, see for example Morro and Ruggeri (1987), Ruggeri et al. (1996), Jou et al. (2004), Lebon et al. (2008) and Llebot et al. (1983). Non-linearities are relevant for the development of both shocks and turbulence in real physical systems. However, at this point we aim to establish the viability of the multi-fluids approach to the heat problem. For this purpose, a linear analysis should be adequate. If we dig deeper we uncover a range of issues, including foundational problems like the nature of time (read: the role of the second law of thermodynamics) and the formation of structures at nonlinear deviations from thermal equilibrium. Much recent work has been motivated by the modelling of complex systems in astrophysics and cosmology (Maartens 1996). The problem may date back to the origins of relativity theory (Landsberg 1967)—is a moving body hot or cold?—but it remains an active challenge.

15.1 The “standard” approach

In order to illustrate the main principles, let us return to a situation we have considered several times already. Adding a thermal component to a single matter component, we envisage two distinct flows. The matter is represented by a flux \(n^a\) which satisfies

$$\begin{aligned} \nabla _a n^a = 0 , \qquad \mathrm {where} \qquad n^a = n u^a . \end{aligned}$$
(15.1)

In the following (in order to be specific) we will work in the frame associated with the matter flow, \(u^a\). Next we add the heat flux \(q^a\) (which is spatial in the sense that \(u^a q_a=0\)) to the perfect fluid stress-energy tensor:

$$\begin{aligned} T^{a b} = \varepsilon u^a u^b + p \perp ^{ab}+ 2 q^{(a} u^{b)} . \end{aligned}$$
(15.2)

Finally, we need to incorporate the second law of thermodynamics. The requirement that the total entropy must not decrease leads to the entropy flux \(s^a\) having to be such that

$$\begin{aligned} \nabla _a s^a = \varGamma _\mathrm {s}\ge 0 . \end{aligned}$$
(15.3)

Assuming that the entropy flux is a combination of the available fluxes, we have (Eckart 1940) (we will connect this relation with the variational derivation later)

$$\begin{aligned} s^a = s u^a + \beta q^a , \end{aligned}$$
(15.4)

where \(\beta \) is yet to be specified. It is easy to work out the divergence of this, and we find (after introducing \(x_\mathrm {s}= s/n\), as before, and using (15.1))

$$\begin{aligned} n u^a \nabla _a x_\mathrm {s}+ \beta \nabla _a q^a + q^a \nabla _a \beta = \varGamma _\mathrm {s}\end{aligned}$$
(15.5)

Next, we combine this result with

$$\begin{aligned} u_a \nabla _b T^{a b} = 0 , \end{aligned}$$
(15.6)

and the thermodynamical relation for an equation of state \(\varepsilon =\varepsilon (n,s)\)

$$\begin{aligned} \nabla _a \varepsilon = \mu \nabla _a n + T \nabla _a s = {p+\varepsilon -sT \over n} \nabla _a n + T \nabla _a s , \end{aligned}$$
(15.7)

to show that

$$\begin{aligned} T\varGamma _\mathrm {s}= \left( \beta T -1 \right) \nabla _a q^a + q^a \left( T \nabla _a \beta - u^b \nabla _b u_a \right) . \end{aligned}$$
(15.8)

We want to ensure that the right-hand side of this equation is non-negative. An easy way to achieve this is to make the identification

$$\begin{aligned} \beta = 1/T, \end{aligned}$$
(15.9)

and at the same time insist that the heat flux is such that

$$\begin{aligned} q^a = - \kappa T \perp ^{a b} \left( \frac{1}{T} \nabla _b T + u^c \nabla _c u_b \right) , \end{aligned}$$
(15.10)

with \(\kappa \ge 0\) being the heat conductivity coefficient. This means that

$$\begin{aligned} \varGamma _\mathrm {s}= \frac{q^a q_a}{\kappa T^2} \ge 0 , \end{aligned}$$
(15.11)

by construction, and the second law of thermodynamics is satisfied.
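To spell out the intermediate step: with \(\beta = 1/T\) the first term in (15.8) vanishes, and using (15.10) together with \(u_a q^a = 0\) we have

$$\begin{aligned} T \varGamma _\mathrm {s}= q^a \left( T \nabla _a {1 \over T} - \dot{u}_a \right) = - q^a \left( {1 \over T} \nabla _a T + \dot{u}_a \right) = {q^a q_a \over \kappa T} , \end{aligned}$$

which is (15.11).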

The energy equation now takes the form

$$\begin{aligned} n T {d x_\mathrm {s}\over d\tau } + \nabla _a q^a + q^a \dot{u}_a = 0 \end{aligned}$$
(15.12)

where \(\dot{u}_a = u^b \nabla _b u_a\) is the four-acceleration, as before. We also have the momentum equation

$$\begin{aligned}&\perp ^c_b \nabla _a T^{ab} = 0 \nonumber \\&\quad \Longrightarrow \quad (p+\varepsilon ) \dot{u}^a + \perp ^{ab}\left( \nabla _b p + \dot{q}_b \right) + q^b \nabla _b u^a + q^a \nabla _b u^b = 0 . \end{aligned}$$
(15.13)

This model seems quite generic. Unfortunately, it has some major problems. While it is built to pass the key test set by the second law of thermodynamics, it fails at the next hurdle. A detailed analysis of perturbations away from an equilibrium state (Hiscock and Lindblom 1985) shows that small perturbations tend to be dominated by rapidly growing instabilities (we will demonstrate this later), suggesting that the formulation may be practically useless. From the mathematical point of view it is also not acceptable since, being non-hyperbolic, it does not admit a well-posed initial-value problem. We will discuss how we can fix these problems shortly. First we will take a slight detour towards an application.

15.2 Case study: neutron star cooling

One situation where the model we have derived finds practical use is in the description of the thermal evolution of a maturing neutron star. This is (obviously) an interesting problem in itself, and from the present perspective it is worth clarifying the assumptions that lead to the equations commonly used in cooling simulations. The typical starting point tends to be the assumption that the configuration can be taken to be static, essentially meaning that we ignore the impact of the thermal pressure on the matter and the spacetime. Taking the spacetime to be spherically symmetric and static, we have the usual line element

$$\begin{aligned} ds^2 = - e^{2\nu } dt^2 + e^{2\lambda } dr^2 + r^2 d\theta ^2 + r^2 \sin ^2 \theta d\varphi ^2 , \end{aligned}$$
(15.14)

where \(\nu \) and \(\lambda \) are functions of r, while the matter four-velocity is taken to be

$$\begin{aligned} u^a = \left[ e^{-\nu }, 0, 0, 0 \right] . \end{aligned}$$
(15.15)

It is important to understand that this does not mean that \(\dot{u}^a = 0\). We still get a contribution from the spacetime curvature. Ignoring the heat flux terms in (15.13) we have (with primes denoting radial derivatives)

$$\begin{aligned} (p+\varepsilon ) \dot{u}^a + \perp ^{ab} \nabla _b p = 0 \quad \Longrightarrow \quad p' = - (p+\varepsilon ) \nu ' \end{aligned}$$
(15.16)

It is worth taking a closer look at this (well-known) equation. Consider the case of a single fluid, for which we have (see Sect. 5.2, noting that we assume the impact of the thermal pressure on the matter configuration can be neglected)

$$\begin{aligned} p+\varepsilon = n\mu , \qquad \text{ and } \qquad \nabla _a p = n \nabla _a \mu \end{aligned}$$
(15.17)

and it follows that (15.16) simply represents the fact that energies are affected by the gravitational redshift:

$$\begin{aligned} {d\over dr} \left( \mu e^{\nu } \right) = 0 . \end{aligned}$$
(15.18)

In the situations where \(q^a\ne 0\), we are obviously ignoring the impact of the heat flux on the overall energy and the spacetime curvature. This is likely to be a good approximation in most situations of interest.

Moving on to the equations that govern the thermal component, we first of all find that the radial component of (15.10) becomes

$$\begin{aligned} q^r = - \kappa e^{-2\lambda } \left( T' + {T} \nu ' \right) = - \kappa e^{-2\lambda - \nu } \partial _r \left( T e^{\nu }\right) = - \kappa e^{-2\lambda - \nu } \partial _r \left( T^\infty \right) \end{aligned}$$
(15.19)

where we have defined the temperature measured by an observer at infinity, \(T^\infty \). Finally, we need (15.12). As we want to work with the temperature rather than the entropy, we use

$$\begin{aligned} d\varepsilon = \mu dn + T ds = \left( {\partial \varepsilon \over \partial n} \right) _T dn + \left( {\partial \varepsilon \over \partial T} \right) _n dT . \end{aligned}$$
(15.20)

We also note that, for a static configuration, \(\nabla _a u^a = 0 \), so (15.1) means that

$$\begin{aligned} {dn \over d\tau } = 0 , \end{aligned}$$
(15.21)

and we have

$$\begin{aligned} {ds \over d\tau } = {1\over T}\left( {\partial \varepsilon \over \partial T} \right) _n {dT \over d\tau } . \end{aligned}$$
(15.22)

That is, we can write (15.12) as

$$\begin{aligned} \left( {\partial \varepsilon \over \partial T} \right) _n {dT \over d\tau } + \nabla _a q^a + q^b \dot{u}_b = 0 , \end{aligned}$$
(15.23)

which (if we assume that the heat flux is radial) becomes

$$\begin{aligned} \left( {\partial \varepsilon \over \partial T} \right) _n e^{-\nu } \partial _t T + {1\over r^2} e^{-(\lambda +\nu )} \partial _r \left[ r^2 e^{(\lambda +\nu )} q^r \right] + {\nu ' } q^r = 0 . \end{aligned}$$
(15.24)

In principle we now have the equations we need. We only need to massage them into a more intuitive form. The first step involves introducing the flux through a spherical surface with radius r:

$$\begin{aligned} {L\over 4\pi r^2} = e^{\lambda } q^r = q^{\hat{r}} , \end{aligned}$$
(15.25)

(based on using a tetrad description, see Sect. 14.6). This means that (15.19) becomes

$$\begin{aligned} {L \over 4 \pi \kappa r^2} = - e^{-(\lambda + \nu )} \partial _r \left( T e^{\nu }\right) , \end{aligned}$$
(15.26)

while (15.24) can be written

$$\begin{aligned} C_v e^{-\nu } \partial _t T + {1 \over 4\pi r^2} e^{-\lambda -2\nu } \partial _r \left( e^{2\nu } L \right) = 0 , \end{aligned}$$
(15.27)

where we have identified the heat capacity at fixed volume

$$\begin{aligned} C_v = \left( {\partial \varepsilon \over \partial T} \right) _n. \end{aligned}$$
(15.28)

Once we introduce the energy loss due to (say) the emission of neutrinos, we arrive at the equations discussed in the classic review by Yakovlev and Pethick (2004), which in turn originate from the classic work of Thorne (1977).
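As a simple illustration of how (15.26) and (15.27) are used in practice, the following sketch evolves the temperature on a fixed, static background with an explicit finite-difference update. It is not a production cooling code: all profiles and numbers (the grid, \(\nu \), \(\lambda \), \(\kappa \), \(C_v\), the time step and the boundary treatment) are hypothetical placeholders chosen only so that the snippet runs, and the neutrino losses mentioned above are omitted.

```python
import numpy as np

# Toy version of the cooling equations (15.26)-(15.27) on a static,
# spherically symmetric background. Every profile below is a hypothetical
# placeholder; this is a numerical sketch, not a neutron-star model.

N = 200
r = np.linspace(1.0e4, 1.0e6, N)          # radial grid (hypothetical)
nu = -0.2 * (r / r[-1])**2                # metric function nu(r) (hypothetical)
lam = 0.2 * (r / r[-1])**2                # metric function lambda(r) (hypothetical)
kappa = 1.0e20 * np.ones(N)               # thermal conductivity (hypothetical)
C_v = 1.0e19 * np.ones(N)                 # heat capacity at fixed volume (hypothetical)

# Initial state: a hot spot on a uniform background (hypothetical)
T = 1.0e9 * (1.0 + 0.1 * np.exp(-((r - 5.0e5) / 5.0e4)**2))

def luminosity(T):
    """Flux through a sphere of radius r, Eq. (15.26)."""
    T_inf = T * np.exp(nu)                # redshifted temperature T^infinity
    return -4.0 * np.pi * kappa * r**2 * np.exp(-(lam + nu)) * np.gradient(T_inf, r)

def dT_dt(T):
    """Temperature evolution from Eq. (15.27), solved for the time derivative."""
    L = luminosity(T)
    return -np.exp(-nu - lam) / (4.0 * np.pi * r**2 * C_v) * np.gradient(np.exp(2.0 * nu) * L, r)

dt = 1.0e5                                # explicit time step, diffusion-limited
for step in range(1000):
    T = T + dt * dT_dt(T)
    T[-1] = 1.0e8                         # crude fixed surface temperature (hypothetical)

print("surface temperature after evolution:", T[-1])
```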

15.3 The multi-fluid view

Let us now consider thermal dynamics from a multi-fluid perspective, with the view of comparing to the standard derivation. In order to do this we assume that the entropy component can be treated as a “fluid” (analogous to the thermal excitations of a superfluid system, see Sect. 13). In essence, this implies that the mean free path of the phonons is taken to be small compared to the model scale. We then consider two fluxes, one corresponding to the matter flow and one associated with the entropy. The latter is treated as massless (zero rest-mass). The dynamics then follows from the usual two-fluid Lagrangian, which also depends on the relative flow of the two fluxes. As we will see, the entropy entrainment turns out to be a crucial feature of the model (Andersson and Comer 2010; Lopez-Monsalvo and Andersson 2011).

As in the case of a general two-fluid system, the starting point is the definition of a relativistic invariant Lagrangian \(\varLambda \). Assuming that the system is isotropic, we take \(\varLambda \) to be a function of the different scalars that can be formed by the two fluxes. From the matter current \(n^a\) and the entropy flux \(s^a\) we can form three scalars (tweaking the multifluid notation to stay close to the previous derivation);

$$\begin{aligned} n^2 = -n_a n^a , \quad s^2 = -s_a s^a , \quad j^2 = -n_a s^a . \end{aligned}$$
(15.29)

An unconstrained variation of \(\varLambda \) then leads to

$$\begin{aligned} \delta \varLambda = \frac{\partial \varLambda }{\partial n}\delta n + \frac{\partial \varLambda }{\partial s} \delta s + \frac{\partial \varLambda }{\partial j} \delta j . \end{aligned}$$
(15.30)

Replacing the passive density variations with dynamical variations of the worldlines (as in Sect. 6) we find that

$$\begin{aligned} \delta \varLambda= & {} \left[ -2\frac{\partial \varLambda }{\partial n^2}n_a -\frac{\partial \varLambda }{\partial j^2}s_a \right] \delta n^a+ \left[ -2\frac{\partial \varLambda }{\partial s^2}s_a -\frac{\partial \varLambda }{\partial j^2}n_a \right] \delta s^a \nonumber \\&+ \left[ -\frac{\partial \varLambda }{\partial n^2}n^an^b - \frac{\partial \varLambda }{\partial s^2}s^a s^b - \frac{\partial \varLambda }{\partial j^2}n^a s^b\right] \delta g_{ab} . \end{aligned}$$
(15.31)

From this we can read off the conjugate momentum associated with each of the fluxes;

$$\begin{aligned} \mu _a=\frac{\partial \varLambda }{\partial n^a} = g_{ab}(\mathcal {B}^\mathrm {n}n^b + \mathcal {A}_{\mathrm{ns}}s^b) , \quad \theta _a=\frac{\partial \varLambda }{\partial s^a} = g_{ab}(\mathcal {B}^\mathrm {s}s^b + \mathcal {A}_{\mathrm{ns}}n^b) , \end{aligned}$$
(15.32)

where

$$\begin{aligned} \mathcal {B}^\mathrm {n}\equiv -2 \frac{\partial \varLambda }{\partial n^2}, \quad \mathcal {B}^\mathrm {s}\equiv -2 \frac{\partial \varLambda }{\partial s^2}, \quad \mathcal {A}^{\mathrm {n}\mathrm {s}}\equiv -\frac{\partial \varLambda }{\partial j^2} . \end{aligned}$$
(15.33)

As usual, the stress-energy tensor is obtained by noting that the displacements of the conserved currents induce a variation in the spacetime metric. In this case, we arrive at

$$\begin{aligned} T_a^{\ b} = \mu _a n^b + \theta _a s^b + \varPsi \delta _a^{\ b} , \end{aligned}$$
(15.34)

where we have defined the generalized pressure, \(\varPsi \), as

$$\begin{aligned} \varPsi = \varLambda -\mu _a n^a - \theta _a s^a . \end{aligned}$$
(15.35)
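As a quick consistency check: in equilibrium (no relative flow) we have \(\varLambda = -\varepsilon \), \(\mu _a n^a = -\mu n\) and \(\theta _a s^a = - T s\), so that

$$\begin{aligned} \varPsi = - \varepsilon + \mu n + T s = p , \end{aligned}$$

and the generalized pressure reduces to the usual thermodynamic pressure [cf. the Euler relation (15.65) below].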

These results are completely analogous to the two-fluid model from Sect. 9.

As the divergence of the stress-energy tensor (15.34) vanishes, we can express the equations of motion as a force balance

$$\begin{aligned} \nabla _b T_{a}^{\ b} = f^\mathrm {n}_a + f^\mathrm {s}_a = 0 , \end{aligned}$$
(15.36)

where the individual force densities are

$$\begin{aligned} f^\mathrm {n}_a&=2 n^b\nabla _{[b}\mu _{a]} + \mu _a \nabla _b n^b , \end{aligned}$$
(15.37)
$$\begin{aligned} f^\mathrm {s}_a&=2 s^b \nabla _{[b}\theta _{a]}+ \theta _a \nabla _b s^b . \end{aligned}$$
(15.38)

Note that, in order to obtain the stress-energy tensor (15.34), as in Sect. 4, we needed to impose the conservation of the fluxes as constraints on the variation. However, the equations of motion, (15.37) and (15.38), still allow for non-vanishing production terms. If we, for simplicity, consider a single particle species, the matter current is conserved (there can be no particle reactions) and we have \(\nabla _a n^a = 0\). This removes the second term from the right-hand side of (15.37). In contrast, the entropy flux is generally not conserved, but in accordance with the second law we must have

$$\begin{aligned} \nabla _a s^a = \varGamma _\mathrm {s}\ge 0 . \end{aligned}$$
(15.39)

So far, the model is fairly general. To progress, we need to connect with thermodynamics. In doing this it makes sense to consider a specific choice of frame. In the context of a single (conserved) species of matter, we see that the force \(f^\mathrm {n}_a\) is orthogonal to the matter flux, \(n^a\), and therefore it has only three degrees of freedom. Furthermore, because of the force balance (15.36), we also have \(n^a f^\mathrm {s}_a=0\). This suggests that it is natural to focus on observers associated with the matter frame. We therefore introduce the four-velocity \(u^a\) such that \(n^a = n u^a\), where \(u_a u^a = -1\) and n is the number density measured in this frame. This is, of course, the same frame as in Sect. 6.

Having chosen to work in the matter frame, we can decompose the entropy current and its conjugate momentum into parallel and orthogonal components. The entropy flux is then expressed as

$$\begin{aligned} s^a = s^* (u^a + w^a) , \end{aligned}$$
(15.40)

where \(w^a\) is the relative velocity between the two fluid frames, and \(u^a w_a=0\). Letting \(s^a = s u^a_\mathrm {s}\) where \(u_\mathrm {s}^a\) is the four-velocity associated with the entropy flux, we see that \(s^*=s\gamma \) where \(\gamma \) is the Lorentz factor associated with the relative motion of the two frames.
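Explicitly, writing \(u_\mathrm {s}^a = \gamma \left( u^a + w^a \right) \) with \(\gamma = (1 - w_a w^a)^{-1/2}\) (so that \(u_\mathrm {s}^a\) is properly normalized), we have

$$\begin{aligned} s^a = s u_\mathrm {s}^a = s \gamma \left( u^a + w^a \right) \quad \Longrightarrow \quad s^* = s \gamma . \end{aligned}$$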

Similarly, we can write the thermal momentum as

$$\begin{aligned} \theta _a = \left( \mathcal {B}^\mathrm {s}s^* + \mathcal {A}_{\mathrm{ns}}n\right) u_a + \mathcal {B}^\mathrm {s}s^* w_a . \end{aligned}$$
(15.41)

This leads to a measure of the temperature measured in the matter frame:

$$\begin{aligned} -u^a \theta _a = \theta ^* = \mathcal {B}^\mathrm {s}s^* + \mathcal {A}_{\mathrm{ns}}n . \end{aligned}$$
(15.42)

In essence, this quantity represents the effective mass of the entropy component. Returning to the stress-energy tensor, and making use of the projection orthogonal to the matter flux, we find that the heat flux (energy flow relative to the matter) is given by

$$\begin{aligned} q_a = -\perp _{ab}u_c T^{bc} = s^* \theta ^* w_a . \end{aligned}$$
(15.43)

Defining the new variables \(\sigma ^a = s^* w^a\) and \(p_a = \mathcal {B}^\mathrm {s}s^* w_a\), the energy density measured in the matter frame can be obtained by a Legendre transform on the Lagrangian. We have

$$\begin{aligned} \varepsilon ^* = u_a u_b T^{ab} = - \varLambda + p_a \sigma ^a . \end{aligned}$$
(15.44)

The relevance of the new variables becomes apparent if we consider the fact that the dynamical temperature in (15.42) agrees with the thermodynamical temperature that an observer moving with the matter would measure. In other words, we have

$$\begin{aligned} \theta ^* = \left. {\partial \varepsilon ^* \over \partial s^*}\right| _{n,p} , \end{aligned}$$
(15.45)

where \(\varepsilon ^* = \varepsilon ^*(n,s^*,p)\). This is the standard definition of temperature as the energy per degree of freedom of the system. Formally, the temperature is obtained from the variation of the energy with respect to the entropy in the observer’s frame (keeping the other thermodynamic variables fixed).

This result is not trivial. The requirement that the two temperature measures agree determines the additional state parameter, p, to be held constant in the variation of \(\varepsilon ^*\). The importance of the chosen state variables is emphasized further if we note that, when the system is out of equilibrium, the energy depends on the heat flux (encoded in \(\sigma ^a\) and \(p_a\)). This leads to an extended Gibbs relation (similar to that postulated in many approaches to extended thermodynamics; Jou et al. 1993);

$$\begin{aligned} d \varepsilon ^* = \mu d n + \theta ^* d s^* + \sigma d p . \end{aligned}$$
(15.46)

This result arises naturally from the variational analysis. It is derived rather than assumed.

Traditionally, thermodynamic properties like pressure and temperature are uniquely defined only in equilibrium. Intuitively this makes sense since—in order to carry out a measurement—the measuring device must have time to reach “equilibrium” with the system. A measurement is only meaningful as long as the timescale required to obtain a result is shorter than the evolution time for the system. However, this does not prevent a generalization of the various thermodynamic concepts (as described above). The procedure may not be “unique”, but one must at least require the generalized concepts to be internally consistent.

The variational model encodes the finite propagation speed for heat, as required by causality. To demonstrate this, we may use the orthogonality of the entropy force density \(f_\mathrm {s}^a\) with the matter flux, solve for the entropy production rate \(\varGamma _\mathrm {s}\) and then impose the second law of thermodynamics. It is natural to express the result in terms of the heat flux \(q^a\), now given by

$$\begin{aligned} s^a= s^*u^a + \frac{1}{\theta ^*}q^a . \end{aligned}$$
(15.47)

Meanwhile, the conjugate momentum takes the form

$$\begin{aligned} \theta _a = \theta ^* u_a + \beta q_a , \end{aligned}$$
(15.48)

where

$$\begin{aligned} \beta = \frac{1}{s^*} - \frac{\mathcal {A}_{\mathrm{ns}}n }{s^* \theta ^*} . \end{aligned}$$
(15.49)

With these definitions, we impose the second law of thermodynamics by demanding that the entropy production is a quadratic in the sources, i.e.,

$$\begin{aligned} \varGamma _\mathrm {s}= {q^2 \over \kappa \theta _*^2} \ge 0 , \end{aligned}$$
(15.50)

where \(\kappa >0\) is the thermal conductivity. This means that the heat flux is governed by

$$\begin{aligned} \tau \left( \dot{q}^a + q_c \nabla ^a u^c \right) + q^a = -\tilde{\kappa } \perp ^{ab}\left( \nabla _b \theta ^* + \theta ^* \dot{u}_b\right) , \end{aligned}$$
(15.51)

where \(\dot{q}^a = u^b \nabla _b q^a\) and \(\dot{u}^a\) is the four-acceleration (as before) and we have also introduced

$$\begin{aligned} \tilde{\kappa } \equiv \frac{\kappa }{1 + \kappa \dot{\beta }} , \end{aligned}$$
(15.52)

while the thermal relaxation time is given by

$$\begin{aligned} \tau = \frac{\kappa \beta }{1 + \kappa \dot{\beta }} . \end{aligned}$$
(15.53)

The final result (15.51) is the relativistic version of the so-called Cattaneo equation (Cattaneo 1948; Andersson and Comer 2010; Lopez-Monsalvo and Andersson 2011). It resolves the issue of the instantaneous propagation of heat, see Jou and Casas-Vazquez (1988) for a brief discussion. We also learn that the entropy entrainment, encoded in \(\mathcal {A}_{\mathrm{ns}}\), plays a key role in determining the thermal relaxation time \(\tau \). This agrees with the implications of extended thermodynamics, as well as related results in the context of Newtonian gravity (Andersson and Comer 2010). Finally, as described by Jou et al. (1993), the Cattaneo equation inspired the development of the more general extended irreversible thermodynamics framework.
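Note also that (15.51) contains the earlier description as a limit: ignoring the relaxation (\(\tau \rightarrow 0\), \(\tilde{\kappa } \rightarrow \kappa \)) we are back to a Fourier-type law,

$$\begin{aligned} q^a = - \kappa \perp ^{ab} \left( \nabla _b \theta ^* + \theta ^* \dot{u}_b \right) , \end{aligned}$$

which has the same form as (15.10) (with \(\theta ^*\) in the role of the temperature) and, hence, the same causality problem.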

The heat problem (obviously) has two dynamical degrees of freedom, leading to the presence of a second sound in solids, an effect that has been observed in laboratory experiments on dielectric crystals (Ruggeri et al. 1996). So far, we have focussed on the heat. In addition, we have a momentum equation for the matter component. From (15.37) it follows that this equation can be written

$$\begin{aligned} \mu \dot{u}_a+\perp ^b_a\nabla _b \mu +\alpha \dot{q}_a +\dot{\alpha } q_a +\alpha q^b \nabla _a u_b = {1 \over n} f^\mathrm {n}_a . \end{aligned}$$
(15.54)

Here we have represented the matter momentum by

$$\begin{aligned} \mu _a = \mu u_a + \alpha q_a , \end{aligned}$$
(15.55)

where \(\mu \) is the chemical potential (in the matter frame) and

$$\begin{aligned} \alpha = {\mathcal {A}_{\mathrm{ns}}\over \theta ^*} . \end{aligned}$$
(15.56)

That is, we have

$$\begin{aligned} \alpha = {1 - \beta s^* \over n} . \end{aligned}$$
(15.57)

Given these definitions, we have

$$\begin{aligned} -f^\mathrm {n}_a = f^\mathrm {s}_a = - {1 \over \tilde{\kappa }}\left( s^* - {\beta q^2 \over \theta _*^2} \right) q_a . \end{aligned}$$
(15.58)

It is useful to note that this implies that the force has a term that is linear in \(q^a\). We will explore this fact in the following.


The two-fluid results can be directly compared to the “phonon hydrodynamics” model developed by Guyer and Krumhansl (1966) (see Llebot et al. 1983; Cimmelli 2007 for alternatives). This may be the most celebrated attempt to account for non-local heat conduction effects, accounting for the interaction of phonons with each other and the conducting lattice. The usefulness of this result is due to the fact that it can be used both in the collision dominated and the ballistic phonon regime. In the former, the resistivity dominates, the nonlocal terms can be neglected and heat propagates as waves. In the opposite regime, the momentum conserving interactions are dominant and we can neglect the thermal relaxation. In this regime, heat propagates by diffusion. The transition between these two extremes has recently been discussed by Vázquez and Márkus (2009).

Interestingly, the non-local heat conduction model may also be useful for nano-size systems. If a system has characteristic size smaller than the relevant mean-free path then one would not necessarily expect a fluid model to apply. Nevertheless, Alvarez et al. (2009) have argued that the expected behaviour of the thermal conductivity as the size of the system decreases (as discussed by Alvarez and Jou 2007) can be reproduced provided that an appropriate slip condition for \(q^a\) is applied at the boundaries. This is an interesting problem that deserves further study.

Finally, it is worth commenting on dissenting perspectives. The main issue appears to stem from the presence of the term involving the four-acceleration on the right-hand side of (15.51). We have already seen that this term encodes the impact of the gravitational redshift on the temperature, which obviously has no counterpart in the Newtonian problem. Dynamically, the effect results from the fact that the infinitesimal 3-spaces orthogonal to the matter world lines are not parallel, but “tipped over” because of the curvature of the world line. This leads to the interpretation of the four-acceleration contribution in terms of the effective inertia of heat (Ehlers 1973). This seems quite intuitive, but it has nevertheless been suggested (García-Colín and Sandoval-Villalbazo 2006; García-Perciante et al. 2009b; Tsumura and Kunihiro 2008; Sandoval-Villalbazo et al. 2009) that this term causes instabilities and that it should not be included. As this seems somewhat inconsistent, we will not analyse this suggestion in detail.

15.4 A linear model and the second sound

The variational model contains terms that enter as second order deviations from thermal equilibrium, e.g., pieces that are second order in the heat flux, \(q^a\). In fact, it is clear that key effects (like the entropy entrainment) arise from the presence of such terms in the Lagrangian. Having said that, once we have written down the general model, we can opt to truncate the results at first order. Crucially, this does not take us back to the original first-order model. The thermal relaxation remains, reflecting the simple fact that you need to know the energy of a system to quadratic order in order to develop the complete linear equations of motion. Noting this, it is interesting to consider the features of this new first-order model. After all, this, much simpler, description may be adequate in many relevant situations.

We want to restrict our analysis to first order deviations from equilibrium. Thermal equilibrium corresponds to \(q^a=0\), no heat flux, and \(\dot{u}^a=0\), no matter acceleration (in essence, we are analyzing the problem at the local level, ignoring gravity). Moreover, in the simplest cases there should be no shear, divergence or vorticity associated with the flow, i.e., we have \(\nabla _a u^a = 0\) and \(\nabla _b u^a=0\) as well. Treating all these quantities as first order, and noting that

$$\begin{aligned} u_b \dot{q}^b = - q^b \dot{u}_b , \end{aligned}$$
(15.59)

also contributes at second order, we arrive at two momentum equations; from (15.54) we have

$$\begin{aligned} \mu \dot{u}_a + \perp ^b_{\ a} \nabla _b \mu + \alpha \dot{q}_a + \left( \dot{\alpha } - {s \over n\tilde{\kappa }} \right) q_a = 0 , \end{aligned}$$
(15.60)

while (15.51) leads to

$$\begin{aligned} \tau \dot{q}_a + q_a + \tilde{\kappa } \left( \perp ^b_{\ a} \nabla _b T + T \dot{u}_a \right) = 0 . \end{aligned}$$
(15.61)

We also have the two conservation laws

$$\begin{aligned} \nabla _a n^a= & {} 0 , \end{aligned}$$
(15.62)
$$\begin{aligned} \nabla _a s^a= & {} 0 , \end{aligned}$$
(15.63)

noting that \(\varGamma _\mathrm {s}\) is second order (by construction). In these equations we have used the fact that \(s^*\) and \(\theta ^*\) differ from the equilibrium values s and T only at second order. To first order, the pressure p is obtained from the standard equilibrium Gibbs relation

$$\begin{aligned} \nabla _a p = n \nabla _a \mu + s \nabla _a T . \end{aligned}$$
(15.64)

Finally, we have the fundamental relation

$$\begin{aligned} \varepsilon + p = \mu n + s T . \end{aligned}$$
(15.65)

By comparing (15.60) and (15.61) to the Eckart frame results it becomes apparent to what extent the first-order model relies on its higher order origins. Specifically, \(\alpha \) and (therefore) \(\tau \) depend on \(\mathcal {A}_{\mathrm{ns}}\) and the entropy entrainment, c.f., (15.56). These effects rely on quadratic terms in the Lagrangian, and hence would not be present in a model that includes only first order terms from the start.

In order to analyze the dynamics of the heat problem, we consider perturbations (represented by \(\delta \)) away from a uniform equilibrium state. First of all, recall that we have \(q_a=\dot{u}_a=0\) for a system in equilibrium. We can also ignore \(\dot{\alpha }\) and \(\dot{\beta }\), since the equilibrium configuration is uniform, which means that we can replace \(\tilde{\kappa }\) by \(\kappa \). This means that we are left with two equations;

$$\begin{aligned} \mu \delta \dot{u}_a + \perp ^b_{\ a} \nabla _b \delta \mu + \alpha \delta \dot{q}_a-{s \over n\kappa } \delta q_a= 0 , \end{aligned}$$
(15.66)

and

$$\begin{aligned} \tau \delta \dot{q}_a + \delta q_a + \kappa \perp ^b_{\ a} \nabla _b \delta T + \kappa T \delta \dot{u}_a = 0 , \end{aligned}$$
(15.67)

We can combine these to get

$$\begin{aligned} \left( p+\varepsilon \right) \delta \dot{u}_a + \perp _a^b \nabla _b \delta p + \delta \dot{q}_a = 0 . \end{aligned}$$
(15.68)
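To see how this comes about, multiply (15.66) by n and (15.67) by \(s/\kappa \) and add. The \(\delta q_a\) terms cancel, leaving

$$\begin{aligned} \left( n \mu + s T \right) \delta \dot{u}_a + \perp ^b_{\ a} \left( n \nabla _b \delta \mu + s \nabla _b \delta T \right) + \left( n \alpha + {s \tau \over \kappa } \right) \delta \dot{q}_a = 0 , \end{aligned}$$

which reduces to (15.68) since \(n\mu + sT = p + \varepsilon \) [by (15.65)], the gradient terms combine into \(\perp ^b_{\ a} \nabla _b \delta p\) via (15.64), and \(n\alpha + s\tau /\kappa \approx 1\) to the order we are working [cf. (15.57) and (15.53) with \(\dot{\beta }\) ignored].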

The last two equations [(15.67) and (15.68)] are, not surprisingly, identical to the first-order reduction of the Israel-Stewart model (see Sect. 16), so the problem is relatively well explored. In particular, the conditions required for stability and causality were derived by Hiscock and Lindblom (1983, 1987), see also Olson and Hiscock (1990).

Working in the frame associated with the background flow, we note that (15.66) and (15.67) only have spatial components. That is, we may erect a local Cartesian coordinate system associated with the matter frame and simply replace \(a\rightarrow i\) where \(i=1, 2, 3\). Then taking the curl (\(\epsilon ^{jki} \nabla _k\)) of the equations in the usual way, we arrive at

$$\begin{aligned} m_\star \dot{U}^i - {1 \over \tau } Q^i = 0 , \end{aligned}$$
(15.69)

and

$$\begin{aligned} m_\star \tau \dot{Q}^i +(p+\varepsilon ) Q^i = 0 , \end{aligned}$$
(15.70)

where we have defined

$$\begin{aligned} U^i = \epsilon ^{ijk} \nabla _j \delta u_k , \qquad \text{ and } \qquad Q^i = \epsilon ^{ijk} \nabla _j \delta q_k , \end{aligned}$$
(15.71)

and

$$\begin{aligned} m_\star = n \left( \mu - { \alpha \kappa T \over \tau } \right) = p+ \varepsilon - {\kappa T \over \tau } . \end{aligned}$$
(15.72)

Assuming that the perturbations depend on time as \(e^{i\omega t}\), where t is the time-coordinate associated with the matter frame, we arrive at the dispersion relation for transverse perturbations;

$$\begin{aligned} i\omega \left[ (p+\varepsilon )( 1 + i\omega \tau ) - i \omega \kappa T \right] = 0 . \end{aligned}$$
(15.73)

Obviously \(\omega = 0\) is a solution. The second root is

$$\begin{aligned} \omega = {i(p+\varepsilon ) \over m_\star \tau } . \end{aligned}$$
(15.74)

This result shows that the thermal relaxation time \(\tau \) is essential in order for the system to be stable. We need \(m_\star >0\), i.e., the relaxation time must be such that

$$\begin{aligned} \tau > {\kappa T \over p+\varepsilon } . \end{aligned}$$
(15.75)

The analysis demonstrates why the Eckart model (for which \(\tau =0\)) is inherently unstable. Moreover, the constraint on the relaxation time agrees with one of the conditions obtained by Olson and Hiscock (1990) (cf. their Eq. (41)), representing the inviscid limit of the exhaustive analysis of the Israel–Stewart model of Hiscock and Lindblom (1983). We also note that the condition given in Eq. (43) of Olson and Hiscock (1990) simply leads to the weaker requirement \(\tau \ge 0\).

The problem of transverse oscillations is fairly simple since there are no corresponding restoring forces in a pure fluid problem (these require rotation, elasticity, the presence of a magnetic field etcetera). The physical origin of the instability becomes clear once we note that \(m_*\) plays the role of an “effective” inertial mass (density). The importance of this quantity has been discussed in work by Herrera et al. (1997, 2002) and Herrera and Santos (1997), especially in the context of gravitational collapse. Basically, the instability of the Eckart formulation is due to the inertial mass of the fluid becoming negative. Once this happens the pressure gradient no longer provides a restoring force; rather, it tends to push the system further away from equilibrium. This is a run-away process, associated with exponential growth of perturbations. Ultimately, the instability is due to the inertia of heat; an unavoidable consequence of the equivalence principle (heat carries energy, which means that it can be associated with an effective mass; Tolman 1987). The condition (15.75) may seem rather extreme (Hiscock and Lindblom 1987 quote a timescale of \(10^{-35}\) s for water at 300 K), but it sets a sharp lower limit for the thermal relaxation in physical systems. A system with faster thermal relaxation cannot settle down to equilibrium. However, it may still be reasonable to ask if a system may evolve in such a way that it enters the unstable regime (in the way discussed by Herrera et al. 1997; Herrera and Santos 1997).

When we turn to the longitudinal case the situation changes. In a perfect fluid longitudinal perturbations propagate as sound waves, and when we add complexity to the model the dispersion relation soon gets complicated. The problem has been discussed in detail by Lopez-Monsalvo and Andersson (2011), so we will move straight to the results. The dispersion relation for the phase velocity, \(\sigma = \omega /k\), is

$$\begin{aligned}&m_\star \tau \sigma ^4 - {i(p+\varepsilon ) \over k} \sigma (\sigma ^2 - C_s^2) - \left[ (p+\varepsilon )\left( {\kappa \over n c_v} + C_s^2 \tau \right) -2\kappa T \alpha _s\right] \sigma ^2 \nonumber \\&\quad + \, \kappa \left[ {p+\varepsilon \over n} {C_s^2 \over c_{v}} - T\alpha _s^2\right] = 0 , \end{aligned}$$
(15.76)

where we have introduced (i) the sound speed

$$\begin{aligned} C_s^2 = \left( {\partial p \over \partial \varepsilon } \right) _{\bar{s}} = {n\over p+\varepsilon } \left( {\partial p \over \partial n } \right) _{\bar{s}} , \end{aligned}$$
(15.77)

(ii) the specific heat at fixed volume

$$\begin{aligned} c_v = {C_v\over n} = T \left( {\partial \bar{s} \over \partial T} \right) _{n} = {1\over n} \left( {\partial \varepsilon \over \partial T} \right) _{n} , \end{aligned}$$
(15.78)

and (iii)

$$\begin{aligned} \alpha _s = {n\over T} \left( {\partial T \over \partial n} \right) _{\bar{s}} = {T\over n} \left( {\partial p \over \partial \bar{s}} \right) _{n} = T \left( {\partial p \over \partial s} \right) _{n} . \end{aligned}$$
(15.79)

For future reference, it is also useful to note the identity [cf. Eq. (96) in Hiscock and Lindblom (1983)]

$$\begin{aligned} {1\over c_v} - {1\over c_p} = {n^3 \over T (p+\varepsilon ) C_s^2 } \left( {\partial T \over \partial n} \right) _{\bar{s}}^2 = {nT \over (p+\varepsilon ) C_s^2} \alpha _s^2 , \end{aligned}$$
(15.80)

where \(c_p\) is the specific heat at fixed pressure.

The dispersion relation (15.76) is too complicated for us to be able to make definite statements about the solutions, but we can simplify the analysis by considering the long- and short-wavelength limits. The results we obtain in these limits illustrate the key features. At the same time, we should keep in mind that both cases are somewhat “artificial”. First of all, fluid dynamics is, fundamentally, an effective long-wavelength theory in the sense that it arises from an averaging over a large number of individual particles (constituting each fluid element). In effect, the model only applies to phenomena on scales much larger than (say) the interparticle distance. However, the infinite wavelength limit represents a uniform system, which is artificial since real physical systems tend to be finite. Moreover, as we will not account explicitly for gravity we can only consider scales on which spacetime can be considered flat. While the plane-wave analysis holds on arbitrary scales in special relativity, a curved spacetime introduces a cut-off lengthscale beyond which the analysis is not valid (roughly, the size of a local inertial frame).

Let us first consider the long wavelength, \(k\rightarrow 0\), problem. This represents the true hydrodynamic limit, and it is easy to see that there are two sound-wave solutions and two modes that are predominantly diffusive. The sound-wave solutions take the form

$$\begin{aligned} \sigma \approx \pm C_s \left[ 1 \pm i {\kappa T \over 2(p+\varepsilon ) C_s^3} (C_s^2 - \alpha _s)^2 k \right] . \end{aligned}$$
(15.81)

These solutions are clearly stable, since Im \(\sigma >0\). Using the Maxwell relations listed by Hiscock and Lindblom (1983), we can show that this result agrees with Eq. (40) from Hiscock and Lindblom (1987). Moreover, our result simplifies to [using (15.80)]

$$\begin{aligned} \mathrm {Im}~\sigma \approx {\kappa k \over 2n} \left( {1 \over c_v} - {1 \over c_p} \right) , \end{aligned}$$
(15.82)

in the limit where \(|\alpha _s|\gg C_s^2\), which is relevant since \(C_s^2 \sim p/\rho \) becomes small in the non-relativistic limit. Indeed, we find that (15.82) agrees with the standard result for sound absorption in a heat-conducting medium (Mountain 1966).

In addition to the sound waves, we have a slowly damped solution

$$\begin{aligned} \sigma \approx i\kappa k \left[ {1 \over n c_v} - {T \alpha _s^2 \over (p+\varepsilon ) C_s^2 } \right] = {i\kappa k \over n c_p} . \end{aligned}$$
(15.83)

This is the classic result for thermal diffusion. Finally, the system has a fast decaying solution;

$$\begin{aligned} \sigma \approx {i (p+\varepsilon ) \over m_\star k \tau } . \end{aligned}$$
(15.84)

Under most circumstances, this root decays too fast to be observable, so the model reproduces the standard “Rayleigh–Brillouin spectrum” with two sound peaks symmetrically placed with respect to the broad diffusion peak at zero frequency (Mountain 1966; Garcia-Perciante et al. 2009a).

The short wavelength limit probes different aspects of the problem. Letting \(k\rightarrow \infty \) we see that (15.76) reduces to a quadratic for \(\sigma ^2\). We have

$$\begin{aligned} A\sigma ^4 - B \sigma ^2 + C = 0 , \end{aligned}$$
(15.85)

with

$$\begin{aligned} A = m_\star \tau > 0 , \end{aligned}$$
(15.86)

(as required for stability)

$$\begin{aligned} B= (p+\varepsilon ) \left( {\kappa \over nc_v} + C_s^2 \tau \right) - 2\kappa T\alpha _s , \end{aligned}$$
(15.87)

and

$$\begin{aligned} C= \kappa \left( {p+\varepsilon \over n} {C_s^2 \over c_v}- T \alpha _s^2 \right) = \kappa {p+\varepsilon \over n} {C_s^2 \over c_p} > 0 . \end{aligned}$$
(15.88)

This allows us to write down the solutions in closed form and it is relatively straightforward to establish the conditions required for the stability of the system in this limit. The analysis is a bit messy but at the same time instructive as it demonstrates how the physics impacts on the mathematics. Moreover, the discussion allows us to make direct contact with many previous efforts to understand the problem.

In essence, we arrive at two conditions. First of all, \(\sigma ^2\) is real as long as \(B^2-4AC>0\), which leads to

$$\begin{aligned}&\left( C_s^2 \tau - {\kappa \over n c_v} -{2\kappa T \alpha _s \over p+\varepsilon } \right) ^2 + {4\kappa T \alpha _s^2 \over p+\varepsilon } \left( \tau - {\kappa T \over p+\varepsilon } \right) \nonumber \\&\quad + {4\kappa ^2 T \over (p+\varepsilon ) n c_v} \left( C_s^2-2\alpha _s\right) > 0 . \end{aligned}$$
(15.89)

The first two terms are positive, as long as (15.75) is satisfied. Hence, the condition is guaranteed to be satisfied as long as \(C_s^2> 2\alpha _s\). In situations where this condition is not satisfied, (15.89) provides a (complicated) constraint on the relaxation time. We must also have \(B> 0\), which leads to

$$\begin{aligned} \tau > {\kappa \over C_s^2} \left[ {2T \over p+\varepsilon } \alpha _s - { 1 \over n c_v} \right] . \end{aligned}$$
(15.90)

This condition is identical to that given in Eq. (146) of Hiscock and Lindblom (1983) (obtained in the limit where \(\alpha _i\rightarrow 0\) and \(1/\beta _0\) and \(1/\beta _2\) both also vanish, cf. Herrera 2006; Maartens 1996).

Let us move on to finite wavelengths. Letting \(\sigma = \sigma _\pm + \sigma _1/k\), where \(\sigma _\pm \) solve (15.85), and linearising in 1/k, we find that

$$\begin{aligned} \sigma _1 = {i(p+\varepsilon ) \over 2} \left( {\sigma _\pm ^2 - C_s^2 \over 2A \sigma _\pm ^2 - B}\right) . \end{aligned}$$
(15.91)

Since all quantities in this expression are already constrained to be real, we need \(\mathrm {Im}\ \sigma _1 \ge 0\) (for real k) in order for the system to be stable. From (15.85) we then have that

$$\begin{aligned} 2A \sigma _\pm ^2 - B = \pm \left| B^2 - 4AC \right| ^{1/2} , \end{aligned}$$
(15.92)

which leads to the condition

$$\begin{aligned} \sigma _-^2 \le C_s^2 \le \sigma _+^2 . \end{aligned}$$
(15.93)

This is notably consistent with the notion that “mode-mergers” signal the onset of instability, see Sect. 7.4.

As the waves in the system must remain causal, we must also insist that \(\sigma ^2< 1\). To ensure that this is the case, we adapt the strategy used by Hiscock and Lindblom (1983). As (15.85) is a quadratic for \(\sigma ^2\) we can ensure that the roots are confined to the interval \(0<\sigma ^2 < 1\) (noting first of all that the roots are real since (15.89) is satisfied). Given that B and C are both positive, the roots must be such that \(\sigma ^2>0\). Meanwhile, we can constrain the roots to \(\sigma ^2 <1\) by insisting that

$$\begin{aligned} A-B+C > 0 , \end{aligned}$$
(15.94)

and

$$\begin{aligned} 2A-B > 0 . \end{aligned}$$
(15.95)

Combining these inequalities with the positive discriminant, we can show that \(A> B/2> C\). The first of the two conditions can be written

$$\begin{aligned} (1-C_s^2) \left[ \tau - {\kappa \over n c_v} \right]> {\kappa T (1-\alpha _s)^2 \over p+\varepsilon }> 0 . \end{aligned}$$
(15.96)

Next, when combined with causality the condition (15.93) requires that \(C_s^2 \le \sigma _+^2 < 1\). In other words, we must have \(C_s^2< 1\), which means that (15.96) implies that

$$\begin{aligned} \tau > {\kappa \over n c_v} . \end{aligned}$$
(15.97)

Comparing to the results of Hiscock and Lindblom (1983), we recognize (15.96) as their \(\varOmega _3>0\) condition (it is also Eq. (4) of Herrera and Santos 1997), while (15.97) corresponds to \(\varOmega _6>0\).

Meanwhile, the condition (15.95) can be written

$$\begin{aligned} (2-C_s^2) \tau > {\kappa \over n c_v} + {2\kappa T \over p+\varepsilon } (1-\alpha _s) , \end{aligned}$$
(15.98)

corresponding to Eq. (148) of Hiscock and Lindblom (1983). Finally, \(A>C\) leads to

$$\begin{aligned} \tau > {\kappa T \over p+\varepsilon } + {\kappa C_s^2 \over n c_p} . \end{aligned}$$
(15.99)

This corresponds to Eq. (3) in Herrera and Santos (1997), which derives from Eq. (147) of Hiscock and Lindblom (1983). This completes the analysis of the stability and causality of the system. We have arrived at a set of conditions on the thermal relaxation time (and related them to the relevant literature). As long as these conditions are satisfied, the solutions to the problem should be well behaved.
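To make these constraints easy to apply in practice, the following minimal Python sketch collects the relaxation-time conditions (15.90) and (15.96)–(15.99) into a single check. The function name, the argument names and the sample values are ours, chosen purely for illustration; all quantities are in units with \(c=1\).

```python
def relaxation_time_ok(tau, kappa, T, n, c_v, c_p, Cs2, alpha_s, p_plus_eps):
    """Check the relaxation-time constraints collected in the text.

    tau        : thermal relaxation time
    kappa      : heat conductivity
    T          : temperature
    n          : particle number density
    c_v, c_p   : specific heats
    Cs2        : adiabatic sound speed squared (units with c = 1)
    alpha_s    : the coefficient defined in (15.80)
    p_plus_eps : p + epsilon
    """
    return {
        "(15.90)": tau > (kappa / Cs2) * (2 * T * alpha_s / p_plus_eps - 1.0 / (n * c_v)),
        "(15.96)": (1 - Cs2) * (tau - kappa / (n * c_v)) > kappa * T * (1 - alpha_s) ** 2 / p_plus_eps,
        "(15.97)": tau > kappa / (n * c_v),
        "(15.98)": (2 - Cs2) * tau > kappa / (n * c_v) + 2 * kappa * T * (1 - alpha_s) / p_plus_eps,
        "(15.99)": tau > kappa * T / p_plus_eps + kappa * Cs2 / (n * c_p),
    }

# Illustrative (made-up) parameter values:
print(relaxation_time_ok(tau=1.0, kappa=1e-2, T=0.1, n=1.0, c_v=1.0, c_p=1.2,
                         Cs2=0.01, alpha_s=0.0, p_plus_eps=1.0))
```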

To complete the analysis, let us briefly consider the nature of the solutions. Since the phase velocity \(\sigma \) is obtained from a quartic, we know that the problem has two (wave) degrees of freedom. This accords with the experience from superfluid systems and experimental evidence for heat propagating as waves in low temperature solids. One of the solutions should be associated with the usual “acoustic” sound while the second degree of freedom will lead to a “second sound” for heat. It is instructive to demonstrate how these features emerge within our model.

In order to explore the issue, it is natural to consider the large relaxation time limit. Taking the relaxation time \(\tau \) to be long, the solutions to (15.85) take the form (up to, and including, order \(1/\tau \) terms)

$$\begin{aligned} \sigma _+^2 \approx C_s^2\left[ 1 + {\kappa T \over (p+\varepsilon ) \tau } \left( 1 + {\alpha _s^2 \over C_s^4} \right) \right] , \end{aligned}$$
(15.100)

which could be rewritten using (15.80), and

$$\begin{aligned} \sigma _-^2 \approx {\kappa \over n \tau c_p} . \end{aligned}$$
(15.101)

The first of these solutions clearly represents the usual sound, while the other solution provides the second sound. In the latter case, the deduced speed is exactly what one would expect (Jou et al. 1993). It is easy to see that the first root will satisfy (15.93), so the associated roots will be stable in the long relaxation time limit. Moreover, the second solution leads to stable roots as long as

$$\begin{aligned} \tau \ge { \kappa \over n c_p C_s^2} . \end{aligned}$$
(15.102)

Basically, the finite wavelength condition implies that the second sound must propagate slower than the first sound. This is, indeed, what is measured in physical systems (like superfluid helium). Moreover, it is easy to see that this condition must be satisfied in order for the long relaxation time approximation to be valid. The general behaviour is illustrated in Fig. 16, which relates to degenerate matter. We see that the ordinary sound exists at all wavelengths. Meanwhile, at long wavelengths (small k) the remaining two roots are exponentially damped, i.e., diffusive in character. One root has a relatively slow decay, corresponding to the expected thermal diffusion, while the other root decays so rapidly that it is unlikely to be observable by experiment. Below a critical lengthscale (corresponding to \(k=10\) in Fig. 16) the second sound emerges as a result of the finite thermal relaxation time \(\tau \). For very short lengthscales, heat signals will propagate as waves. However, as is evident, these solutions are always damped. In order to “propagate”, the real part of the wave frequency must exceed the imaginary part (so that several cycles are executed before the motion is damped out). This conclusion is interesting if we consider systems that become superfluid. Suppose we have a system which starts out in the diffusive regime (e.g., helium above the superfluid transition temperature). When the system is cooled down through the relevant transition temperature, (non-momentum conserving) particle collisions are suppressed. In effect, the critical value of k decreases and the system may enter the regime where the second sound can propagate on macroscopic scales. The second sound emerges in a natural way.
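As a quick numerical illustration of these statements (with arbitrary parameter values, again in units with \(c=1\)), the snippet below evaluates the second-sound speed from (15.101) and checks the condition (15.102), which is equivalent to requiring that the second sound propagates slower than the ordinary sound.

```python
from math import sqrt

# Illustrative parameters (units with c = 1); the values are arbitrary.
kappa, n, c_p, tau = 1.0e-2, 1.0, 1.0, 5.0
Cs2 = 0.01                                  # ordinary sound speed squared

sigma_minus2 = kappa / (n * tau * c_p)      # second-sound speed squared, (15.101)

print("first sound  :", sqrt(Cs2))
print("second sound :", sqrt(sigma_minus2))
# The stability condition (15.102), tau >= kappa/(n c_p Cs^2), is the same as
# requiring sigma_-^2 <= Cs^2, i.e. that the second sound is the slower of the two.
print("(15.102) satisfied:", tau >= kappa / (n * c_p * Cs2))
print("second sound slower:", sigma_minus2 <= Cs2)
```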

Fig. 16

Image reproduced with permission from Andersson and Lopez-Monsalvo (2011), copyright by IOP

An illustration of the qualitative nature of the behaviour of heat conducting degenerate matter, based on the first-order relativistic model. The parameters have been chosen in such a way that the speed of sound is 10% of the speed of light, while the second sound (at short wavelengths, large k) propagates at \(1/\sqrt{3}\) of this. The phase velocity of the waves is \(\sigma =\mathrm {Re}\ \omega /k\) (left panel). The thermal relaxation time \(\tau \) has been chosen such that the critical wavenumber at which the second sound emerges is \(k=10\). At lengthscales larger than this, the corresponding roots are diffusive (have purely imaginary frequency), and in the very long wavelength limit (\(k\rightarrow 0\)) we retain the expected thermal diffusion. The damping time follows from \(1/\mathrm {Im}\ \omega \) (right panel). We also indicate the noncausal region (grey area). The illustrated example is clearly both stable and causal.

16 Modelling dissipation

Although the inviscid model provides a natural starting point for any investigation of the dynamics of a fluid system, the effects of dissipation are often essential for the construction of a realistic model. Consider, for example, the case of neutron star oscillations and possible instabilities. While it is interesting from the conceptual point of view to establish that an instability (such as the gravitational-wave driven instability of the fundamental f-mode or the inertial r-mode discussed in Sect. 7.4) may be present in an ideal fluid, it is crucial to establish that the instability is able to grow on a reasonably short timescale. To establish this, one must consider the most important damping mechanisms and work out whether or not they will suppress the instability. A discussion of these issues in the context of the r-mode instability can be found in Andersson (2003) and chapter 15 of Andersson (2019).

As we have already seen for the particular case of heat flow, dissipation in a relativistic system raises difficult issues. According to the established consensus view, one must account for second-order deviations from thermal equilibrium in order to guarantee causality and stability. This is certainly the lesson from the celebrated work of Israel and Stewart (1979a, 1979b), see Denicol et al. (2010) and Betz et al. (2011, 2009) for more recent work on the problem. We have already introduced the main points in the context of heat conduction, taking a multi-fluid prescription based on the variational formulation as our starting point. This approach has the flexibility required to account for the physics that we need to consider. A particularly appealing feature of the variational approach is that, once an “equation of state” for matter is provided, the theory provides the relation between the various currents and their conjugate momenta. As we have seen, this leads to a model which has the key elements required for causality and stability, and clarifies the role of the inertia of heat (e.g., the effective mass associated with phonons). Moreover, as demonstrated by Priou (1991) some time ago, the variational model is formally equivalent to the Israel–Stewart construction. At the end of the day, the theoretical framework becomes rather intuitive and the physics involved seems natural.

Does this mean that no issues remain in this problem area? Not really. First of all, it is clear that the need to introduce additional parameters (e.g., the relevant relaxation times) and keep track of higher order terms (fluxes of fluxes and so on) make applications complex. Secondly, we are not much closer to considering systems that deviate significantly from equilibrium, for which there is no natural “small” parameter to expand in. The variational model sheds some light on this regime by clarifying the role of the temperature in systems out of equilibrium, but there is some way to go before we understand issues associated with, for example, any “principle of extremal entropy production” and instabilities that lead to structure formation. Finally, despite the successes of the extended thermodynamics framework (Jou et al. 1993), there is no universal agreement concerning the validity (and usefulness) of the results. To some extent this is natural given the interdisciplinary nature of the problem. To make progress we need to account for both thermodynamical principles and fundamental General Relativity. This leads to questions concerning, in particular, the meaning of the variables involved in the different models (e.g., the entropy). The ultimate theory (if we imagine such a thing) should provide a clear link to statistical physics and even information theory. Our efforts are not yet at that level.

In the following we will summarize the current thinking by describing the main models from the literature. We first consider the classic work of Eckart (1940) and Landau and Lifshitz (1959), which follow as a seemingly natural extension of the inviscid equations. However, a detailed analysis by Hiscock and Lindblom (1985, 1987) has demonstrated that these descriptions have serious flaws and must be considered unsuitable for practical use. Still, it is relatively “easy” to extend them in the way proposed by Israel (1976), Stewart (1977) and Israel and Stewart (1979a, 1979b). Their description, the derivation of which was inspired by early work of Grad (1949) and Müller (1967) and which results from relativistic kinetic theory, provides a framework that is generally accepted as meeting the criteria for a relativistic model (Hiscock and Lindblom 1983). Next, we describe Carter’s more complete approach to the problem, which makes elegant use of the variational argument. The construction is also more general than that of, for example, Israel and Stewart. In particular, it shows how one would account for several dynamically independent interpenetrating fluid species. This extension is important for, for example, the consideration of relativistic superfluid systems. Finally, we connect with efforts motivated by string theory and consider recent progress on the development of an action principle for dissipative systems, an approach that makes explicit use of the relevant matter space quantities.

16.1 Eckart versus Landau–Lifshitz

As in the heat problem (see Sect. 15) we consider a single particle system, with a conserved matter flux \(n^a\). However, we now allow for the possibility that we are not working in the matter frame. That is, we introduce a vector \(\nu ^a\) representing particle diffusion

$$\begin{aligned} n^a = n u^a + \nu ^a , \end{aligned}$$
(16.1)

and assume that the diffusion satisfies the constraint \(u_a \nu ^a = 0\) (there is no particle production so \(\nabla _a n^a = 0\)). This simply means that it is purely spatial according to an observer moving with the particles in the inviscid limit, exactly what one would expect from a diffusive process. Next we introduce the heat flux \(q^a\) (as before) and the viscous stress tensor, decomposed into a trace-part \(\tau \) (not to be confused with the proper time) and a trace-free piece \(\tau ^{a b}\), such that

$$\begin{aligned} T^{a b} = (p + \tau ) \perp ^{a b} + \varepsilon u^a u^b + 2 q^{(a} u^{b)} + \tau ^{a b} , \end{aligned}$$
(16.2)

subject to the constraints

$$\begin{aligned} u^a q_a = \tau ^a{}_a= & {} 0 , \end{aligned}$$
(16.3)
$$\begin{aligned} u^b \tau _{b a}= & {} 0 , \end{aligned}$$
(16.4)
$$\begin{aligned} \tau _{a b} - \tau _{b a}= & {} 0 . \end{aligned}$$
(16.5)

That is, both the heat flux and the trace-free part of the viscous stress tensor are spatial in the matter frame, and \(\tau ^{a b}\) is symmetric. So far, the description is quite general (cf. the general decomposition of the stress-energy tensor discussed in Sect. 5). The constraints have simply been imposed to ensure that the problem has the anticipated number of degrees of freedom.

The next step is to deduce the form for the additional fields from the second law of thermodynamics. Assuming that the entropy flux is a combination of all the available vectors, we have

$$\begin{aligned} s^a = s u^a + \beta q^a - \lambda \nu ^a , \end{aligned}$$
(16.6)

where \(\beta \) and \(\lambda \) are yet to be specified (although we know already what \(\beta \) will end up being from our previous discussion). It is easy to work out the divergence of \(s^a\). Then using the component of Eq. (6.35) along \(u^a\), and the usual (equilibrium) thermodynamic relation for an equation of state \(\varepsilon (n,s)\) (as in Sect. 2), we find that

$$\begin{aligned} \nabla _a s^a= & {} q^a \left( \nabla _a \beta - \frac{1}{T} u^b \nabla _b u_a \right) + \left( \beta - \frac{1}{T} \right) \nabla _a q^a \nonumber \\&- \left( x_\mathrm {s}+ \lambda - \frac{p + \varepsilon }{n T} \right) \nabla _a \nu ^a - \nu ^a \nabla _a \lambda - \frac{\tau }{T} \nabla _a u^a - \frac{\tau ^{a b}}{T} \nabla _a u_b . \end{aligned}$$
(16.7)

We want to ensure that the right-hand side of this equation is positive definite (or indefinite). An easy way to achieve this is to make the following identifications:

$$\begin{aligned} \beta = 1/T, \end{aligned}$$
(16.8)

and

$$\begin{aligned} \lambda = {1\over nT} (p+\varepsilon - sT) = {\mu \over T} \end{aligned}$$
(16.9)

We also identify

$$\begin{aligned} \nu ^a = - \sigma T^2 \perp ^{a b} \nabla _b \lambda , \end{aligned}$$
(16.10)

where the “diffusion coefficient” \(\sigma \ge 0\), and the projection is needed in order for the constraint \(u_a \nu ^a = 0\) to be satisfied. Furthermore, we find that the heat flux is given by the same expression as before (with \(\beta = 1/T\)) and we can use

$$\begin{aligned} \tau = -\zeta \nabla _a u^a , \end{aligned}$$
(16.11)

where \(\zeta \ge 0\) is the coefficient of bulk viscosity. To complete the description, we need to rewrite the final term in Eq. (16.7). To do this it is useful to note that the gradient of the four-velocity can generally be written (recall the discussion from Sect. 5)

$$\begin{aligned} \nabla _a u_b = \sigma _{a b} + \frac{1}{3} \perp _{a b} \theta + \varpi _{a b} - a_b u_a , \end{aligned}$$
(16.12)

with the usual four-acceleration, \(a_b= u^a \nabla _a u_b\), the expansion \(\theta = \nabla _a u^a\), and the shear

$$\begin{aligned} \sigma _{a b} = \frac{1}{2} \left( \perp ^c_b \nabla _c u_a + \perp ^c_a \nabla _c u_b \right) - \frac{1}{3} \perp _{a b} \theta . \end{aligned}$$
(16.13)

Finally, the “twist” follows from

$$\begin{aligned} \varpi _{a b} = \frac{1}{2} \left( \perp ^c_b \nabla _c u_a - \perp ^c_a \nabla _c u_b \right) . \end{aligned}$$
(16.14)

Since we want \(\tau ^{a b}\) to be symmetric, trace-free, and purely spatial according to an observer moving along \(u^a\), it is useful to introduce the notation

$$\begin{aligned} \left\langle A_{a b} \right\rangle = \frac{1}{2} \perp ^c_a \perp ^d_b \left( A_{c d} + A_{d c} - \frac{2}{3} \perp _{c d} \perp ^{e f} A_{e f} \right) \end{aligned}$$
(16.15)

for any \(A_{a b}\). In the case of the gradient of the four-velocity, it is easy to show that this leads to

$$\begin{aligned} \left\langle \nabla _a u_b \right\rangle = \sigma _{a b} \end{aligned}$$
(16.16)

and therefore it is natural to use

$$\begin{aligned} \tau ^{a b} = - \eta \sigma ^{a b} , \end{aligned}$$
(16.17)

where \(\eta \ge 0\) is the shear viscosity coefficient. Given these relations, we have

$$\begin{aligned} T \, \nabla _a s^a = \frac{q^a q_a}{\kappa T} + \frac{\tau ^2}{\zeta } + \frac{\nu ^a \nu _a}{\sigma T^2 } + \frac{\tau ^{a b} \tau _{a b}}{2 \eta } \ge 0 . \end{aligned}$$
(16.18)

By construction, the second law of thermodynamics is satisfied.
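To see the projection (16.15) at work, the following sketch (entirely illustrative, using flat spacetime and an arbitrarily generated velocity gradient that respects the orthogonality implied by the normalisation of the four-velocity) computes the expansion and the shear from (16.13), and verifies numerically that \(\left\langle \nabla _a u_b \right\rangle = \sigma _{a b}\), as stated in (16.16).

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])          # flat metric, signature (-+++)
ginv = np.linalg.inv(g)

# A boosted observer four-velocity with u.u = -1
v = np.array([0.3, 0.1, -0.2])
W = 1.0 / np.sqrt(1.0 - v @ v)
u_up = np.concatenate(([W], W * v))          # u^a
u_dn = g @ u_up                              # u_a

perp = g + np.outer(u_dn, u_dn)              # \perp_{ab}
perp_mixed = ginv @ perp                     # \perp^a{}_b

# Sample velocity-gradient matrix K_{ab} ~ \nabla_a u_b: arbitrary, except that
# u^b K_{ab} = 0 (which follows from the normalisation of the four-velocity).
rng = np.random.default_rng(0)
K = rng.standard_normal((4, 4)) @ perp_mixed

theta = np.einsum('ab,ab->', ginv, K)        # expansion theta = \nabla_a u^a

# Shear from (16.13): project the derivative index, symmetrise, remove the trace
PK = perp_mixed.T @ K                        # \perp^c_a \nabla_c u_b
sigma = 0.5 * (PK + PK.T) - theta * perp / 3.0

# Angle-bracket projection (16.15) applied to an arbitrary tensor A_{ab}
def angle(A):
    PAP = perp_mixed.T @ A @ perp_mixed
    return 0.5 * (PAP + PAP.T) - np.einsum('ab,ab->', ginv, PAP) * perp / 3.0

print(np.allclose(angle(K), sigma))          # True: <\nabla_a u_b> = sigma_ab
```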

The model we have written down is quite general, especially since we did not yet specify the four-velocity \(u^a\). By doing this we can obtain both the formulation due to Eckart (1940) and that of Landau and Lifshitz (1959), see Sect. 5. To arrive at the Eckart description, we associate \(u^a\) with the flow of particles (as we did in the discussion of the heat problem). Thus we take \(\nu ^a = 0\) (or equivalently \(\sigma =0\)). This choice has the advantage of being easy to implement. The Landau and Lifshitz model follows if we instead choose the four-velocity to be a timelike eigenvector of the stress-energy tensor. From Eq. (16.2) it is easy to see that, by setting \(q^a = 0\), we get

$$\begin{aligned} u_b T^{b a} = - \varepsilon u^a . \end{aligned}$$
(16.19)

This is equivalent to setting \(\kappa = 0\). Unfortunately, these models, which have been used in many applications to date, are not that useful. While they pass the test set by the second law of thermodynamics, they fail other requirements of a relativistic description. In fact, a detailed analysis of perturbations away from an equilibrium state (Hiscock and Lindblom 1985) demonstrates serious pathologies. The dynamics of small perturbations tends to be dominated by rapidly growing instabilities. This suggests that these formulations may be practically useless. At the very least, they must be used with caution.

It has recently been argued that stability at linear order in a dissipative derivative expansion can be ensured by a judicious choice of frame (Kovtun 2019; Bemfica et al. 2019). The argument is based on a general expansion, followed by a stability analysis to demonstrate that there exist constraints on the expansion parameters such that these models meet the stability and causality requirements. Intuitively, this argument seems somewhat at odds with the covariant nature of Einstein’s theory—the stability of a system should not depend on the chosen observer. Gavassino et al. (2020) adds to the discussion by showing that the instability of the Landau–Lifshitz/Eckart models is due to a failure to ensure maximum entropy at equilibrium. Meanwhile, the frame-stabilized first-order models allow for violations of the second law. As neither of these represents the anticipated physics, the issue of stability at linear order remains open.

16.2 The Israel–Stewart approach

From the above discussion we learn that the most obvious strategy for extending relativistic hydrodynamics to include dissipation leads to unsatisfactory results. Let us now explain how this problem can be solved.

The original strategy was based on describing the entropy current \(s^a\) as a linear combination of the fluxes in the system, the four-velocity \(u^a\), the heat-flux \(q^a\) and the particle diffusion \(\nu ^a\). In a series of now classic papers, Israel (1976), Stewart (1977) and Israel and Stewart (1979a, 1979b) contrasted this “first-order” theory with relativistic kinetic theory. Following early work by Müller (1967) and connecting with Grad’s 14-moment kinetic theory description (Grad 1949), they concluded that a satisfactory model should be “second order” in the various fields. If we, for simplicity, work in the Eckart frame (cf. Hiscock and Lindblom 1983) this means that we would use

$$\begin{aligned} s^a= & {} s u^a + \frac{1}{T} q^a - \frac{1}{2 T} \left( \beta _0 \tau ^2 + \beta _1 q_b q^b + \beta _2 \tau _{b c} \tau ^{b c} \right) u^a \nonumber \\&+ \frac{\alpha _0 \tau q^a}{T} + \frac{\alpha _1 \tau ^a{}_b q^b}{T} . \end{aligned}$$
(16.20)

This expression is arrived at by asking what the most general form of a vector constructed from all the various fields in the problem may be. Of course, we now have a number of new (so far unknown) parameters. The three coefficients \(\beta _0\), \(\beta _1\), and \(\beta _2\) have a thermodynamical origin, while the two coefficients \(\alpha _0\) and \(\alpha _1\) represent the coupling between viscosity and heat flow. From the above expression, we see that in the frame moving with \(u^a\) the effective entropy density is given by

$$\begin{aligned} - u_a s^a = s - \frac{1}{2 T} \left( \beta _0 \tau ^2 + \beta _1 q_a q^a + \beta _2 \tau _{a b} \tau ^{a b} \right) . \end{aligned}$$
(16.21)

Since we want the entropy to be maximized in equilibrium, when the extra fields vanish, we must have \([\beta _0, \beta _1, \beta _2]\ge 0\). We also see that the entropy flux

$$\begin{aligned} \perp ^a_b s^b = \frac{1}{T} \left[ (1 + \alpha _0 \tau ) q^a + \alpha _1 \tau ^{a b} q_b \right] \end{aligned}$$
(16.22)

is affected only by the parameters \(\alpha _0\) and \(\alpha _1\).

Having made the assumption (16.20), the rest of the calculation proceeds as in Sect. 15. Working out the divergence of the entropy current, and making use of the equations of motion, we arrive at

$$\begin{aligned} \nabla _a s^a= & {} - \frac{1}{T} \tau \left[ \nabla _a u^a + \beta _0 u^a \nabla _a \tau - \alpha _0 \nabla _a q^a - \gamma _0 T q^a \nabla _a \left( \frac{\alpha _0}{T} \right) + \frac{\tau T}{2} \nabla _a \left( \frac{\beta _0 u^a}{T} \right) \right] \nonumber \\&- \frac{1}{T} q^a \left[ \frac{1}{T} \nabla _a T + u^b \nabla _b u_a + \beta _1 u^b \nabla _b q_a - \alpha _0 \nabla _a \tau - \alpha _1 \nabla _b \tau ^b{}_a \right. \nonumber \\&+ \left. \frac{T}{2} q_a \nabla _b \left( \frac{\beta _1 u^b}{T} \right) - (1 - \gamma _0) \tau T \nabla _a \left( \frac{\alpha _0}{T} \right) - (1 - \gamma _1) T \tau ^b{}_a \nabla _b \left( \frac{\alpha _1}{T} \right) \right] \nonumber \\&- \frac{1}{T} \tau ^{a b} \left[ \nabla _a u_b + \beta _2 u^c \nabla _c \tau _{a b} - \alpha _1 \nabla _a q_b + \frac{T}{2} \tau _{a b} \nabla _c \left( \frac{\beta _2 u^c}{T} \right) - \gamma _1 T q_a \nabla _b \left( \frac{\alpha _1}{T} \right) \right] .\nonumber \\ \end{aligned}$$
(16.23)

In this expression we have introduced (following Lindblom and Hiscock) two further parameters, \(\gamma _0\) and \(\gamma _1\). They are needed because, without additional assumptions, it is not clear how the “mixed” quadratic terms should be distributed. A natural way to fix these parameters is to appeal to the Onsager symmetry principle (Israel and Stewart 1979b), which leads to the mixed terms being distributed “equally”, so that \(\gamma _0 = \gamma _1 = 1/2\).

Denoting the comoving time derivative by a dot, i.e., using \(u^a \nabla _a \tau = \dot{\tau }\) (as before) we see that the second law of thermodynamics is satisfied if we choose

$$\begin{aligned}&\tau = - \zeta \Bigg [ \nabla _a u^a + \beta _0 \dot{\tau } - \alpha _0 \nabla _a q^a \nonumber \\&\qquad - \gamma _0 T q^a \nabla _a \left( \frac{\alpha _0}{T} \right) + \frac{\tau T}{2} \nabla _a \left( \frac{\beta _0 u^a}{T} \right) \Bigg ], \end{aligned}$$
(16.24)
$$\begin{aligned}&q^a= -\kappa T \perp ^{a b} \Bigg [ \frac{1}{T} \nabla _b T + \dot{u}_b + \beta _1 \dot{q}_b - \alpha _0 \nabla _b \tau - \alpha _1 \nabla _c \tau ^c{}_b + \frac{T}{2} q_b \nabla _c \left( \frac{\beta _1 u^c}{T} \right) \nonumber \\&\qquad - (1 - \gamma _0) \tau T \nabla _b \left( \frac{\alpha _0}{T} \right) - (1 - \gamma _1) T \tau ^c{}_b \nabla _c \left( \frac{\alpha _1}{T}\right) + \gamma _2 \nabla _{[b} u_{c]} q^c \Bigg ], \end{aligned}$$
(16.25)
$$\begin{aligned}&\tau _{a b} = - 2 \eta \Bigg [ \beta _2 \dot{\tau }_{a b} + \frac{T}{2} \tau _{a b} \nabla _c \left( \frac{\beta _2 u^c}{T} \right) \nonumber \\&\quad \qquad + \left\langle \nabla _a u_b - \alpha _1 \nabla _a q_b - \gamma _1 T q_a \nabla _b \left( \frac{\alpha _1}{T} \right) + \gamma _3 \nabla _{[a} u_{c]} \tau _b{}^c \right\rangle \Bigg ], \end{aligned}$$
(16.26)

where the angular brackets denote the symmetric, trace-free spatial projection introduced in (16.15). In these expressions we have added two further terms, representing the coupling to \(\nabla _{[a} u_{b]}\). These bring two further “free” parameters, \(\gamma _2\) and \(\gamma _3\). We are allowed to add these terms since they do not affect the entropy production. In fact, a large number of similar terms may, in principle, be considered (see note added in proof in Hiscock and Lindblom 1983). The presence of coupling terms of the particular form that we have introduced is suggested by kinetic theory (Israel and Stewart 1979b).

What is clear from these (very complicated) expressions is that we now have evolution equations for the dissipative fields. Introducing characteristic “relaxation” times

$$\begin{aligned} t_0 = \zeta \beta _0, \qquad t_1 = \kappa T \beta _1, \qquad t_2 = 2 \eta \beta _2, \end{aligned}$$
(16.27)

the above equations can be written

$$\begin{aligned} t_0 \dot{\tau } + \tau= & {} -\zeta [\dots ] , \end{aligned}$$
(16.28)
$$\begin{aligned} t_1 \perp ^a_b \dot{q}^b + q^a= & {} -\kappa T \perp ^a_b [\dots ] ,\end{aligned}$$
(16.29)
$$\begin{aligned} t_2 \dot{\tau }_{a b} + \tau _{a b}= & {} - 2\eta [\dots ] . \end{aligned}$$
(16.30)

A detailed stability analysis by Hiscock and Lindblom (1983) shows that the theory is causal for stable fluids. Then the characteristic velocities are subluminal and the equations form a hyperbolic system. An interesting aspect of the analysis concerns the stabilizing role of the extra parameters (\(\beta _0,\dots ,\alpha _0,\dots \)). Relevant discussions of the implications for the nuclear equation of state and the maximum mass of neutron stars have been provided by Olson and Hiscock (1989b) and Olson (2001). A more detailed mathematical stability analysis can be found in the work of Kreiss et al. (1997).
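The structure of (16.28)–(16.30), with a relaxation time multiplying the comoving time derivative of each flux, is what turns the parabolic first-order equations into a hyperbolic (telegraph-type) system. As a rough illustration, the following non-relativistic toy model (entirely ours, with arbitrary parameters) evolves Cattaneo-type heat conduction on a periodic grid: a finite relaxation time gives damped, wave-like pulses travelling at the finite signal speed \(\sqrt{\kappa /(c\, t_1)}\), while letting \(t_1\rightarrow 0\) would recover ordinary Fourier diffusion.

```python
import numpy as np

# Non-relativistic Cattaneo (telegraph) toy model for heat conduction:
#   c dT/dt = -dq/dx ,   t1 dq/dt + q = -kappa dT/dx .
# All numbers below are arbitrary and purely illustrative.
N, Lbox = 400, 1.0
dx = Lbox / N
x = (np.arange(N) + 0.5) * dx
c, kappa, t1 = 1.0, 0.2, 0.2
speed = np.sqrt(kappa / (c * t1))        # finite signal speed

T = np.exp(-((x - 0.5) / 0.02) ** 2)     # initial hot spot
q = np.zeros(N)

dt = 0.8 * dx / speed                    # CFL condition for the wave part
steps = 150                              # evolve to t ~ 0.3

def avg(f): return 0.5 * (np.roll(f, -1) + np.roll(f, 1))
def ddx(f): return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

for _ in range(steps):                   # simple Lax-Friedrichs update
    T_new = avg(T) - dt * ddx(q) / c
    q_new = avg(q) - dt * (kappa / t1) * ddx(T) - dt * q / t1
    T, q = T_new, q_new

# The initial pulse splits into two counter-propagating, damped pulses
# (second-sound-like behaviour); Fourier diffusion would instead keep a
# single, spreading central maximum.
print("pulse peaks near x =", x[np.argmax(T[:N // 2])],
      "and", x[N // 2 + np.argmax(T[N // 2:])])
print("expected near x =", 0.5 - speed * steps * dt,
      "and", 0.5 + speed * steps * dt)
```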

Although the Israel–Stewart model resolves the problems of the first-order descriptions for near equilibrium situations, issues remain to be understood for nonlinear problems. This is highlighted in work by Hiscock and Lindblom (1988) and Olson and Hiscock (1989a). They consider nonlinear heat conduction and show that the Israel–Stewart formulation becomes non-causal and unstable for sufficiently large deviations from equilibrium. The problem appears to be more severe in the Eckart frame (Hiscock and Lindblom 1988) than in the frame advocated by Olson and Hiscock (1989a). The fact that the formulation breaks down in a nonlinear setting is not too surprising. After all, the basic foundation is a “Taylor expansion” in the various fields. However, it raises important questions as there are obvious physical situations where a reliable nonlinear model may be crucial, e.g., heavy-ion collisions and supernova core collapse.

16.3 Application: heavy-ion collisions

Relativistic fluid dynamics has regularly been used as a tool to model heavy-ion collisions. The idea of using hydrodynamics to study the process of multiparticle production in high-energy hadron collisions can be traced back to work by, in particular, Landau in the early 1950s (see Belenkij and Landau 1955). In the early days these phenomena were observed in cosmic rays. The idea to use hydrodynamics was resurrected as collider data became available (Carruthers 1974) and early simulations were carried out at Los Alamos (Amsden et al. 1975, 1977). More recently, modelling has primarily been focussed on reproducing data from RHIC at Brookhaven and the LHC at CERN. Useful reviews of this active area of research can be found in Clare and Strottman (1986), Romatschke (2010a), Busza et al. (2018) and Romatschke and Romatschke (2019).

From the hydrodynamics perspective, a high-energy collision may be viewed in the following way: In the centre-of-mass frame two Lorentz-contracted nuclei collide—at the typical energy of a nucleus-nucleus collision at RHIC (order 100 GeV per nucleon), each incoming nucleus is contracted by a factor of about 100, making them thin colliding pancakes. After a complex microscopic process, a hot dense plasma is formed. In the simplest description this matter is assumed to be in local thermal equilibrium. The initial thermalization phase is out of reach for hydrodynamics. In the model, the state of matter is simply specified by the initial conditions, e.g., in terms of distributions of fluid velocities and thermodynamical quantities. Then follows a hydrodynamical expansion, which is described by the standard conservation equations for energy, momentum, baryon number, and other conserved quantities, such as strangeness, isotopic spin, etc. (see Elze et al. 1999 for a variational principle derivation of these equations). As the expansion proceeds, the fluid cools and becomes increasingly rarefied. This stage may require a kinetic theory description. This eventually leads to the decoupling of the constituent particles, which then do not interact until they reach the detector.

Fluid dynamics provides a well defined framework for studying the stages during which matter becomes highly excited and compressed and, later, expands and cools down. In the final stage—when the nuclear matter is so dilute that collisions are infrequent—hydrodynamics ceases to be valid. At this point additional assumptions are necessary to predict the number of particles, and their energies, which may be formed (to be compared to data obtained from the detector). These are often referred to as the “freeze-out” conditions. The problem is complicated by the fact that the “freeze-out” typically occurs at a different time for each fluid cell.

Even though the application of hydrodynamics in this area has led to useful results, the theoretical foundation for this description is not a trivial matter. Basically, the criteria required for the equations of hydrodynamics to be valid are:

  1. many degrees of freedom in the system,

  2. a short mean free path,

  3. a short mean stopping length,

  4. a sufficient reaction time for thermal equilibration, and

  5. a short de Broglie wavelength (so that quantum mechanics can be ignored).

An interesting aspect of the hydrodynamical description is that it makes use of concepts largely outside traditional nuclear physics, e.g., thermodynamics, statistical mechanics, fluid dynamics, and of course elementary particle physics. This is natural since the very hot, highly excited matter has a large number of degrees of freedom. But it is also a reflection of the basic lack of knowledge. As the key dynamics is uncertain, it is comforting to resort to familiar principles like the conservation of momentum and energy.

Another key reason why hydrodynamic models are favoured is the simplicity of the input. Apart from initial conditions that specify masses and velocities, one needs only an equation of state and an Ansatz for the thermal degrees of freedom. If one includes dissipation one must also specify the form and magnitude of the viscosity and heat conduction. The fundamental conservation laws are incorporated into the Euler equations. In return for this relatively modest amount of input, one obtains the differential cross sections of all the final particles, the composition of clusters, etc. Of course, before one can confront the experimental data, one must make additional assumptions about the freeze-out, chemistry, and so on. A clear disadvantage of the hydrodynamics model is that much of the microscopic dynamics is lost.

Let us discuss some specific aspects of the hydrodynamics that has been used in this area. As we will recognize, the issues that need to be addressed for heavy-ion collisions are very similar to those faced in studies of relativistic dissipation theory and multi-fluid modelling. The one key difference is that the problem only requires Special Relativity, so there is no need to worry about the spacetime geometry. Of course, it is still convenient to use a fully covariant description since one is then not tied down to the use of a particular set of coordinates.

In many studies of heavy ions a particular frame of reference is chosen. As we have already seen, this is an issue that must be approached with some care. In the context of heavy-ion collisions it is common to choose \(u^a\) as the velocity of either energy transport (the Landau–Lifshitz frame) or particle transport (the Eckart frame). We have encountered both choices before. It is recognized that the Eckart formulation is somewhat easier to use and that one can let \(u^a\) be the velocity of either nucleon or baryon number transport. On the other hand, there are cases where the Landau–Lifshitz picture has been viewed as more appropriate. For instance, when ultra-relativistic nuclei collide they virtually pass through one another, leaving the vacuum between them in a highly excited state and causing the creation of numerous particle-antiparticle pairs. Since the net baryon number in this region vanishes, the Eckart definition of the four-velocity cannot be easily employed. This discussion is a reminder of the situation for viscosity in relativity, and the resolution is likely the same. A truly frame-independent description will need to include several distinct fluid components.

Multi-fluid models have, in fact, been considered for heavy-ion collisions. One can, for example, treat the target and projectile nuclei as separate fluids to admit interpenetration, thus arriving at a two-fluid model. One could also use a relativistic multi-fluid model to allow for different species, e.g., nucleons, deltas, hyperons, pions, kaons, etc. Such a model could account for the varying dynamics of the different species, as well as their mutual diffusion and chemical reactions. The derivation of such a model would follow closely our discussion in Sect. 9. In the heavy-ion community, it has been common to confuse the issue somewhat by insisting on choosing a particular local rest frame at each space-time point. This is, of course, complicated since the different fluids move at different speeds relative to any given frame. For the purpose of studying heavy-ion collisions in baryon-rich regions of space, the standard option seems to be to define the “baryonic Lorentz frame”. This is the local Lorentz frame in which the motion of the center-of-baryon number (analogous to the center-of-mass) vanishes.

The main problem with the single-fluid hydrodynamics model is the requirement of thermal equilibrium. In the fluid equations of motion it is implicitly assumed that local thermal equilibrium is “imposed” via the equation of state. In effect, the relaxation timescale and the mean-free path must be much smaller than both the hydrodynamical timescale and the spatial size of the system. It seems reasonable to wonder if these conditions can be met for hadron/nuclear collisions. On the other hand, from the kinematical point of view (apart from the use of the equation of state), the equations of hydrodynamics are nothing but conservation laws of energy and momentum, together with other conserved quantities such as charge. In this sense, for any process where the dynamics of the flow is an important factor, a hydrodynamical framework is a natural first step. The effects of a finite relaxation time and mean-free path might be implemented later by using an effective equation of state, incorporating viscosity and heat conductivity, or some simplified transport equations. This does, of course, lead us back to the challenging problem of designing a causal relativistic theory for dissipation. A discussion of numerical efforts can be found in Romatschke (2010a). It is notable that very few calculations have been performed using a fully three-dimensional, relativistic theory with dissipation. Considering the obvious importance of entropy, this may seem surprising (although see Kapusta 1981 for an exception). An interesting comparison of different dissipative formulations is also provided in Muronga (2002, 2004).

16.4 The fluid-gravity correspondence

The continued effort to explore the complex marriage between gravity and quantum theory has also led to (perhaps unexpected) developments in the modelling and understanding of relativistic fluids. The context for these developments is the AdS/CFT correspondence (Maldacena 1998), relating the dynamics of a four-dimensional conformal field theory to (quantum) gravity in ten dimensions. The most commonly considered case—in essence the “harmonic oscillator” of the problem—relates to the duality between SU(N) \(\mathcal {N}=4\) Super Yang-Mills theory and Type IIB string theory on AdS\(_5\times \)S\(^5\). In general, these are both complicated theories, but the phenomenology simplifies in certain limits. The idea is attractive because it links a strongly coupled theory, for which perturbative calculations are not an option, to a weakly coupled system, for which one may be able to make progress. This is the reason why AdS-CFT is referred to as a duality—the two descriptions are valid in opposite regimes. However, this makes the duality difficult to check. In one regime we can calculate, but not in the other.

It is attractive to apply the idea to the state of matter explored in colliders—the quark-gluon plasma. At the energies reached in experiments, the plasma is far from a weakly coupled gas of quarks and gluons. The system is well inside the non-perturbative regime of QCD, where reliable tools are lacking. The AdS-CFT approach offers an avenue towards progress by reformulating the strongly coupled quantum systems as a dynamical problem in classical gravity. Perhaps the most important insight from this concerns the apparent universality of transport coefficients in gravity duals and the so-called entropy bound—the notion that for all thermal field theories (in the regime described by gravity duals) the ratio of shear viscosity to entropy density is bounded by (Son and Starinets 2007)

$$\begin{aligned} {\eta \over s} \ge {1\over 4\pi } . \end{aligned}$$
(16.31)

If correct, this implies that a fluid with a given volume density of entropy cannot be arbitrarily close to being a perfect fluid (which would have zero viscosity).
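For orientation, restoring physical units the bound (16.31) reads \(\eta /s \ge \hbar /(4\pi k_B)\); the short snippet below simply evaluates this number.

```python
hbar = 1.054571817e-34   # J s
k_B = 1.380649e-23       # J / K
bound = hbar / (4 * 3.141592653589793 * k_B)
print("eta/s >=", bound, "K s")   # roughly 6.1e-13 K s
```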

The AdS-CFT correspondence is holographic in the sense that the two dual theories live in a different number of dimensions. Effectively, the gauge theory lives “on the boundary” of AdS. The formalism provides a “dictionary” that translates dynamical gauge theory questions into the geometrical language associated with higher-dimensional General Relativity, providing intriguing links between the two—traditionally separate—areas of research. Moreover, one can show that long-wavelength solutions to the Einstein equations with a negative cosmological constant (AdS) are dual to solutions of the four-dimensional fluid equations with a conformal symmetry. This has led to what is known as the fluid-gravity correspondence (Rangamani 2009). The idea ties in with the fact that hydrodynamics may be viewed as an effective theory that governs the macroscopic behaviour of a system, on scales larger than some characteristic “averaging” scale (like the mean-free path).

In practice, the fluid-gravity correspondence links a fluid system to the near-horizon dynamics of a higher dimensional black hole. This connection follows from the AdS-CFT correspondence, but at the same time it is somewhat separate from it. In fact, the connection between black holes and fluids/thermodynamics is not new at all—it dates back to the 1970s. Early work by, in particular, Bekenstein (1973) and Hawking (1975), led to the appreciation that stationary black hole horizons have thermodynamic properties like temperature and entropy and the formulation of a generalized second law of thermodynamics that treats black-hole entropy on a par with the usual matter entropy (Bardeen et al. 1973). This was followed by studies of analogue models of black holes (Unruh 1981), illustrating that fluids can admit sonic horizons and even a version of the Hawking temperature. Finally, through the membrane paradigm (Damour 1978; Thorne et al. 1986) it was demonstrated that (for external observers) black holes behave like a fluid membrane, endowed with physical properties such as viscosity and electrical conductivity (see Gourgoulhon 2005 for a more recent discussion of this “horizon fluid”).

The fluid-gravity correspondence takes the discussion to a different level, beyond the identification of holographic duals for given equilibrium field theory configurations, to a discussion of dynamics and dissipation. As it is instructive to understand how this comes about, let us consider a relatively simple example (Hubeny 2011). Starting from an equilibrium black-hole solution we can generate a four-parameter family of solutions by scaling the radial coordinate r and introducing a boost associated with a four-velocity \(u^a\). Also introducing ingoing Eddington–Finkelstein type coordinates we ensure that the metric is regular on the horizon. This leads to the planar Schwarzschild-AdS\(_5\) black hole taking the form (Hubeny 2011)

$$\begin{aligned} ds^2 = - 2 u_a dx^a dr + r^2 \left( \eta _{ab} + {\pi ^4 T^4 \over r^4} u_a u_b\right) dx^a dx^b , \end{aligned}$$
(16.32)

notably expressed in terms of the temperature T and \(u^a\). The boundary stress tensor induced by this (bulk) metric is (in suitable units)

$$\begin{aligned} T^{ab} = \pi ^4 T^4 (\eta ^{ab} + 4 u^a u^b ) . \end{aligned}$$
(16.33)

Effectively, we have a perfect fluid with energy \(\varepsilon =3 \pi ^4 T^4\) and pressure \(p=\varepsilon /3\), moving with velocity \(u^a\) on the flat four-dimensional background, \(\eta _{ab}\). The stress tensor is traceless, as expected for a conformal fluid. Also, there is no dissipation in the system. This is natural since we still have an equilibrium solution. Let us now change this by perturbing the spacetime. This obviously leads to deviations from equilibrium, but we may exercise the right to move the perturbed aspects of the metric to the other side of the equation and “interpret” them as contributions to the stress-energy tensor. This leads to a time-dependent non-equilibrium fluid system, relaxing towards equilibrium as it evolves. The relaxation/thermalization can be understood through an expansion in “boundary derivatives”, leading to distinct dissipation channels (like shear viscosity). The relevant transport coefficients may be extracted in this linearized regime, and one finds that they can be associated with the quasinormal modes of the (planar AdS) black hole (Horowitz and Hubeny 2000; Son and Starinets 2007). This is conceptually interesting as it relates a problem in classical gravity to fluid behaviour.
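As a simple consistency check of the perfect-fluid reading of (16.33), the sketch below (with arbitrary values for the temperature and the boost velocity) contracts the boundary stress tensor with a boosted four-velocity and confirms that it is traceless, with \(\varepsilon = 3\pi ^4 T^4\) and \(p=\varepsilon /3\).

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])         # boundary Minkowski metric
eta_inv = np.linalg.inv(eta)

Temp = 0.7                                   # arbitrary temperature
v = np.array([0.2, -0.1, 0.3])               # arbitrary boost velocity
W = 1.0 / np.sqrt(1.0 - v @ v)
u = np.concatenate(([W], W * v))             # u^a, with u.u = -1

# Boundary stress tensor (16.33): T^{ab} = pi^4 T^4 (eta^{ab} + 4 u^a u^b)
Tab = np.pi**4 * Temp**4 * (eta_inv + 4.0 * np.outer(u, u))

u_dn = eta @ u
trace = np.einsum('ab,ab->', eta, Tab)              # T^a{}_a
energy = np.einsum('a,b,ab->', u_dn, u_dn, Tab)     # eps = u_a u_b T^{ab}
pressure = (trace + energy) / 3.0                   # isotropic pressure

print("traceless        :", np.isclose(trace, 0.0))
print("eps = 3 pi^4 T^4 :", np.isclose(energy, 3 * np.pi**4 * Temp**4))
print("p = eps/3        :", np.isclose(pressure, energy / 3.0))
```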

Let us consider the implications of this argument. The holographic dictionary associates low-energy phenomena to the near horizon dynamics. We arrive at the usual argument describing fluid dynamics as an effective field theory for long wavelengths, albeit from an unusual angle. Still, the logic is intuitive. For a value to be assigned to the temperature T at a given point, a fluid must have reached a local equilibrium. Basically, in order to insert a thermometer into the system to measure the temperature, the device must be able to reach some kind of equilibrium with the system. In order for this to work, we do not need a global equilibrium, but we must insist that any variations take place on a scale larger than that associated with the thermometer and the measurement. This naturally leads us to consider a long-wavelength expansion of the dynamics and a systematic expansion in derivatives (organized order by order to represent shorter scales), representing dissipative phenomena. Logically, this is close to writing down an effective field theory for a quantum system, at any given order taking into account all possible terms (derivatives) that may appear in the effective Lagrangian, consistent with the underlying symmetry.

AdS-CFT and the fluid-gravity correspondence have led to progress in several interesting directions. In addition to efforts to explore issues relating to the entropy bound (16.31), work has been done to construct the bulk duals of non-conformal fluids (Kanitscheider and Skenderis 2009), charged fluids (Erdmenger et al. 2009; Banerjee et al. 2011), superfluids (Sonner and Withers 2010; Bhattacharya et al. 2014; Herzog et al. 2011) and anomalous fluids (Banerjee et al. 2014). The superfluid case relates to the observation that some AdS black holes exhibit an instability that leads to the spontaneous formation of a scalar condensate below a critical temperature \(T_c\), in analogy with the phase transition seen in many low-temperature laboratory systems. Not surprisingly, the more complicated the fluid system is, the more involved the gravity problem becomes. A typical example is the dissipative superfluid system considered by Bhattacharya et al. (2011), which involves a map from locally hairy black brane solutions to the long wavelength solutions of higher-dimensional Einstein–Maxwell gravity and a phase where the global U(1) symmetry is spontaneously broken (as required to facilitate superfluid flow). Similarly, a gravitational dual to a (type II) superconductor can be obtained by coupling AdS gravity to a Maxwell field and a charged scalar (Gubser 2008; Hartnoll et al. 2008a, b). These developments are interesting given that condensed matter physics involves a variety of strongly coupled systems—often with unusual properties—that can be engineered and explored in detail in laboratories (Hartnoll 2009).

16.5 Completing the derivative expansion

Taken at face value, the field theory approach to fluid dynamics prompts us to focus on the underlying symmetries (see Sect. 6.4) and this has implications for a systematic derivative expansion aimed at representing dissipative effects. In practice, it means that—rather than introducing second-order terms in order to fix the causality/stability issues of the first-order description—it is natural to ask what form second-order terms may take, what the most general such model may be, and how it is constrained by symmetries (e.g., of the dissipative stress-energy tensor) (Romatschke 2010b, a). Given the connection to AdS-CFT, most efforts in this direction have focussed on conformal fluids, which (obviously) leaves out compressional degrees of freedom associated with bulk viscosity. Nevertheless, it is clear that the general dissipative second-order system must include a large set of parameters (Romatschke 2010b). It is interesting to note that, at second order, the formal argument brings in coupling to the spacetime curvature. At first order, there can be no such terms since we require \(\nabla _a g_{bc} = 0\), but second derivatives of the metric do not vanish, so they could (perhaps should) be considered. In particular, we may have terms proportional to the Ricci scalar, R, and the contraction of the Ricci tensor with the fluid four-velocity, \(u^a u^b R_{ab}\) (Baier et al. 2008). The presence of such terms may come as a surprise, but they have been motivated by holographic arguments. At the same time, the situation seems a little bit confusing. By adding terms involving the Ricci tensor to the dissipative stress-energy tensor we introduce aspects that could equally well belong on the left-hand side of the Einstein equations. That is, we are modifying gravity into the general f(R) class of theories (see, for example, Baier et al. 2019). This logic is supported by the observation that the specific terms are non-dissipative (Romatschke 2010b). This argument does not suggest that we should not account for these kinds of terms in a formal description, simply that we need to make more effort to understand why they should be present and what their role may be. In fact, this conclusion holds in a wider sense. General dissipative models include so many parameters—most of which we do not have any way of calculating from first principles—that they are difficult to use in applications. Developments in this direction are important, but it would perhaps make sense to shift the focus from generality to specific questions concerning the manifestation of particular dissipation channels in settings of practical interest.


16.6 Carter’s canonical framework

Carter (1991) made a more formal attempt to construct a relativistic formalism for dissipative fluids—taking the variational argument as its starting point. His construction is quite general, which inevitably makes it more complex. Of course, the generality could prove useful in more complicated cases, e.g., for investigations of multi-fluid dynamics and/or elastic media. Given the potential this formalism has for future considerations, it is worth working through the details.

The overall aim is to extend the variational formulation in such a way that viscous “stresses” are accounted for. Because the variational foundations are the same, the number currents \(n_{\mathrm {x}}^a\) play a central role. In addition, we introduce a number of viscosity tensors \(\tau ^{a b}_\varSigma \), which we assume to be symmetric (even though it is clear that such an assumption is not generally correct, it is only the total stress-energy tensor that is required to be symmetric; Andersson and Comer 2006). The index \(\varSigma \) is “analogous” to the constituent index, although a bit more abstract as it represents different viscosity contributions. It is introduced in recognition of the fact that it may be advantageous to consider different kinds of viscosity, e.g., bulk and shear viscosity, separately. As in the case of the constituent index, a repeated index \(\varSigma \) does not imply summation in the following.

The key quantity in the variational framework remains the Lagrangian, \(\varLambda \). As it is a function of all the available fields, we now have \(\varLambda (n_{\mathrm {x}}^a, \tau _\varSigma ^{a b}, g_{a b})\), and a formal variation leads to

$$\begin{aligned} \delta \varLambda = \sum _{\mathrm {x}}\mu _a^{\mathrm {x}}\, \delta n_{\mathrm {x}}^a + \frac{1}{2} \sum _\varSigma \pi ^\varSigma _{a b} \, \delta \tau _\varSigma ^{a b} + \frac{\partial \varLambda }{\partial g^{a b}} \delta g^{a b} . \end{aligned}$$
(16.34)

Since the metric piece is treated in the same way as in the non-dissipative problem we will leave it out from now on. In the above expression we recognize the momenta \(\mu ^{\mathrm {x}}_a\) that are conjugate to the fluxes. We also have a new set of “strain” variables (cf. the discussion of elasticity in Sect. 12) defined by

$$\begin{aligned} \pi ^\varSigma _{a b} = \pi ^\varSigma _{(a b)} = \left. 2 \frac{\partial \varLambda }{\partial \tau ^{a b}_\varSigma } \right| _{n_{\mathrm {x}}^a, g^{ab}} . \end{aligned}$$
(16.35)

As in the non-dissipative case, the variational framework suggests that the equations of motion can be written as a force-balance equation,

$$\begin{aligned} \nabla _b T^b{}_a = \sum _{\mathrm {x}}f_a^{\mathrm {x}}+ \sum _\varSigma f^\varSigma _a = 0 , \end{aligned}$$
(16.36)

where the generalized forces work out to be

$$\begin{aligned} f_a^{\mathrm {x}}= \mu _a^{\mathrm {x}}\nabla _b n_{\mathrm {x}}^b + n_{\mathrm {x}}^b \nabla _{[b} \mu _{a]}^{\mathrm {x}}, \end{aligned}$$
(16.37)

(as before), and

$$\begin{aligned} f_a^\varSigma = \pi _{a b}^\varSigma \nabla _c \tau _\varSigma ^{c b} + \tau _\varSigma ^{c b} \left( \nabla _c \pi ^\varSigma _{a b} - \frac{1}{2} \nabla _a \pi ^\varSigma _{c b} \right) . \end{aligned}$$
(16.38)

Finally, the stress-energy tensor becomes

$$\begin{aligned} T^a{}_b = \varPsi \delta ^a{}_b + \sum _{\mathrm {x}}\mu ^\mathrm {x}_b n_{\mathrm {x}}^a + \sum _\varSigma \tau _\varSigma ^{a c} \pi ^\varSigma _{c b} , \end{aligned}$$
(16.39)

with the generalized pressure now given by

$$\begin{aligned} \varPsi = \varLambda - \sum _{\mathrm {x}}\mu _a^{\mathrm {x}}n^a_{\mathrm {x}}- \frac{1}{2} \sum _\varSigma \tau _\varSigma ^{a b} \pi ^\varSigma _{a b} . \end{aligned}$$
(16.40)

For reasons that will become clear shortly—basically, we want to be able to ensure that the different contributions to the entropy change are non-negative—it is useful to introduce a set of “convection vectors”. In the case of the currents, these are naturally taken as proportional to the fluxes (as usual). This means that we introduce \(\beta _{\mathrm {x}}^a\) such that

$$\begin{aligned} h_{\mathrm {x}}\beta _{\mathrm {x}}^a = n_{\mathrm {x}}^a , \qquad \mu ^{\mathrm {x}}_a \beta _{\mathrm {x}}^a = -1 \qquad \Longrightarrow \qquad h_{\mathrm {x}}= - \mu _a^{\mathrm {x}}n_{\mathrm {x}}^a , \end{aligned}$$
(16.41)

and we see that, if we ignore entrainment then \(h_{\mathrm {x}}\) is simply the chemical potential \(\mu _{\mathrm {x}}\) measured by an observer riding along with the flow of the \({\mathrm {x}}\) component. With this definition we can introduce a projection operator

$$\begin{aligned} \perp _{\mathrm {x}}^{a b} = g^{a b} + \mu _{\mathrm {x}}^a \beta _{\mathrm {x}}^b \qquad \Longrightarrow \qquad \perp _{{\mathrm {x}}b}^a \beta _{\mathrm {x}}^b = \perp _{\mathrm {x}}^{a b} \mu ^{\mathrm {x}}_b = 0 . \end{aligned}$$
(16.42)

From the definition of the force density \(f_{\mathrm {x}}^a\) we can then show that

$$\begin{aligned} \nabla _a n_{\mathrm {x}}^a = - \beta _{\mathrm {x}}^a f^{\mathrm {x}}_a , \end{aligned}$$
(16.43)

and

$$\begin{aligned} h_{\mathrm {x}}\mathcal{L}_{\mathrm {x}}\mu _a^{\mathrm {x}}= \perp ^{\mathrm {x}}_{a b} f_{\mathrm {x}}^b , \end{aligned}$$
(16.44)

where \(\mathcal{L}_{\mathrm {x}}= \mathcal{L}_{\beta _{\mathrm {x}}^a}\) represents the Lie-derivative along \(\beta _{\mathrm {x}}^a\). We see that the component of the force parallel to the convection vector \(\beta _{\mathrm {x}}^a\) is associated with particle conservation. Meanwhile, the orthogonal component represents the change in momentum along \(\beta _{\mathrm {x}}^a\).

Next, we facilitate a similar decomposition for the viscous stresses by taking the conduction vector to be a unit null eigenvector (cf. (5.49)) associated with \(\pi _\varSigma ^{a b}\). That is, we introduce \( \beta _\varSigma ^b\) such that

$$\begin{aligned} \pi ^\varSigma _{a b} \beta _\varSigma ^b = 0 , \end{aligned}$$
(16.45)

together with

$$\begin{aligned} u_a^\varSigma = g_{a b} \beta _\varSigma ^b \qquad \mathrm {and} \qquad u_a^\varSigma \beta _\varSigma ^a = - 1 . \end{aligned}$$
(16.46)

Introducing the projection associated with this conduction vector,

$$\begin{aligned} \perp ^\varSigma _{a b} = g_{a b} + u^\varSigma _a u^\varSigma _b , \end{aligned}$$
(16.47)

we (naturally) have

$$\begin{aligned} \perp ^\varSigma _{a b} \beta _\varSigma ^b = 0 . \end{aligned}$$
(16.48)

Once we have introduced \(\beta _\varSigma ^a\), we can use it to reduce the degrees of freedom of the viscosity tensors. So far, we have only required them to be symmetric. However, in the standard case one would expect a viscous tensor to have only six degrees of freedom. To ensure that this is the case we introduce the degeneracy condition

$$\begin{aligned} u_b^\varSigma \tau _\varSigma ^{b a} = 0 . \end{aligned}$$
(16.49)

That is, we require the viscous tensor \(\tau _\varSigma ^{a b}\) to be purely spatial according to an observer moving along \(u^a_\varSigma \). With these definitions one can show that

$$\begin{aligned} \beta _\varSigma ^a \mathcal{L}_\varSigma \pi ^\varSigma _{a b} = 0 , \end{aligned}$$
(16.50)

where \(\mathcal{L}_\varSigma = \mathcal{L}_{\beta _\varSigma ^a}\) is the Lie-derivative along \(\beta _\varSigma ^a\), and

$$\begin{aligned} \tau _\varSigma ^{a b} \mathcal{L}_\varSigma \pi ^\varSigma _{a b} = - 2 \beta _\varSigma ^a f^\varSigma _a . \end{aligned}$$
(16.51)

Finally, let us suppose that we choose to work in a given observer frame, moving with four-velocity \(u^a\) (associated with the usual projection \(\perp ^a_b\)). Then we can use the decompositions:

$$\begin{aligned} \beta _{\mathrm {x}}^a = \beta _{\mathrm {x}}\left( u^a + v_{\mathrm {x}}^a\right) \qquad \mathrm {and} \qquad \beta _\varSigma ^a = \beta _\varSigma \left( u^a + v_\varSigma ^a\right) . \end{aligned}$$
(16.52)

As expected, \(\mu ^{\mathrm {x}}= 1/\beta _{\mathrm {x}}\) represents a chemical type potential for species \({\mathrm {x}}\) with respect to the chosen frame. At the same time, we see that \(\mu ^\varSigma = 1/\beta _\varSigma \) is a Lorentz factor. Using the norm of \(\beta ^a_\varSigma \) we have

$$\begin{aligned} \beta ^a_\varSigma \beta ^\varSigma _a = - \beta ^2_\varSigma \left( 1 - v_\varSigma ^2 \right) = - 1 , \end{aligned}$$
(16.53)

where \(v_\varSigma ^2 = v_\varSigma ^a v^\varSigma _a\). Thus

$$\begin{aligned} \mu ^\varSigma = 1/\beta _\varSigma = \sqrt{ 1 - v_\varSigma ^2}, \end{aligned}$$
(16.54)

is analogous to the standard Lorentz factor.

So far the construction is quite formal. Let us now try to make it more intuitive by making contact with the physics. First, we note that the above results allow us to demonstrate that

$$\begin{aligned} u^b \nabla _a T^a{}_b= & {} - \sum _{\mathrm {x}}\left( \mu ^{\mathrm {x}}\nabla _a n_{\mathrm {x}}^a + v_{\mathrm {x}}^a f^{\mathrm {x}}_a \right) \nonumber \\&- \sum _\varSigma \left( v_\varSigma ^a f^\varSigma _a - \frac{1}{2} \mu ^\varSigma \tau _\varSigma ^{b a} \mathcal{L}_\varSigma \pi ^\varSigma _{b a} \right) = 0 . \end{aligned}$$
(16.55)

Recall that similar results were central to expressing the second law of thermodynamics in Sect. 15. To see how things work out in the present case, and make contact with the previous discussion, let us single out the entropy fluid (with index \(\mathrm {s}\)) by defining \(s^a = n_\mathrm {s}^a\) and \(T = \mu _\mathrm {s}\). To simplify the final expressions it is also useful to assume that the remaining species are governed by conservation laws of the form

$$\begin{aligned} \nabla _a n_{\mathrm {x}}^a = \varGamma _{\mathrm {x}}, \end{aligned}$$
(16.56)

subject to the constraint of total baryon number conservation; i.e.,

$$\begin{aligned} \nabla _a n^a = \nabla _a \sum _{{\mathrm {x}}\ne \mathrm {s}} n_{\mathrm {x}}^a = \sum _{{\mathrm {x}}\ne \mathrm {s}} \varGamma _{\mathrm {x}}= 0. \end{aligned}$$
(16.57)

Given this, and the fact that the divergence of the stress-energy tensor must vanish, we have

$$\begin{aligned} T \nabla _a s^a = - \sum _{{\mathrm {x}}\ne \mathrm {s}} \mu ^{\mathrm {x}}\varGamma _{\mathrm {x}}- \sum _{\mathrm {x}}v_{\mathrm {x}}^a f^{\mathrm {x}}_a - \sum _\varSigma \left( v_\varSigma ^a f^\varSigma _a + \frac{1}{2} \mu ^\varSigma \tau ^{a b}_\varSigma \mathcal{L}_\varSigma \pi ^\varSigma _{a b} \right) . \end{aligned}$$
(16.58)

Here we can bring the remaining two force contributions together by introducing the linear combination

$$\begin{aligned} \sum _{\mathrm {x}}\zeta ^{\mathrm {x}}_\varSigma v_{\mathrm {x}}^a = v_\varSigma ^a , \quad \mathrm {with} \quad \sum _{\mathrm {x}}\zeta ^{\mathrm {x}}_\varSigma = 1 . \end{aligned}$$
(16.59)

Then defining

$$\begin{aligned} \tilde{f}^{\mathrm {x}}_a = f^{\mathrm {x}}_a + \sum _\varSigma \zeta ^{\mathrm {x}}_\varSigma f^\varSigma _a , \end{aligned}$$
(16.60)

we have

$$\begin{aligned} T \nabla _a s^a = - \sum _{{\mathrm {x}}\ne \mathrm {s}} \mu ^{\mathrm {x}}\varGamma _{\mathrm {x}}- \sum _{\mathrm {x}}v_{\mathrm {x}}^a \tilde{f}^{\mathrm {x}}_a - \frac{1}{2} \sum _\varSigma \mu ^\varSigma \tau ^{a b}_\varSigma \mathcal{L}_\varSigma \pi ^\varSigma _{a b} \ge 0 . \end{aligned}$$
(16.61)

The three terms in this expression represent, respectively, the entropy increase due to (i) chemical reactions, (ii) conductivity, and (iii) viscosity. The simplest way to ensure that the second law of thermodynamics is satisfied is to make each term positive definite.

At this point, the formalism must be completed by some (suitably simple) model for the various terms. A reasonable starting point would be to assume that each term represents a linear deviation from equilibrium. For the chemical reactions this would mean that we expand each \(\varGamma _{\mathrm {x}}\) according to

$$\begin{aligned} \varGamma _{\mathrm {x}}= - \sum _{{\mathrm {y}}\ne s} \mathcal{C}_{{\mathrm {x}}{\mathrm {y}}} \mu ^{\mathrm {y}}, \end{aligned}$$
(16.62)

where \(\mathcal{C}_{{\mathrm {x}}{\mathrm {y}}}\) is a positive definite (or semi-definite) matrix composed of the various reaction rates. Similarly, for the conductivity term it is natural to consider “standard” resistivity such that

$$\begin{aligned} \tilde{f}^{\mathrm {x}}_a = - \sum _{{\mathrm {y}}} \mathcal{R}^{{\mathrm {x}}{\mathrm {y}}}_{a b} v_{\mathrm {y}}^b . \end{aligned}$$
(16.63)

Finally, for the viscosity we can postulate a law of the form

$$\begin{aligned} \tau _\varSigma ^{a b} = - \eta ^{a b c d}_\varSigma \mathcal{L}_\varSigma \pi ^\varSigma _{c d} , \end{aligned}$$
(16.64)

where we would have, for an isotropic model,

$$\begin{aligned} \eta ^{a b c d}_\varSigma = \eta \perp _\varSigma ^{a (c} \perp _\varSigma ^{d) b} + \frac{1}{3} (\zeta - \eta ) \perp _\varSigma ^{a b} \perp _\varSigma ^{c d} , \end{aligned}$$
(16.65)

and the coefficients \(\eta \) and \(\zeta \) are identified as representing shear and bulk viscosity, respectively.
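
To see that this identification is consistent (the trace part of the response is controlled by \(\zeta \) alone, while the trace-free part is controlled by \(\eta \)), one can contract (16.65) with simple test “rates”. The following flat-space numpy sketch (ours; the rest-frame projector and the coefficient values are purely illustrative) performs this check:

```python
import numpy as np

# Check of the isotropic viscosity tensor (16.65), here in flat space with the
# rest-frame projector: contracting it with a pure-trace rate (proportional to
# perp_{cd}) produces a bulk response proportional to zeta alone, while a
# trace-free spatial rate produces a shear response proportional to eta alone.
# The coefficients are illustrative.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
ginv = np.linalg.inv(g)
u_dn = np.array([-1.0, 0.0, 0.0, 0.0])
perp_dn = g + np.outer(u_dn, u_dn)                        # perp_{ab}
perp_up = ginv @ perp_dn @ ginv                           # perp^{ab}

eta, zeta = 0.7, 0.2
eta4 = (eta * 0.5 * (np.einsum('ac,db->abcd', perp_up, perp_up)
                     + np.einsum('ad,cb->abcd', perp_up, perp_up))
        + (zeta - eta) / 3.0 * np.einsum('ab,cd->abcd', perp_up, perp_up))

tau_bulk = -np.einsum('abcd,cd->ab', eta4, perp_dn)       # rate ~ perp_{cd}
print(np.allclose(tau_bulk, -zeta * perp_up))             # True: bulk ~ zeta

rate = np.diag([0.0, 1.0, -0.5, -0.5])                    # trace-free, spatial
tau_shear = -np.einsum('abcd,cd->ab', eta4, rate)
print(np.allclose(tau_shear, -eta * ginv @ rate @ ginv))  # True: shear ~ eta
```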

A detailed comparison between Carter’s formalism and the Israel–Stewart framework has been carried out by Priou (1991). He concludes that the two models, which are both members of a larger family of dissipative models, have essentially the same degree of generality and that they are equivalent in the limit of linear perturbations away from a thermal equilibrium state. Providing explicit relations between the main parameters in the two descriptions, he also emphasizes the key point that analogous parameters may not have the same physical interpretation.

16.7 Add a bit of chemistry...

With the formal model development (at least at some level) in hand, it is natural to turn to the issue of the different dissipation coefficients. This effort has several different aspects. We may, for example, dig deeper and try to calculate the coefficients from some more fundamental—presumably microscopic—theory. At the same time, we may ask (still in the somewhat phenomenological vein) if we can make progress by considering the nature of the coefficients involved. Such questions inevitably take us in the direction of chemistry, where the mechanics of mixtures and solvents tends to be explored in detail. The chemistry lab may seem a strange place to look for answers to astrophysics questions, but the problems we are interested in are truly interdisciplinary, so it is perhaps not surprising that this is where we end up.

Central to any discussion of this kind is the Onsager symmetry principle (Onsager 1931); see Andersson and Comer (2006) and Haskell et al. (2012) for relevant discussions. Focussing on the general idea—which is natural since the details depend on the application under consideration—we start by noting that, for any system, perturbations of the entropy density s away from equilibrium must be represented by quadratic deviations. This allows us to write

$$\begin{aligned} s\approx s_{\mathrm {eq}}-\frac{\varDelta t}{2 T}\sum _{a,b} X_a L^{ab} X_b, \end{aligned}$$
(16.66)

or, making use of the entropy creation rate \(\varGamma _\mathrm {s}\):

$$\begin{aligned} T \varGamma _\mathrm {s}= -\frac{1}{2}\sum _{a,b} X_a L^{ab} X_b=\sum _{a=1}^{N} J^a X_a, \end{aligned}$$
(16.67)

where the \(X_a\) are known as “thermodynamic forces”. They represent a measure of the departure of the system from equilibrium, while the “thermodynamic fluxes”

$$\begin{aligned} J^a=-{1\over 2}\sum _b L^{ab}X_b, \end{aligned}$$
(16.68)

represent the response of the system. The Onsager symmetry principle simply states that microscopic reversibility implies that we should have \(L^{ab}=L^{ba}\). Comparing Eq. (16.67) to results like Eq. (16.61) we can, by constructing the most general form for the tensor \(L^{ab}\) in terms of the thermodynamical forces in the model, obtain the most general description of the dissipative terms in the equations of motion.
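
As a concrete (if schematic) illustration of this bookkeeping, the short Python sketch below (ours; the matrix and the forces are random, illustrative numbers) verifies that the flux definition (16.68) reproduces the quadratic form in (16.67), with the Onsager symmetry \(L^{ab}=L^{ba}\) imposed by hand:

```python
import numpy as np

# Consistency check of Eqs. (16.67)-(16.68): with thermodynamic forces X_a and
# a matrix of kinetic coefficients L^{ab}, the fluxes J^a = -(1/2) L^{ab} X_b
# reproduce T Gamma_s = -(1/2) X_a L^{ab} X_b. The Onsager principle states
# L^{ab} = L^{ba}, so only the symmetric part of L enters the entropy creation.
rng = np.random.default_rng(0)
N = 3
L = rng.normal(size=(N, N))
L = 0.5 * (L + L.T)                 # impose Onsager symmetry L^{ab} = L^{ba}
X = rng.normal(size=N)              # illustrative thermodynamic forces

J = -0.5 * L @ X                    # thermodynamic fluxes, Eq. (16.68)
lhs = -0.5 * X @ L @ X              # entropy creation rate times T, Eq. (16.67)
rhs = J @ X
print(np.isclose(lhs, rhs))         # True
```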

A key part of this construction is the observation that—because we are assuming an expansion away from equilibrium—we need the forces to vanish as thermodynamic equilibrium is reached. Hence, we should not work with the chemical potentials, as in (16.62), because they obviously do not vanish in equilibrium. This point comes to the fore when we consider problems with reactions, as in the case of bulk viscosity. We need to replace the chemical potential with a more suitable “force”. This leads us to introduce the affinity (Kondepudi and Prigogine 2005). In the context of neutron stars, this point has been made in Carter and Chamel (2005b) and Haskell et al. (2012).

Suppose there are N reactions in total among the M constituents \(\mathrm {x}\) of our multi-fluid system, to be characterized in the usual way as stoichiometric relations between the (normalized) particle number densities \(\nu ^\mathrm {x}=n^\mathrm {x}/\left( \sum _\mathrm {x}n^\mathrm {x}\right) \); i.e.,

$$\begin{aligned} \sum _{\mathrm {x}}^M \mathrm{R}_\mathrm {x}^I~\nu ^\mathrm {x}\rightarrow \sum _{\mathrm {x}}^M \mathrm{P}_\mathrm {x}^I~\nu ^\mathrm {x}, \quad I = 1,\ldots ,N, \end{aligned}$$
(16.69)

where \(\mathrm{R}_\mathrm {x}^I\) and \(\mathrm{P}_\mathrm {x}^I\) are, respectively, the reactant and product stoichiometric coefficients. The affinity \(A^I\) of the \(I^\mathrm{th}\) reaction is then defined as

$$\begin{aligned} A^I \equiv \sum _{\mathrm {x}}^M \left( \mathrm{R}_\mathrm {x}^I - \mathrm{P}_\mathrm {x}^I\right) {\mu ^\mathrm {x}} . \end{aligned}$$
(16.70)

At thermodynamic equilibrium the affinities vanish, which is why they make appropriate thermodynamic forces.
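
As a simple illustration (our own, with placeholder numbers rather than an actual equation of state), consider a schematic beta-type reaction \(\mathrm {n}\rightarrow \mathrm {p}+\mathrm {e}\) (neutrinos ignored). The affinity (16.70) then reduces to \(\mu ^\mathrm {n}-\mu ^\mathrm {p}-\mu ^\mathrm {e}\), which vanishes in beta equilibrium:

```python
import numpy as np

# Sketch of Eq. (16.70) for a single illustrative reaction, n -> p + e
# (schematic beta decay, neutrinos ignored), with stoichiometric coefficients
# R and P for the constituents (n, p, e). The chemical potentials are
# placeholder numbers; in a real application they come from the equation of state.
R = np.array([1, 0, 0])             # reactant coefficients R_x^I for (n, p, e)
P = np.array([0, 1, 1])             # product coefficients  P_x^I
mu = np.array([950.0, 940.0, 10.0]) # illustrative chemical potentials [MeV]

A = np.sum((R - P) * mu)            # affinity of the reaction, Eq. (16.70)
print(A)                            # 0 => chemical (beta) equilibrium
```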

It is intuitively clear that the affinities provide a natural description of the problem, but this does not mean that the formulation is complete at this point. In particular, it is worth noting that the chemical potentials \(\mu ^\mathrm {x}\) become somewhat ambiguous in a multi-fluid context. Each chemical potential should be defined as the energy per particle in the reference frame where the chemical (or nuclear) reactions occur, but a multi-fluid mixture is characterized by the presence of distinct velocity fields, none of which necessarily represents the required frame. The relevant frame may, in fact, not be known a priori as the formulation we consider assumes an expansion away from “equilibrium”, which ultimately involves both dynamical and chemical considerations. The equilibrium frame may well depend on the dynamical evolution of the whole system. This complicates the issue, at least from the formal point of view (see Celora et al. 2021 for a detailed analysis).

According to Hess’s Law, for each chemical reaction there is only one thermodynamic variable to track in order to determine the changes; namely, the “degree of advancement” \(\xi _I\) for the various reactants. For each of the \(I = 1\ldots N\) reactions, a variation \(\varDelta \xi _I\) corresponds to a variation \(\varDelta \nu ^\mathrm {x}_I\) of the participating fluids:

$$\begin{aligned} \frac{\varDelta \nu _I^\mathrm{r}}{\mathrm{R}_\mathrm{r}^I} = \cdots = \frac{\varDelta \nu _I^\mathrm{s}}{\mathrm{R}_\mathrm{s}^I} = - \frac{\varDelta \nu _I^\mathrm{u}}{ \mathrm{P}_\mathrm{u}^I} = \cdots = - \frac{\varDelta \nu _I^\mathrm{v}}{ \mathrm{P}_\mathrm{v}^I} = \varDelta \xi _I , \end{aligned}$$
(16.71)

where \(\mathrm{r},\ldots ,\mathrm{s}\) and \(\mathrm{u},\ldots ,\mathrm{v}\) represent the \(\mathrm {x}\)-components for which the \(\mathrm{R}_\mathrm {x}^I\) and \(\mathrm{P}_\mathrm {x}^I\) are non-zero. The (irreversible) change \(\varDelta s\) in the entropy due to these reactions is given by

$$\begin{aligned} \varDelta s = \frac{1}{T} \sum _{I = 1}^N A^I \varDelta \xi _I . \end{aligned}$$
(16.72)

By comparing with Eq. (16.66), we see that the \(\varDelta \xi _I\) represent the appropriate thermodynamic “fluxes”.

The variations \(\varDelta \nu ^\mathrm {x}\) of the individual number densities, in some time interval \(\varDelta t\), can also be determined by

$$\begin{aligned} \varDelta \nu ^\mathrm {x}= \varGamma _\mathrm {x}\varDelta t , \end{aligned}$$
(16.73)

where \(\varGamma _\mathrm {x}\) is the particle number creation rate.

Each of the N reactions then has a corresponding change \(\varDelta \nu ^\mathrm {x}_I\) that contributes to \(\varDelta \nu ^\mathrm {x}\), with the net result (as \(\varDelta t \rightarrow 0\))

$$\begin{aligned} \frac{d \nu ^\mathrm {x}}{d t} = \sum _I \left( \mathrm{R}_\mathrm {x}^I - \mathrm{P}_\mathrm {x}^I\right) \frac{d \xi _I}{d t} . \end{aligned}$$
(16.74)

Hence,

$$\begin{aligned} \varGamma _\mathrm {x}= \sum _I \left( \mathrm{R}_\mathrm {x}^I - \mathrm{P}_\mathrm {x}^I\right) \frac{d \xi _I}{d t} . \end{aligned}$$
(16.75)

If we take the reaction “velocity” \(V^I\equiv \frac{d \xi _I}{d t}\) to be the thermodynamical flux, then the change in entropy due to the reactions is

$$\begin{aligned} \varDelta s =\sum _{\mathrm {x}\ne \mathrm {s}} \mu ^\mathrm {x}\varGamma _\mathrm {x}= \sum _{\mathrm {x}\ne \mathrm {s}} {\mu ^\mathrm {x}} \left[ \sum _I \left( \mathrm{R}_\mathrm {x}^I - \mathrm{P}_\mathrm {x}^I\right) \frac{d \xi _I}{d t}\right] = \sum _{I} A^I V_I . \end{aligned}$$
(16.76)

In the general framework the corresponding thermodynamic force will then be \(A^I\) while the flux is \(-V_I\). Given this, we can construct the fluxes out of the forces, limiting ourselves to quadratic terms. An explicit example of such a construction can be found in Haskell et al. (2012).
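
A minimal sketch of this bookkeeping (ours; the stoichiometry and reaction velocities are illustrative) shows how Eq. (16.75) turns a set of reaction velocities into creation rates, with the net rates respecting the stoichiometry (here there is one baryon on each side of both reactions, so \(\varGamma _\mathrm {n}+\varGamma _\mathrm {p}=0\)):

```python
import numpy as np

# Sketch of Eq. (16.75): given the stoichiometric coefficients of N reactions
# among M species and the reaction velocities V^I = d xi_I / dt, the creation
# rates Gamma_x follow by summing over reactions. Schematic beta decay and its
# inverse for (n, p, e), neutrinos ignored; all numbers are illustrative.
R = np.array([[1, 0, 0],      # reaction 1: n -> p + e   (reactant coefficients)
              [0, 1, 1]])     # reaction 2: p + e -> n
P = np.array([[0, 1, 1],      # product coefficients for reaction 1
              [1, 0, 0]])     # ... and for reaction 2
V = np.array([2.0e-3, 1.5e-3])           # reaction velocities d xi_I / dt

Gamma = (R - P).T @ V                    # Gamma_x for (n, p, e), Eq. (16.75)
print(Gamma)
print(np.isclose(Gamma[0] + Gamma[1], 0.0))   # baryon number is conserved
```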

16.8 Towards a dissipative action principle

Conventional wisdom suggests that an action principle—expressed as an integral of a Lagrangian, whose local extrema satisfy the equations of motion, subject to well-posed boundary constraints, see Sect. 4—cannot exist for a dissipative system. However, this may be too dismissive. There have been a number of (more or less successful) attempts to make progress on building dissipative variational models. A common approach has been to combine a variational model for the non-dissipative aspects with an argument that constrains the entropy production, often involving Lagrange multipliers (see Ichiyanagi 1994 for a review and Djukic and Vujanovic 1975; Djukic and Strauss 1980; Mobbs 1982; Kobe et al. 1986; Vujanovic et al. 1986; Honein et al. 1991; Chien et al. 1996; Nordbrock and Kienzler 2007; Fukagawa and Fujitani 2012 for samples of the literature). The model we will consider is conceptually different. The conservative constraints on the system are built into the variation itself and the model does not involve (at least not in the first instance) an expansion away from equilibrium (in contrast to, for example, the model of Israel and Stewart or, indeed, any model that takes a derivative expansion as its starting point). Formally, the new description remains valid also for systems far away from equilibrium, and hence it provides a promising framework for the exploration of nonlinear thermodynamical evolution and associated irreversible phenomena—a problem area where a number of challenging issues remain to be resolved, involving for example maximum versus minimum entropy production for non-equilibrium systems (Jaynes 1980; Dewar 2003; Martyushev and Seleznev 2006; di Vita 2010).

Why should we expect a variational argument for non-equilibrium systems to exist? The question is multi-faceted, but recall that one of the most topical problems in gravitational physics involves two stars (or black holes) in a binary system that lose orbital energy through the emission of gravitational waves. Gravitational-wave emission is a dissipative mechanism, yet the underlying theory is obtained from an action (see Sect. 4.4). This tells us that one can, indeed, use a variational strategy for dissipative problems (a similar argument was made by Galley 2013; Galley et al. 2014). The key insight is that all the energy in the system must be accounted for. In many ways this is trivial. If we account for all the energy in a given system, including the “heat bath”, then there is no dissipation as such. Rather, one tries to model the redistribution of energy within the larger (now closed) system. This may be a natural logical argument, but the question is whether we can turn it into a practical proposition.

The first step in this direction involves designing a variational argument that leads to the functional form of the dissipative fluid equations, adopting the attitude from classical mechanics where the equations of motion for a system can be written down without actual reference to a particular form for the energy. The completion of the model—fully specifying the various coefficients involved, which must draw on some level of microphysics understanding—is, of course, important but the problem is sufficiently complex that it is sensible to progress in manageable steps.

The idea behind the new approach is, conceptually, quite simple (Andersson and Comer 2015). Recalling that the individual matter spaces (associated with the various fluid components) play a central role in the variational construction for a conservative system, let us consider the “physics” of a dissipative system, e.g., with resistivity, shear or bulk viscosity. On the micro-scale dissipation arises due to particle interactions/reactions. On the fluid scale this naturally translates into an interaction between the matter spaces. This interaction can be accounted for by letting each matter space be endowed with a volume form which depends on:

  1. the coordinates of all the matter spaces, and

  2. the independent mappings of the spacetime metric into these spaces.

For example, if each \(n^{\mathrm {x}}_{A B C}\) is no longer just a function of its own \(X^A_{\mathrm {x}}\), the closure of \(n^{\mathrm {x}}_{a b c}\) will be broken. As the fluxes are no longer conserved, the formalism incorporates dissipation. Simple!

To see how this works, let us revisit the conservative problem from Sect. 9. Recall that the scalar fields \(X^A_{\mathrm {x}}\) label the (fluid) particles. If these are conserved, then the \(X^A_{\mathrm {x}}\) must be constant along the relevant worldlines. That this is, indeed, the case is easy to demonstrate. Letting \(\tau _{\mathrm {x}}\) be the proper time of each worldline, we have

$$\begin{aligned} \frac{{ d} X^A_{\mathrm {x}}}{{ d} \tau _{\mathrm {x}}} = u^a_{\mathrm {x}}\frac{\partial X^A_{\mathrm {x}}}{\partial x^a} = \frac{1}{n_{\mathrm {x}}} n^{\mathrm {x}}_{B C D} \epsilon ^{a b c d} \frac{\partial X^A_{\mathrm {x}}}{\partial x^{a}} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \frac{\partial X^C_{\mathrm {x}}}{\partial x^c} \frac{\partial X^D_{\mathrm {x}}}{\partial x^{d}} =0 . \end{aligned}$$
(16.77)

Since a fluid element’s matter space coordinates \(X^A_{\mathrm {x}}\) are constant along its worldline, it must also be the case that

$$\begin{aligned} \frac{{d} n^{\mathrm {x}}_{A B C}}{{d} \tau _{\mathrm {x}}}= 0 . \end{aligned}$$
(16.78)

In other words, the volume form \(n^{\mathrm {x}}_{ABC}\) is fixed in the associated matter space. These steps demonstrate that the key to non-conservation is to allow \(n^{\mathrm {x}}_{A B C}\) to be a function of more than the \(X^A_{\mathrm {x}}\). This is quite intuitive. The worldlines of the various fluids will in general cut across each other, leading to interactions/reactions. A more general functional form for the matter space volume forms \(n_{ABC}^{\mathrm {x}}\) may then be used to reflect this aspect of the physics. A schematic illustration of how this works is provided in Fig. 17.

Fig. 17

An illustration of the notion that a coupling between matter spaces may lead to dissipation. We consider the case of two fluids, labelled r and b (red and blue). The individual \(X^A_{\mathrm {x}}\) do not vary along their own worldlines, even when the system is dissipative. By adding \(X^A_{\mathrm {y}}\) (\({\mathrm {y}}\ne {\mathrm {x}}\)) we get “evolution” since the worldlines cut across each other. Let us choose a particular worldline of the r-fluid, say \(X^A_\mathrm {r,0}\), meaning that \(X^A_\mathrm {r}\) will take the same value at each spacetime point \(x^a\) along the worldline. At an intersection with a worldline of a fluid element of the b-fluid (the point labelled 1 in the figure, say) the other fluid’s worldline will have its own label (in this case \(X^A_\mathrm {b,1}\)), which is the same at every point on that worldline. At the next intersection (point 2), the worldline we are following has the same value for \(X^A_\mathrm {r}\), but it is intersected by a different worldline from the other fluid (\(X^A_\mathrm {b,2}\)), meaning that \(X^A_\mathrm {b}\) at each intersection is different. Hence, \(X^A_\mathrm {b}\), when considered as a field in spacetime, must vary along the r-fluid worldlines, and vice versa. This is how the closure of the individual volume three-forms is broken and ultimately why the model is dissipative.

The seemingly simple step of enlarging the functional dependence of \(n^{\mathrm {x}}_{ABC}\) allows us to build a variational model that incorporates a number of dissipative terms. However, in doing this we have to tread carefully. In particular, we must pay closer attention to the various matter space objects. We are now dealing with geometric objects that actually live in the higher-dimensional combination of all the matter spaces, e.g., we are dealing with an object of the form

$$\begin{aligned} n^{\mathrm {x}}_{ABC} \left( X_{\mathrm {x}}^D, X_{\mathrm {y}}^E \right) dX_{\mathrm {x}}^A \wedge dX_{\mathrm {x}}^B \wedge dX_{\mathrm {x}}^C , \qquad {\mathrm {y}}\ne {\mathrm {x}}. \end{aligned}$$
(16.79)

That is, a volume form in the x-matter space parameterized by points in the y-matter spaces. We can still pretend that the individual matter spaces (related to spacetime via the same maps as in the conserved case) remain somehow “distinct”, but in reality this is not the case.

When we allow \(n^{\mathrm {x}}_{ABC}\) to be more complex we (inevitably) break some of the attractive features of the conservative model. Obviously, \(n^{\mathrm {x}}_{ABC}\) is no longer a fixed matter space object. This has a number of repercussions, but we can still construct the action from matter space objects. To do this we need the map of the spacetime metric into the relevant matter space (as in the case of elasticity, see Sect. 12)

$$\begin{aligned} g^{A_{\mathrm {x}}B_{\mathrm {x}}}= \frac{\partial X^A_{\mathrm {x}}}{\partial x^a} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} g^{a b} = g^{B_{\mathrm {x}}A_{\mathrm {x}}}. \end{aligned}$$
(16.80)

Note that \(g^{A_{\mathrm {x}}B_{\mathrm {x}}}\) is not likely to be a tensor on matter space. In order for that to be the case, the corresponding spacetime tensor must satisfy two conditions: First, it must be flowline orthogonal (on each index). This is true here since the operator which generates projections orthogonal to \({\mathrm {x}}\)-fluid worldlines is

$$\begin{aligned} \perp _{\mathrm {x}}^{ab} = g^{ab} + u_{\mathrm {x}}^a u_{\mathrm {x}}^b , \end{aligned}$$
(16.81)

and because of Eq. (16.77) we have

$$\begin{aligned} g^{A_{\mathrm {x}}B_{\mathrm {x}}}= \frac{\partial X^A_{\mathrm {x}}}{\partial x^a} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} g^{a b} = \frac{\partial X^A_{\mathrm {x}}}{\partial x^a} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \perp _{\mathrm {x}}^{a b} . \end{aligned}$$
(16.82)

The second condition that \(\perp ^{ab}_{\mathrm {x}}\) must satisfy so that \(g^{A_{\mathrm {x}}B_{\mathrm {x}}}\) is a matter space tensor is (Beig and Schmidt 2003a)

$$\begin{aligned} \mathcal {L}_{u_{\mathrm {x}}} \perp ^{ab}_{\mathrm {x}}= 0 . \end{aligned}$$
(16.83)

This is not the case here. Indeed, this condition is too severe for most relevant applications.

Anyway, it is easy to show that a scalar constructed from the contraction involving \(g^{ab}\) and some tensor \(t^{\mathrm {x}}_{a\ldots }\) is identical to the analogous contraction of the corresponding matter space objects (Karlovini and Samuelsson 2003). In particular, the number density follows from (as before)

$$\begin{aligned} n_{\mathrm {x}}^2= & {} - g_{ab} n_{\mathrm {x}}^a n_{\mathrm {x}}^b = \frac{1}{3!} g^{ad} g^{be} g^{cf} n^{\mathrm {x}}_{abc} n^{\mathrm {x}}_{def} \nonumber \\= & {} \frac{1}{3!} g_{\mathrm {x}}^{AD} g_{\mathrm {x}}^{BE} g_\mathrm {x}^{CF} n^{\mathrm {x}}_{ABC} n^{\mathrm {x}}_{DEF} , \end{aligned}$$
(16.84)

while the chemical potential

$$\begin{aligned} \mu ^{\mathrm {x}}= - u_{\mathrm {x}}^a \mu ^{\mathrm {x}}_a \end{aligned}$$
(16.85)

(according to an observer at rest in the respective fluid’s frame) can be obtained from

$$\begin{aligned} n_{\mathrm {x}}\mu ^{\mathrm {x}}= - n_{\mathrm {x}}^a \mu ^{\mathrm {x}}_a = \frac{1}{3!} \mu _{\mathrm {x}}^{abc} n^{\mathrm {x}}_{abc} = \frac{1}{3!} \mu _{\mathrm {x}}^{ABC} n^{\mathrm {x}}_{ABC} . \end{aligned}$$
(16.86)

Here we have (as in Sect. 10) introduced the dual to the momentum \(\mu ^{\mathrm {x}}_a\):

$$\begin{aligned} \mu _{\mathrm {x}}^{a b c} = \epsilon ^{d a b c} \mu _d^{\mathrm {x}}, \quad \mu _a^{\mathrm {x}}= \frac{1}{3!} \epsilon _{b c d a} \mu _{\mathrm {x}}^{b c d} , \end{aligned}$$
(16.87)

and its matter space image;

$$\begin{aligned} \mu ^{A B C}_{\mathrm {x}}= \frac{\partial X^A_{\mathrm {x}}}{\partial x^{[a}} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \frac{\partial X^C_{\mathrm {x}}}{\partial x^{c]}} \mu _{\mathrm {x}}^{a b c} . \end{aligned}$$
(16.88)

The key take-home message is that we can think of the matter action as being constructed entirely from matter space quantities. In the simplest case of a single component one would have (see Sect. 6)

$$\begin{aligned} \varLambda \left( n_{\mathrm {x}}\right) = \varLambda \left( n^{\mathrm {x}}_{abc}, g^{ab}\right) \Leftrightarrow \varLambda \left( n^{\mathrm {x}}_{ABC}, g_{\mathrm {x}}^{AB} \right) . \end{aligned}$$
(16.89)
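
Before moving on, it may be reassuring to check the first of these contractions numerically. The following flat-space sketch (ours; Cartesian coordinates, an illustrative flux, and our own choice \(\epsilon _{0123}=+1\)) confirms that the three-form contraction in (16.84) reproduces \(n_{\mathrm {x}}^2=-g_{ab}n_{\mathrm {x}}^an_{\mathrm {x}}^b\):

```python
import numpy as np
from itertools import permutations

# Numerical check of Eq. (16.84) in flat space: the number density obtained by
# contracting the dual three-form n_{abc} with inverse metrics agrees with
# n^2 = -g_{ab} n^a n^b. Conventions (signature -+++, eps_{0123} = +1) and the
# numbers below are purely illustrative.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
ginv = np.linalg.inv(g)

eps = np.zeros((4, 4, 4, 4))                 # Levi-Civita symbol
for p in permutations(range(4)):
    eps[p] = np.linalg.det(np.eye(4)[list(p)])

v = np.array([0.2, -0.1, 0.05])              # illustrative three-velocity
gamma = 1.0 / np.sqrt(1.0 - v @ v)
n = 2.5                                      # rest-frame number density
n_up = n * gamma * np.array([1.0, *v])       # flux n^a = n u^a

n3 = np.einsum('dabc,d->abc', eps, n_up)     # dual three-form n_{abc}
lhs = -n_up @ g @ n_up                       # -g_{ab} n^a n^b
rhs = np.einsum('ad,be,cf,abc,def->', ginv, ginv, ginv, n3, n3) / 6.0
print(np.isclose(lhs, rhs), lhs)             # True, n^2
```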

16.9 A reactive/resistive example

Let us try to make the idea more concrete by working through the steps of the variational analysis, while allowing for general variations of the matter space density. Since the matter space coordinates still vary according to (6.14) (this is essentially just the definition of the Lagrangian displacement) we easily arrive at the generic variation

$$\begin{aligned} \delta n^{\mathrm {x}}_{a b c} = - \mathcal{L}_{\xi _{\mathrm {x}}}n^{\mathrm {x}}_{a b c} + \frac{\partial X^A_{\mathrm {x}}}{\partial x^{[a}} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \frac{\partial X^C_{\mathrm {x}}}{\partial x^{c]}} \varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C} . \end{aligned}$$
(16.90)

To make contact with (6.21) we need

$$\begin{aligned} \mu ^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}= \frac{1}{3!} \mu ^{\mathrm {x}}_a \delta \left( \epsilon ^{bcda} n^{\mathrm {x}}_{bcd} \right) = - \frac{1}{3!} \mu _{\mathrm {x}}^{bcd} \delta n^{\mathrm {x}}_{bcd} + \frac{1}{3!} \mu ^{\mathrm {x}}_a n^{\mathrm {x}}_{bcd} \delta \epsilon ^{bcda} , \end{aligned}$$
(16.91)

where we recall (6.20). Hence, we arrive at

$$\begin{aligned} \mu ^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}= \frac{1}{3!} \mu ^{a b c}_{\mathrm {x}}\mathcal{L}_{\xi _{\mathrm {x}}}n^{\mathrm {x}}_{a b c} - \frac{1}{2} \mu ^{\mathrm {x}}_a n^a_{\mathrm {x}}g^{bc} \delta g_{bc} - \frac{1}{3!} \mu ^{A B C}_{\mathrm {x}}\varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C} , \end{aligned}$$
(16.92)

and the “final” expression:

$$\begin{aligned} \mu ^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}= & {} \mu ^{\mathrm {x}}_a \left( n^b_{\mathrm {x}}\nabla _b \xi ^a_{\mathrm {x}}- \xi ^b_{\mathrm {x}}\nabla _b n^a_{\mathrm {x}}- n^a_{\mathrm {x}}\nabla _b \xi ^b_{\mathrm {x}}- \frac{1}{2} n^a_{\mathrm {x}}g^{bc} \delta g_{bc} \right) \nonumber \\&- \frac{1}{3!} \mu ^{A B C}_{\mathrm {x}}\varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C} . \end{aligned}$$
(16.93)

The terms in the bracket are the same as in the conservative case, cf. (6.21). The last term is new.

The functional dependence of the volume form for a given fluid’s matter space is now the main input. Obviously, \(n^{\mathrm {x}}_{A B C}\) must depend on \(X^A_{\mathrm {x}}\), the coordinates of the corresponding matter space, in order for us to retain the conservative dynamics. Adding to this, let us include the coordinates \(X^A_{\mathrm {y}}\) from the other, \({\mathrm {y}}\ne {\mathrm {x}}\), matter spaces. This breaks the closure of \(n^{\mathrm {x}}_{a b c}\) and the model is no longer conservative.

The required variation of \(n^{\mathrm {x}}_{A B C}\) becomes [in view of (6.13)]

$$\begin{aligned} \varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C} = \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \frac{\partial n^{\mathrm {x}}_{A B C}}{\partial X^D_{\mathrm {y}}} \varDelta _{\mathrm {x}}X^D_{\mathrm {y}}= \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \frac{\partial n^{\mathrm {x}}_{A B C}}{\partial X^D_{\mathrm {y}}} \left( \xi _{\mathrm {x}}^a-\xi _{\mathrm {y}}^a\right) \partial _a X^D_{\mathrm {y}}. \end{aligned}$$
(16.94)

Comparing to (16.92), we see that it is natural to define

$$\begin{aligned} R^{{\mathrm {x}}{\mathrm {y}}}_a\equiv \frac{1}{3!} \mu ^{ABC}_{\mathrm {x}}\frac{\partial n^{\mathrm {x}}_{ABC}}{\partial X^D_{\mathrm {y}}} \partial _a X^D_{\mathrm {y}}. \end{aligned}$$
(16.95)

We then have

$$\begin{aligned} \mu ^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}= & {} \mu ^{\mathrm {x}}_a \left( n^b_{\mathrm {x}}\nabla _b \xi ^a_{\mathrm {x}}- \xi ^b_{\mathrm {x}}\nabla _b n^a_{\mathrm {x}}- n^a_{\mathrm {x}}\nabla _b \xi ^b_{\mathrm {x}}- \frac{1}{2} n^a_{\mathrm {x}}g^{bc} \delta g_{bc} \right) \nonumber \\&+ \sum _{{\mathrm {y}}\ne {\mathrm {x}}} R^{{\mathrm {x}}{\mathrm {y}}}_a\left( \xi _{\mathrm {y}}^a - \xi _{\mathrm {x}}^a \right) . \end{aligned}$$
(16.96)

The final step involves writing down the variation of the matter Lagrangian, \(\varLambda \). Starting from (6.1), we arrive at

$$\begin{aligned}&\delta \left( \sqrt{- g} \varLambda \right) \nonumber \\&\quad = - \sqrt{- g} \left\{ \sum _{{\mathrm {x}}} \left( f^{\mathrm {x}}_a +\mu ^{\mathrm {x}}_a \varGamma _{\mathrm {x}}- R^{{\mathrm {x}}}_a\right) \xi ^a_{\mathrm {x}}- \frac{1}{2} \left( \varPsi g^{a b} + \sum _{{\mathrm {x}}} n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}\right) \delta g_{a b}\right\} \nonumber \\&\qquad + \nabla _a \left( \frac{1}{2} \sqrt{-g} \sum _{{\mathrm {x}}} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} \xi ^d_{\mathrm {x}}\right) , \end{aligned}$$
(16.97)

where we have used

$$\begin{aligned} \sum _{{\mathrm {x}}} \sum _{{\mathrm {y}}\ne {\mathrm {x}}} R^{{\mathrm {x}}{\mathrm {y}}}_a\xi ^a_{\mathrm {y}}= \sum _{{\mathrm {x}}} \sum _{{\mathrm {y}}\ne {\mathrm {x}}} R^{{\mathrm {y}}{\mathrm {x}}}_a\xi ^a_{\mathrm {x}}. \end{aligned}$$
(16.98)

We have also defined

$$\begin{aligned} R^{{\mathrm {x}}}_a = \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \left( R^{{\mathrm {y}}{\mathrm {x}}}_a - R^{{\mathrm {x}}{\mathrm {y}}}_a \right) , \end{aligned}$$
(16.99)

and

$$\begin{aligned} \varGamma _{\mathrm {x}}= \nabla _a n^a_{\mathrm {x}}. \end{aligned}$$
(16.100)

Hence, the individual components are governed by the equations of motion

$$\begin{aligned} f^{\mathrm {x}}_a + \varGamma _{\mathrm {x}}\mu ^{\mathrm {x}}_a = n^b_{\mathrm {x}}\omega ^{\mathrm {x}}_{b a} + \varGamma _{\mathrm {x}}\mu ^{\mathrm {x}}_a = R^{{\mathrm {x}}}_a . \end{aligned}$$
(16.101)

Since the force term \(f^{\mathrm {x}}_a\) on the left-hand side is orthogonal to \(n_{\mathrm {x}}^a\) (by the anti-symmetry of \(\omega ^{\mathrm {x}}_{a b}\)) it is easy to see that this result implies that the particle creation/destruction rates are given by

$$\begin{aligned} \varGamma _{\mathrm {x}}= - \frac{1}{\mu ^{\mathrm {x}}} u_{\mathrm {x}}^a R^{{\mathrm {x}}}_a . \end{aligned}$$
(16.102)

Finally, an orthogonal projection of (16.101) leads to

$$\begin{aligned} 2 n^a_\mathrm {x}\nabla _{[a} \mu ^{\mathrm {x}}_{b]} +\varGamma _{\mathrm {x}}\perp _{{\mathrm {x}}b}^a \mu ^{\mathrm {x}}_a = \perp _{{\mathrm {x}}b}^a R^{{\mathrm {x}}}_a , \end{aligned}$$
(16.103)

which provides the dissipative equations of motion for the system.

The bottom line is that, with Eq. (16.97), we have a true action principle—in the sense that the field equations are extrema of the action—for a system of fluids that includes dissipation. It is also worth noting that the stress-energy tensor is still given by

$$\begin{aligned} T^{a}{}_{b} = \varPsi \delta ^a{}_b + \sum _{\mathrm {x}} n^a_\mathrm {x}\mu ^\mathrm {x}_b , \end{aligned}$$
(16.104)

and we have

$$\begin{aligned} \nabla _b T^b{}_a = \sum _{{\mathrm {x}}}\left( f^{\mathrm {x}}_a + \mu _a^{\mathrm {x}}\varGamma _{\mathrm {x}}\right) = 0 , \end{aligned}$$
(16.105)

since

$$\begin{aligned} \sum _{{\mathrm {x}}}R_a^{\mathrm {x}}= 0 . \end{aligned}$$
(16.106)

The requirement that the divergence of the stress-energy tensor vanish is automatically guaranteed by the dissipative fluid equations, in keeping with the diffeomorphism invariance of the theory.

As an immediate application of these relations, connecting with the discussion in Sect. 15, let us consider the simplest relevant setting. Suppose we have a system with two components: matter (labelled \(\mathrm {n}\)) and heat, represented by the entropy (labelled \(\mathrm {s}\)). In principle, we need to provide an equation of state (that satisfies relevant physics constraints) in order to complete the model. Once this is provided, we can calculate the resistivity coefficients from (16.95) and then model the system using the momentum equations (16.101). However, let us consider the problem at the level of phenomenology. We assume that the matter component is conserved, but the entropy does not need to be.

First of all, given that we only have two components we must have

$$\begin{aligned} R^{\mathrm {n}}_a = R^{\mathrm {s}\mathrm {n}}_a - R^{\mathrm {n}\mathrm {s}}_a = - R^\mathrm {s}_a . \end{aligned}$$
(16.107)

Secondly, the conservation of the material component implies that

$$\begin{aligned} \varGamma _\mathrm {n}= - \frac{1}{\mu ^\mathrm {n}} u_\mathrm {n}^a R^{\mathrm {n}}_a = \frac{1}{\mu ^\mathrm {n}} u_\mathrm {n}^a R^{\mathrm {n}\mathrm {s}}_a = 0 \quad \Longrightarrow \quad u_\mathrm {n}^a R^{\mathrm {n}\mathrm {s}}_a = 0 . \end{aligned}$$
(16.108)

The upshot is that \(R^{\mathrm {n}\mathrm {s}}_a\) must be orthogonal to both \(u_\mathrm {n}^a\) and \(u_\mathrm {s}^a\). Meanwhile, the entropy change is constrained by the second law. That is, we have

$$\begin{aligned} \varGamma _\mathrm {s}= - \frac{1}{T} u_\mathrm {s}^a R^{\mathrm {s}}_a = \frac{1}{T} u_\mathrm {s}^a R^{\mathrm {s}\mathrm {n}}_a \ge 0 , \end{aligned}$$
(16.109)

where we have introduced the temperature \(T = \mu ^\mathrm {s}\). Note that the constraints affect the two, likely independent, contributions to \(R^\mathrm {n}_a\). We cannot infer a link between \(R^{\mathrm {n}\mathrm {s}}_a\) and \(R^{\mathrm {s}\mathrm {n}}_a\) at this point.

So far we have not introduced a privileged observer. In order to facilitate a comparison with the discussion in Sect. 15, let us focus on an observer moving along with the matter flow. Then we have \(u^a=u_\mathrm {n}^a\) and the relative flow required to express the entropy flux is defined such that

$$\begin{aligned} u_\mathrm {s}^a = \gamma \left( u^a + w^a \right) , \end{aligned}$$
(16.110)

where

$$\begin{aligned} u^a w_a = 0, \quad \text{ and } \quad \gamma = \left( 1 - w^2 \right) ^{-1/2} . \end{aligned}$$
(16.111)

The relative velocity \(w^a\) is aligned with the heat flux vector (see, for example, Eq. (15.40)).

Given (16.108) and (16.109) it makes sense to introduce the decompositions

$$\begin{aligned} R^{\mathrm {n}\mathrm {s}}_a = \epsilon _{abcd} \phi _\mathrm {n}^b u^c w^d , \end{aligned}$$
(16.112)

and

$$\begin{aligned} R^{\mathrm {s}\mathrm {n}}_a = R_w w_a + \epsilon _{abcd} \phi _\mathrm {s}^b u^c w^d , \end{aligned}$$
(16.113)

where \(\phi _\mathrm {n}^a\) and \(\phi _\mathrm {s}^a\) are unspecified vector fields. We then see that (16.109) leads to

$$\begin{aligned} T \varGamma _\mathrm {s}= \gamma R_w w^2 \ge 0 \quad \longrightarrow \quad R_w > 0 . \end{aligned}$$
(16.114)

Meanwhile, the two components \(\phi _\mathrm {n}^a\) and \(\phi _\mathrm {s}^a\) are not constrained by the thermodynamics. This leaves a degree of arbitrariness in the model. Should we be surprised by this? Probably not. A similar issue was discussed by Lopez-Monsalvo and Andersson (2011) where it was demonstrated that the variational derivation leads to the presence of a number of terms in the heat equation that cannot be constrained by the second law. It was also pointed out that the difference between the model advocated by Lopez-Monsalvo and Andersson (2011) and the second-order model of Israel and Stewart appeared at this level (Priou 1991). It has not been established whether there are situations where these terms have a notable effect on the dynamics. This may be an interesting question.
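
The algebra behind (16.114) is easy to verify explicitly, and doing so also makes the point that the \(\phi \) terms drop out of the entropy production. A quick flat-space check (ours; the drift, \(R_w\) and \(\phi _\mathrm {s}^a\) are illustrative numbers):

```python
import numpy as np
from itertools import permutations

# Quick check of the algebra behind Eq. (16.114): with the decomposition
# (16.113) for R^{sn}_a and u_s^a = gamma (u^a + w^a), only the R_w piece
# survives in u_s^a R^{sn}_a, giving T Gamma_s = gamma R_w w^2. Flat space,
# rest-frame u^a, and illustrative numbers throughout.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
eps = np.zeros((4, 4, 4, 4))                     # Levi-Civita symbol
for p in permutations(range(4)):
    eps[p] = np.linalg.det(np.eye(4)[list(p)])

u_up = np.array([1.0, 0.0, 0.0, 0.0])
w_up = np.array([0.0, 0.2, -0.1, 0.05])          # u_a w^a = 0 by construction
w2 = w_up @ g @ w_up
gamma = 1.0 / np.sqrt(1.0 - w2)
us_up = gamma * (u_up + w_up)                    # entropy four-velocity

R_w = 0.4                                        # must be positive, Eq. (16.114)
phi_s = np.array([0.0, 1.3, 0.7, -0.2])          # unconstrained vector field
R_sn = R_w * (g @ w_up) + np.einsum('abcd,b,c,d->a', eps, phi_s, u_up, w_up)

TGamma_s = us_up @ R_sn                          # u_s^a R^{sn}_a, Eq. (16.109)
print(np.isclose(TGamma_s, gamma * R_w * w2))    # True; positive since R_w > 0
```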

16.10 Adding dissipative stresses

The previous example demonstrates how dissipation can be included in the variational multi-fluid formalism. This is a positive step towards a better understanding of non-equilibrium systems in General Relativity. Dissipative contributions that tend to be postulated can now be derived from first principles. Moreover, as the comparison with the problem of heat flow demonstrates, the model introduces new aspects of the problem. However, the example we provided only accounts for two particular non-equilibrium phenomena, particle non-conservation and resistivity. In order to argue that the model represents a credible alternative to established strategies, we need to demonstrate that the action principle generates terms of the tensorial form expected for more general processes. Thus, we consider the issue of dissipative stresses.

The obvious starting point for an extension of the strategy is to ask what other quantities the matter space volume form, \(n^{\mathrm {x}}_{ABC}\), may depend on. The natural object to consider is the mapping of the spacetime metric, \(g_{a b}\), into the respective matter spaces. As we will now demonstrate, this leads to a description that accounts for dissipative shear stresses.

The mapping of the metric into the matter spaces introduces three independent possibilities. The most intuitive option involves allowing \(n^{\mathrm {x}}_{ABC}\) to depend on \(g^{A_{\mathrm {x}}B_{\mathrm {x}}}\), as defined in (16.80). Noting that Eq. (16.77) implies that the \(X^A_{\mathrm {x}}\) will still be conserved along the associated flow, the variation of \(n^{\mathrm {x}}_{A B C}\) is then such that

$$\begin{aligned} \varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C} = \frac{\partial n^{\mathrm {x}}_{A B C}}{\partial g^{D_{\mathrm {x}}E_{\mathrm {x}}}} \varDelta _{\mathrm {x}}g^{D_{\mathrm {x}}E_{\mathrm {x}}}+ \sum _{{\mathrm {y}}\ne {\mathrm {x}}} \frac{\partial n^{\mathrm {x}}_{A B C}}{\partial X^D_{\mathrm {y}}} \varDelta _{\mathrm {x}}X^D_{\mathrm {y}}. \end{aligned}$$
(16.115)

The first term in this expression is new; the second is the same as in (16.94). The new term is easily worked out, following the steps from the simpler model. We find that

$$\begin{aligned} \varDelta _{\mathrm {x}}g^{A_{\mathrm {x}}B_{\mathrm {x}}}= \frac{\partial X^A_{\mathrm {x}}}{\partial x^a} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \varDelta _{\mathrm {x}}g^{ab} = \frac{\partial X^A_{\mathrm {x}}}{\partial x^a} \frac{\partial X^B_{\mathrm {x}}}{\partial x^b} \left[ \delta g^{a b}- 2 \nabla ^{(a} \xi ^{b)}_{\mathrm {x}}\right] , \end{aligned}$$
(16.116)

where we have used

$$\begin{aligned} \varDelta _{\mathrm {x}}g^{ab} = \delta g^{a b}- 2 \nabla ^{(a} \xi ^{b)}_{\mathrm {x}}, \end{aligned}$$
(16.117)

(round brackets indicate symmetrization, as usual).

As in the previous example, the variation of the matter Lagrangian involves \(\mu ^{A B C}_{\mathrm {x}}\varDelta _{\mathrm {x}}n^{\mathrm {x}}_{A B C}\). The new contribution then takes the form

$$\begin{aligned}&\frac{1}{3!} \mu ^{A B C}_{\mathrm {x}}\frac{\partial n^{\mathrm {x}}_{A B C}}{\partial g^{D_{\mathrm {x}}E_{\mathrm {x}}}} \varDelta _{\mathrm {x}}g^{D_{\mathrm {x}}E_{\mathrm {x}}}\nonumber \\&\quad = \frac{1}{3!} \mu ^{A B C}_{\mathrm {x}}\frac{\partial n^{\mathrm {x}}_{A B C}}{\partial g^{D_{\mathrm {x}}E_{\mathrm {x}}}} \frac{\partial X^D_{\mathrm {x}}}{\partial x^a} \frac{\partial X^E_{\mathrm {x}}}{\partial x^b} \left[ \delta g^{a b} - 2 \nabla ^{(a} \xi ^{b)}_{\mathrm {x}}\right] \nonumber \\&\quad = - \frac{1}{2} {S}^{\mathrm {x}}_{a b}\left[ g^{ac} g^{bd} \delta g_{cd} + 2 \nabla ^{(a} \xi ^{b)}_{\mathrm {x}}\right] = - \frac{1}{2} {S}_{\mathrm {x}}^{a b} \delta g_{ab} - {S}^{\mathrm {x}}_{a b}\nabla ^b \xi _{\mathrm {x}}^a , \end{aligned}$$
(16.118)

where we have defined

$$\begin{aligned} {S}^{\mathrm {x}}_{a b}= \frac{1}{3} \mu ^{ABC}_{\mathrm {x}}\frac{\partial n^{\mathrm {x}}_{ABC}}{\partial g^{D_{\mathrm {x}}E_{\mathrm {x}}}} \frac{\partial X^D_{\mathrm {x}}}{\partial x^a} \frac{\partial X^E_{\mathrm {x}}}{\partial x^b} = {S}^{\mathrm {x}}_{b a}, \end{aligned}$$
(16.119)

such that

$$\begin{aligned} u_{\mathrm {x}}^a {S}^{\mathrm {x}}_{b a}= 0 . \end{aligned}$$
(16.120)

Combining the results, we arrive at

$$\begin{aligned} \mu ^{\mathrm {x}}_a \delta n^a_{\mathrm {x}}= & {} \mu ^{\mathrm {x}}_a \left( n^b_{\mathrm {x}}\nabla _b \xi ^a_{\mathrm {x}}- \xi ^b_{\mathrm {x}}\nabla _b n^a_{\mathrm {x}}- n^a_{\mathrm {x}}\nabla _b \xi ^b_{\mathrm {x}}\right) + {S}^{\mathrm {x}}_{a b}\nabla ^b \xi _{\mathrm {x}}^a \nonumber \\&+ \sum _{{\mathrm {y}}\ne {\mathrm {x}}} R^{{\mathrm {x}}{\mathrm {y}}}_a\left( \xi _{\mathrm {y}}^a - \xi _{\mathrm {x}}^a \right) + \frac{1}{2} \left[ \mu ^{\mathrm {x}}_c n^c_{\mathrm {x}}g^{ab} + {S}_{\mathrm {x}}^{a b} \right] \delta g_{ab} . \end{aligned}$$
(16.121)

Introducing the total dissipative stresses, in this case trivially setting

$$\begin{aligned} D^{\mathrm {x}}_{a b}= {S}^{\mathrm {x}}_{a b}, \end{aligned}$$
(16.122)

we see that Eq. (16.97) becomes

$$\begin{aligned} \delta \left( \sqrt{- g} \varLambda \right)= & {} - \sqrt{- g} \left\{ \sum _{{\mathrm {x}}} \left( f^{\mathrm {x}}_a + \varGamma _{\mathrm {x}}\mu ^{\mathrm {x}}_a + \nabla ^bD^{\mathrm {x}}_{b a}- R_a^{\mathrm {x}}\right) \xi ^a_{\mathrm {x}}\right. \nonumber \\&\left. - \frac{1}{2} \left[ \varPsi g^{a b} + \sum _{{\mathrm {x}}} \left( n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}+ D^{\mathrm {x}}_{a b}\right) \right] \delta g_{a b}\right\} \nonumber \\&+ \nabla _a \left[ \sqrt{-g} \sum _{{\mathrm {x}}} \left( \frac{1}{2} \mu ^{a b c}_{\mathrm {x}}n^{\mathrm {x}}_{b c d} + g^{ac} D^{\mathrm {x}}_{cd}\right) \xi ^d_{\mathrm {x}}\right] , \end{aligned}$$
(16.123)

where we have used (16.98) and (16.99) for the resistivity currents.

The equations of motion now take the form

$$\begin{aligned} f^{\mathrm {x}}_a + \varGamma _{\mathrm {x}}\mu ^{\mathrm {x}}_a + \nabla ^b D^{\mathrm {x}}_{a b}= R_a^{\mathrm {x}}, \end{aligned}$$
(16.124)

and the stress-energy tensor is

$$\begin{aligned} T^{a b} = \varPsi g^{a b} + \sum _{{\mathrm {x}}} \left( n^a_{\mathrm {x}}\mu ^b_{\mathrm {x}}+ D_{\mathrm {x}}^{a b}\right) , \end{aligned}$$
(16.125)

where the generalized pressure, \(\varPsi \), remains unchanged, cf. (9.17). As in the previous problem, it is easy to show that

$$\begin{aligned} \nabla _b T^b{}_a = \sum _{{\mathrm {x}}} \left( f^{\mathrm {x}}_a + \varGamma _{\mathrm {x}}\mu ^{\mathrm {x}}_a + \nabla ^b D^{\mathrm {x}}_{a b}\right) = 0 , \end{aligned}$$
(16.126)

since (16.106) still holds.

Finally, we can extract the various creation/destruction rates. We first contract Eq. (16.124) with \(u^a_{\mathrm {x}}\), noting that \(u^a_{\mathrm {x}}f^{\mathrm {x}}_a = 0\) and \(u^a_{\mathrm {x}}\nabla ^b D^{\mathrm {x}}_{a b}= - D^{\mathrm {x}}_{a b}\nabla ^b u^a_{\mathrm {x}}\), to find

$$\begin{aligned} \mu ^{\mathrm {x}}\varGamma _{\mathrm {x}}= - R_a^{\mathrm {x}}u^a_{\mathrm {x}}- D^{\mathrm {x}}_{a b}\nabla ^b u^a_{\mathrm {x}}. \end{aligned}$$
(16.127)

When \({\mathrm {x}}=\mathrm {s}\) this gives the entropy creation rate, which must be constrained by the second law.

Armed with the more general constraint (16.127) for the dissipative terms, let us revisit the two-component model problem. In particular, let us ask what we can learn from the constraints that follow from the derivation. As in the previous discussion of this problem we will use an observer moving along with the matter flow, such that \(u^a = u_\mathrm {n}^a\) and \(w^a\) represents the relative flow.

Let us first consider the matter component. Since we know that \(R^{\mathrm {n}\mathrm {s}}_a\) should be orthogonal to \(u_\mathrm {s}^a\), we introduce the decomposition

$$\begin{aligned} R^{\mathrm {n}\mathrm {s}}_a = R_u \left( w^2 u_a + w_a \right) + \epsilon _{abcd} \phi _\mathrm {n}^b u^c w^d . \end{aligned}$$
(16.128)

Then (16.127) implies that

$$\begin{aligned} D^\mathrm {n}_{ab} \nabla ^b u^a = - R^{\mathrm {n}\mathrm {s}}_a u^a = R_u w^2 . \end{aligned}$$
(16.129)

Now, there are two possible cases to consider. In the general case, with a distinct heat flow, we have \(w^2>0\), which (if we take \(R_u >0\)) implies that the left-hand side of (16.129) must be positive. To ensure that this is the case, we use the standard decomposition (with the same conventions as before, see (14.11))

$$\begin{aligned} \nabla _a u^{\mathrm {x}}_b = \sigma ^{\mathrm {x}}_{ab} +\varpi ^{\mathrm {x}}_{ab} - u^{\mathrm {x}}_a \dot{u}^{\mathrm {x}}_b +\frac{1}{3} \theta ^{\mathrm {x}}\perp ^{\mathrm {x}}_{ab}, \end{aligned}$$
(16.130)

where

$$\begin{aligned} \sigma ^{\mathrm {x}}_{ab} = D_{\langle a}u^{\mathrm {x}}_{b\rangle } , \qquad \text{ with } \qquad D_a u^{\mathrm {x}}_b = \perp ^{\mathrm {x}}_{ac} \perp ^{\mathrm {x}}_{bd} \nabla ^c u_{\mathrm {x}}^d , \end{aligned}$$
(16.131)

where the angular brackets indicate symmetrization and trace removal (as in (12.39)),

$$\begin{aligned} \varpi ^{\mathrm {x}}_{ab}= & {} D_{[a} u^{\mathrm {x}}_{b]} , \end{aligned}$$
(16.132)
$$\begin{aligned} \theta ^{\mathrm {x}}= & {} \nabla _a u_{\mathrm {x}}^a , \end{aligned}$$
(16.133)

and

$$\begin{aligned} \dot{u}^{\mathrm {x}}_a = u_{\mathrm {x}}^b \nabla _b u^{\mathrm {x}}_a . \end{aligned}$$
(16.134)

With these definitions, each term in (16.130) is orthogonal to \(u_{\mathrm {x}}^b\). From the fact that \({S}^{\mathrm {x}}_{a b}\) is symmetric and orthogonal to \(u_{\mathrm {x}}^a\) it is easy to see that the condition inferred from (16.129) is satisfied provided that we have

$$\begin{aligned} D^\mathrm {n}_{ab} = \eta ^\mathrm {n}\sigma ^\mathrm {n}_{ab} + \zeta ^\mathrm {n}\theta ^\mathrm {n}\perp ^\mathrm {n}_{ab} , \end{aligned}$$
(16.135)

with \(\eta ^\mathrm {n}>0\) and \(\zeta ^\mathrm {n}>0\). We recognise these as the dissipative (shear and bulk viscosity) stresses expected in the Navier–Stokes equations. Interestingly, the second law of thermodynamics was not engaged in the derivation of this result.
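
As a final sanity check (ours; flat space, a random velocity gradient and illustrative coefficients), one can verify numerically that a stress of the form (16.135) does indeed make the left-hand side of (16.129) positive, reducing to \(\eta \sigma _{ab}\sigma ^{ab}+\zeta \theta ^2\):

```python
import numpy as np

# Sketch of Eqs. (16.130)-(16.135): build the kinematic decomposition of
# nabla_a u_b at a point (flat space, rest-frame observer u^a = (1,0,0,0)) and
# check that a stress of the form D_ab = eta*sigma_ab + zeta*theta*perp_ab gives
# D_ab nabla^b u^a = eta sigma^2 + zeta theta^2 >= 0, as needed in (16.129).
# The velocity gradient K and the coefficients are illustrative numbers.
g = np.diag([-1.0, 1.0, 1.0, 1.0])
ginv = np.linalg.inv(g)
u_up = np.array([1.0, 0.0, 0.0, 0.0])
u_dn = g @ u_up
perp = g + np.outer(u_dn, u_dn)               # projector perp_ab

rng = np.random.default_rng(1)
K = rng.normal(size=(4, 4))                   # candidate nabla_a u_b
K[:, 0] = 0.0                                 # enforce u^b nabla_a u_b = 0

Kp = perp @ ginv @ K @ ginv @ perp            # projected gradient D_a u_b
theta = np.einsum('ab,ab->', ginv, K)         # expansion, Eq. (16.133)
sigma = 0.5 * (Kp + Kp.T) - theta * perp / 3.0   # shear sigma_ab, Eq. (16.131)

eta, zeta = 0.7, 0.2                          # illustrative positive coefficients
D = eta * sigma + zeta * theta * perp         # stress of the form (16.135)

lhs = np.einsum('ab,bc,ad,cd->', D, ginv, ginv, K)       # D_ab nabla^b u^a
sigma2 = np.einsum('ab,ac,bd,cd->', sigma, ginv, ginv, sigma)
print(np.isclose(lhs, eta * sigma2 + zeta * theta**2))   # True
print(lhs >= 0.0)                                        # positivity, cf. (16.129)
```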

Finally, let us consider the entropy condition. Making use of the results from the simpler heat example, noting that we can still use (16.113) for \(R^{\mathrm {s}\mathrm {n}}_a\), we see that (16.127) leads to

$$\begin{aligned} T \varGamma _\mathrm {s}= \gamma R_w w^2 - D^\mathrm {s}_{ab} \nabla ^b u_\mathrm {s}^a \ge 0 , \end{aligned}$$
(16.136)

as required by the second law. This suggests that, in addition to \(R_w>0\) from before, we should have

$$\begin{aligned} D^\mathrm {s}_{ab} =- \eta ^\mathrm {s}\sigma ^\mathrm {s}_{ab} - \zeta ^\mathrm {s}\theta ^\mathrm {s}\perp ^\mathrm {s}_{ab} , \end{aligned}$$
(16.137)

with \(\eta ^\mathrm {s}>0\) and \(\zeta ^\mathrm {s}>0\).

This example provides an indicative illustration, but it is (by no means) the most general model one may envisage; see Andersson et al. (2017a).

16.11 A few comments

The development of practical models—suitable for applications—for dissipative relativistic fluids remains very much a “work in progress”. Having said that, there have been a number of recent potentially promising developments. We have covered the main ideas here, starting from phenomenological models constructed to incorporate dissipative effects. The most “obvious” strategies—the “text-book” approach of Eckart (1940) and Landau and Lifshitz (1959)—fail completely, as they do not respect causality and have stability issues. Going further, we described how the problems can be fixed by introducing additional dynamical fields. We considered the formulations of Stewart (1977), Israel and Stewart (1979a, 1979b) and Carter (1991) in detail. From our discussion it should be clear that these models are examples of an extremely large family of possible theories for dissipative relativistic fluids. Given this wealth of possibilities, can we hope to find the “correct” model? To some extent, the answer to this question relies on the extra parameters one has introduced in the theory. Can they be constrained by observations? This question has been discussed by Geroch (1995) and Lindblom (1996). The answer seems to be no: we should not expect to be able to use observations to single out a preferred theoretical description. The reason for this is that the different models relax to the Navier–Stokes form on very short timescales. Hence, one will likely only be able to constrain the standard shear and bulk viscosity coefficients, etc. Related questions concern the practicality of the different proposed schemes. To a certain extent, this is probably a matter of taste. Of course, it may well be that the additional parameters required in a particular model are easier to extract from microphysics arguments. With this in mind, we introduced a fairly recent development aimed at extending the variational approach to dissipative systems (Andersson and Comer 2015). This is conceptually interesting as it draws more directly on the matter space, but it is not yet clear how far this alternative strategy can be pushed. At the end of the day, it may well be that different circumstances require different logic. This would make the “best” formulation a matter of taste. Clearly, there is scope for more thinking...

17 Concluding remarks

In writing (years ago) and updating (over several years) this review, we have tried to develop a coherent description of the diverse building blocks required for fully relativistic fluid models. Although there are alternatives, we opted to base our discussion of the fluid equations of motion on the variational approach pioneered by Taub (1954) and developed further by Carter (1983, 1989a, 1992). This is an appealing strategy because it leads to a natural formulation for multi-fluid problems and there have been a number of extensions to cover (more or less) the full range of physics one may be interested in. This is reflected in the material that was added as the review was updated. We now go deeper into variational principles in relativity and consider applications ranging from superfluids with quantized vortices to elastic matter and electromagnetism. We also make contact with modern applications by discussing numerical implementations. Finally, the discussion of dissipative systems has been revised to reflect the ongoing discussion of this important, but still challenging problem. These changes are significant, but one could consider going further still. After all, fluids describe physics at many different scales and there is a lot of physics to discuss. The only thing that is certain is that, whatever happens next, we expect to continue to enjoy the learning process!