Introduction

The liver longitudinal water proton relaxation rate R1 is important for several reasons. Native R1 is a biomarker of liver pathology [1, 2]. Also, other liver biomarkers are secondarily derived from R1 measurements: for example, increase in R1 post-gadoxetate is a biomarker of hepatocyte function [3, 4]; extracellular volume is derived by comparing R1 pre and post contrast [5]; and baseline R1 is required for rate constants in dynamic contrast-enhanced MR [6], for tissue oxygen tension in oxygen-enhanced MR [7], and for relaxivity measurements in contrast agent research [8].

Measurements of R1 in individual livers or liver regions suffer from both systematic errors and random errors [9]. Systematic errors (bias) arise because measurements are imperfectly performed. Other systematic deviations occur because different methods, even when perfectly performed, yield R1 values with different dependences on liver composition and physiology. Random (repeatability) errors arise from physiologic and instrument noise, and can be high particularly when regions-of-interest are small. In addition, even in the absence of bias and noise, there are, in each study, genuine between-subject differences in R1 due to between-subject variation in physiology or subclinical pathology.

To mitigate the effects of random error in establishing a “normal” or “baseline” liver R1, investigators sometimes employ a "compromise" R1, averaged from all subjects in their study. This likely reduces the "noise" variance, but introduces other errors by ignoring true between-subject variation. Other investigators may obtain R1 from literature reports, although this will introduce additional bias if different measurement methods had been used, or different populations had been studied.

The aim of this study was to survey values, and variabilities, of normal liver R1 from the published literature. This would give investigators an indication of whether the liver R1 or T1 values and variabilities they measure are broadly consistent with, or discordant from, the prior literature.

Methods

Literature searching

Literature was searched manually using "Ovid Medline" (www.ovid.com) for “magnetic resonance imaging” AND “liver” AND “relaxation”. Additional literature reports were retrieved from citations, supplemented by a more intensive search for data with B0 = 4.7 T, 7 T, 9.4 T, 11.7 T, 14.1 T or 21.1 T (see supplementary material 1 for further details). Liberal inclusion criteria were employed: any report, in any language, which claimed to measure liver R1 or T1 was included, irrespective of methodology or study design. Studies where B0 was unclear, or where liver R1 or T1 was measured but not reported, were necessarily excluded. Studies using Look-Locker methods were included if they reported T1 or R1, but excluded if they reported an apparent T1* only. Human and rodent subjects were included if they were normal controls of any age, if the study reported normal parts of livers with focal disease, or if they were patients in whom no liver abnormality had been found. Studies of definitely pathological liver, suspected duplicates, and ex vivo studies were excluded.

Analysis

The mean and variance of R1 across all subjects in each study were estimated from the publications, with the coefficient of variation given by \({\text{CoV}}=\sqrt{\text{variance}}/{\text{mean}}\). Where measurements were made on the same subjects using the same method (repeatability), the weighted mean ± SD was used; however, where measurements were made on the same subjects using different methods (e.g., different field strengths), the measurements were treated as if from two different studies. Any R1 measurement method was allowed, as long as T1 (s) or R1 (s−1) was reported. Where T1 ± SD was reported, a point estimate of R1 was calculated as \({T}_{1}^{-1}\) and the between-subject variance in R1 was estimated (see supplementary material 2) as:

$$ 0.25\left( {\left( {\left( {T_{1} - SD} \right)^{ - 1} } \right) - \left( {\left( {T_{1} + SD} \right)^{ - 1} } \right)} \right)^{2} $$
(1)
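As a worked illustration of Eq. 1, the following minimal Python sketch converts a reported T1 ± SD into a point estimate of R1, its between-subject variance, and the CoV. The function name `r1_from_t1` and the example values (T1 = 0.8 s, SD = 0.08 s) are hypothetical and are not taken from any surveyed study.

```python
import math


def r1_from_t1(t1_mean, t1_sd):
    """Convert a reported T1 +/- SD (seconds) into R1 statistics.

    Returns (r1, variance, cov), where r1 = 1 / T1 and the variance
    follows Eq. 1: 0.25 * (1/(T1 - SD) - 1/(T1 + SD))**2.
    """
    if t1_mean <= 0 or t1_sd < 0 or t1_sd >= t1_mean:
        raise ValueError("require T1 > 0 and 0 <= SD < T1")
    r1 = 1.0 / t1_mean
    half_range = 0.5 * (1.0 / (t1_mean - t1_sd) - 1.0 / (t1_mean + t1_sd))
    variance = half_range ** 2          # identical to Eq. 1
    cov = math.sqrt(variance) / r1      # CoV = sqrt(variance) / mean
    return r1, variance, cov


# Hypothetical example: T1 = 0.8 s with SD = 0.08 s (10% CoV in T1)
r1, variance, cov = r1_from_t1(0.8, 0.08)
print(f"R1 = {r1:.3f} s^-1, variance = {variance:.5f} s^-2, CoV = {100 * cov:.1f}%")
```

Because the reciprocal is non-linear, the CoV in R1 obtained this way is slightly larger than the CoV in T1; for a fractional SD c in T1, it works out to c/(1 − c²).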

In a few cases, the between-subject variance in R1 was estimated from a bar or scatterplot depicted in the publication, or from the range rule [10]. To aggregate the data, individual studies were weighted by the inverse of their between-subject variance in R1. Studies with N = 1, or where a variance could not be extracted, were included in Figs. 1 and 2, but their R1 was assigned zero weight in the fits. In addition, a method to account for the well-known B0-dependence of liver R1 [11,12,13,14,15] was needed. Two methods of representing this B0 dependence were used: a heuristic log–log relationship, and a biophysical power-law model developed by Diakova et al. [12]. R1 was fitted to B0 using the weighted non-linear least squares function nls() in R [16] (see supplementary material 3). The fitted parameters in the heuristic were M and C:

$$ \log \left( {R_{1} } \right) = M\log \left( {{\text{B}}_{0} } \right) + C $$
(2)
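For readers who want to see how a weighted fit of the form of Eq. 2 behaves, the sketch below performs a closed-form weighted straight-line fit in log–log space. It is not the nls() procedure used in this study: it is a linearised variant, it assumes base-10 logarithms, and the data, weights, and names (`weighted_loglog_fit`, `TRUE_M`, `TRUE_C`) are synthetic and hypothetical. A zero weight mimics the treatment of studies with N = 1.

```python
import math


def weighted_loglog_fit(b0_values, r1_values, weights):
    """Weighted least-squares fit of log10(R1) = M * log10(B0) + C.

    Returns (M, C). Points with zero weight do not influence the fit.
    """
    x = [math.log10(b) for b in b0_values]
    y = [math.log10(r) for r in r1_values]
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Synthetic, noise-free data generated from hypothetical parameters
TRUE_M, TRUE_C = -0.36, 0.30
b0 = [0.5, 1.5, 3.0, 7.0]                      # field strengths in tesla
r1 = [10 ** (TRUE_C + TRUE_M * math.log10(b)) for b in b0]
w = [1.0, 4.0, 2.0, 0.0]                       # last study gets zero weight

slope, intercept = weighted_loglog_fit(b0, r1, w)
print(f"M = {slope:.4f}, C = {intercept:.4f}")
```

With noise-free synthetic data the fit recovers the generating parameters; with real data, the choice between weighting in R1 space (as in this study) and in log space would affect the result.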
Fig. 1 Log–log dependence of longitudinal relaxation rate on field strength. Blue: human; Red: rat; Green: mouse. Each symbol represents one study. Size of circle reflects number of subjects (some smaller symbols are occluded by larger symbols). Dashed black line: fit to Eq. 2. Solid black line: fit to Eq. 3 with \({R}_{1,\infty }\) = 0.213 s−1. The dotted line illustrates, for the benefit of investigators working at > 10 T, fits to Eq. 3 where \({R}_{1,\infty }\) was fixed at higher values of 0.4 s−1, 0.6 s−1, and 0.8 s−1, intermediate between 0.213 s−1 and the 0.9–1.0 s−1 value observed at 9.4 T in Table 1

Fig. 2 Dependence of longitudinal relaxation rate on field strength. Each symbol represents one study. Dashed black line: Eq. 2. Solid black line: Eq. 3. Dotted line: \({R}_{1,\infty }=0.213\ {\text{s}}^{-1}\)

The fitted parameters in the model were A and B:

$$ R_{1} = A\omega^{k} + B\tau_{D} \left[ {\ln \left( {1 + \left( {\omega \tau_{D} } \right)^{ - 2} } \right) + 4\ln \left( {1 + \left( {2\omega \tau_{D} } \right)^{ - 2} } \right)} \right] + R_{1,\infty } $$
(3)

where \({R}_{1,\infty }\) is the high-frequency asymptote, i.e., the extreme narrowing condition, set here to 0.213 s−1 at 310 K [17]; \({\tau }_{D}\) is the translational correlation time from Diakova et al. [12] adjusted for temperature to \(1.43\times {10}^{-11}\) s; \(k=-0.6\) also from Diakova et al. [12]; and \(\omega =2\pi \times 42.58\times {10}^{6}\times {\text{B}}_{0}\) s−1. In the summaries, lower (LQ) and upper (UQ) quartiles, and medians, are reported. For exploratory fits using other weightings, see Supplementary Material 4.

Results

Approximately 500 publication abstracts were read, from which around 270 publications were selected and reviewed. After exclusions, 116 publications remained, with publication dates between 1981 and 2020. Some publications reported multiple studies, or multiple groups within a single study, so that 143 studies were available to contribute to this analysis. These represented 3392 humans [1,2,3,4, 7, 11, 14, 15, 18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94], 99 mice [95,96,97,98,99,100,101,102,103,104,105] and 249 rats [5, 33, 105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126]. The number of subjects per study varied between 1 and 1037 (median 12). A very wide variety of T1 measuring methods was used. Frequently used approaches (see supplementary material 5) were inversion-recovery (18% of studies), saturation-recovery (21%) or variable-flip-angle (10%), which compare signal arising respectively when inversion time, repetition time, or flip angle are incremented. The median number of increments was 3 (range 2–20). Various read-outs were employed including spin-echo, gradient-echo, echo-planar or localized spectroscopy. Other studies employed variants of Look-Locker (24%) or MR fingerprinting (1%). Some studies reported that they suppressed fat, and/or corrected for iron-induced T1-shortening; some reported motion suppression, registration, triggering, gating or breath-hold; some reported B1 correction or phantom-based validation. Some studies analysed quite small regions of interest often avoiding blood vessels and bile ducts; others included most or all of the liver. Seventeen field strengths were included between 0.04 T and 9.4 T. 
No values were found in reports using B0 > 9.4 T: one report of \({T}_{1}^{*}=1.0\pm 0.1\mathrm{ s}\) at 14.1 T was excluded [127]. Figures 1 and 2 show plots of R1 against B0, in which R1 shows the expected decrease with increasing field; Table 1 gives values for the most important field strengths. The fit to Eq. 2 gave \(M=-0.3611\pm 0.0115\) and \(C=0.2956\pm 0.0073\). The fit to Eq. 3 gave \(A=(8.663\pm 0.681)\times {10}^{4}\) and \(B=(1.294\pm 0.082)\times {10}^{9}\). An exploratory attempt at a three-parameter fit to Eq. 3 (i.e., to A, B, and \({R}_{1,\infty }\)) failed to provide evidence for \({R}_{1,\infty }>0\) (supplementary material 4). When data were subgrouped by species or by method, no evidence was found that the subgroup R1 values deviated systematically from Eq. 3 (supplementary material 6). Across all studies, the median between-subject CoV was 9.1% (LQ 5.9%, UQ 16.5%, rms 17.0%). There was, however, a tendency for early studies to report high between-subject CoV (Fig. 3 and supplementary material 7): no study published after 1992 had CoV ≥ 20%, and for post-1992 studies the median between-subject CoV was 7.4% (LQ 5.6%, UQ 11.0%, rms 9.6%). In half of those studies, the measured R1 deviated from Eq. 3 by 8.0% or less (LQ 2.8%, UQ 16.6%).
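The fitted parameters can be used to predict liver R1 at any field strength. The following Python sketch evaluates Eq. 2 and Eq. 3 with the point estimates reported above, ignoring their uncertainties. Two assumptions are made here that the text does not state explicitly: the logarithm in Eq. 2 is taken as base 10, and the fitted A and B are taken to be in units consistent with ω in s−1 and R1 in s−1. The function names are hypothetical.

```python
import math

GAMMA_HZ_PER_T = 42.58e6   # 1H gyromagnetic ratio / (2*pi), Hz per tesla
TAU_D = 1.43e-11           # translational correlation time, s
K = -0.6                   # power-law exponent from Diakova et al.
R1_INF = 0.213             # high-frequency asymptote, s^-1
A_FIT, B_FIT = 8.663e4, 1.294e9     # fitted parameters of Eq. 3
M_FIT, C_FIT = -0.3611, 0.2956      # fitted parameters of Eq. 2


def r1_heuristic(b0):
    """Eq. 2, assuming base-10 logarithms: log10(R1) = M*log10(B0) + C."""
    return 10.0 ** (M_FIT * math.log10(b0) + C_FIT)


def r1_model(b0):
    """Eq. 3: power-law term + translational diffusion term + asymptote."""
    omega = 2.0 * math.pi * GAMMA_HZ_PER_T * b0
    wt = omega * TAU_D
    power_term = A_FIT * omega ** K
    diffusion_term = B_FIT * TAU_D * (
        math.log(1.0 + wt ** -2) + 4.0 * math.log(1.0 + (2.0 * wt) ** -2)
    )
    return power_term + diffusion_term + R1_INF


def percent_deviation(measured_r1, predicted_r1):
    """Signed deviation of a measured R1 from a prediction, in percent."""
    return 100.0 * (measured_r1 - predicted_r1) / predicted_r1


for b0 in (1.5, 3.0, 7.0):
    print(f"{b0:4.1f} T: Eq. 2 -> {r1_heuristic(b0):.3f} s^-1, "
          f"Eq. 3 -> {r1_model(b0):.3f} s^-1")
```

Under these assumptions, both expressions give about 1.7 s−1 at 1.5 T and about 1.3 s−1 at 3 T, which agrees with the observation that the two fits are very similar over the commonly used field strengths.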

Table 1 Preferred R1 values (s−1) for five commonly used field strengths, derived from the data and from the fits
Fig. 3 Within-study between-subject coefficient of variation as a function of year of publication

At each field strength, there was considerable variation in R1 between studies: the between-study CoV was 16% for post-1992 studies. Six publications [2, 37, 98, 119, 128, 129] also reported liver R1 repeatability (same subject, different scan, same measurement conditions): the rms CoV was 1.9%. These CoVs allowed a crude estimate (supplementary material 8) of the relative size of the three main variance components: repeatability variance contributed ~ 1%; within-study-between-subject variance contributed ~ 25%; and between-study variance contributed ~ 74%.
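One plausible reading of this decomposition can be sketched in a few lines of Python. The calculation below is an assumption, not the documented procedure (which is in supplementary material 8): it treats the squared CoVs as additive variances, takes the post-1992 rms within-study CoV (9.6%) to include repeatability error, and adds the between-study CoV (16%) on top. The function name `variance_fractions` is hypothetical.

```python
def variance_fractions(cov_repeat, cov_within_study, cov_between_study):
    """Split total variance into three fractional components.

    Assumes squared CoVs add, and that the observed within-study CoV
    already contains the repeatability error.
    """
    var_repeat = cov_repeat ** 2
    var_within_observed = cov_within_study ** 2
    var_between_subject = var_within_observed - var_repeat
    var_between_study = cov_between_study ** 2
    total = var_within_observed + var_between_study
    return {
        "repeatability": var_repeat / total,
        "between_subject": var_between_subject / total,
        "between_study": var_between_study / total,
    }


# CoVs (in %) reported in the Results for repeatability, post-1992
# within-study between-subject (rms), and post-1992 between-study
fractions = variance_fractions(1.9, 9.6, 16.0)
for name, value in fractions.items():
    print(f"{name}: {100 * value:.0f}%")
```

Under these assumptions the fractions come out at roughly 1%, 25%, and 74%, matching the figures quoted above.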

Discussion

In liver, as in pure water, both intramolecular and intermolecular water 1H-1H dipolar relaxation contribute to R1. Specific additional contributors to water 1H R1 in liver arise from 1H-1H dipolar relaxation between water and other molecules, and 1H-electron dipolar relaxation between water and various iron- or copper-containing substances or dioxygen. These 1H-containing and unpaired-electron-containing substances differ in concentration between subjects. The liver 1H resonance arises mostly from tissue water in hepatocytes. Other contributions come from water in other intracellular compartments (e.g., Kupffer cells, erythrocytes), and in extracellular compartments (e.g., bile, plasma, space of Disse). Signal from triglyceride and inflowing blood may contribute, depending on the sequence used. Macromolecules contribute to the signal, notably collagen and glycogen which have different concentrations in different subjects. These factors likely account for some of the variation between subjects and between studies.

Fits from the heuristic and from the model were very similar. The main difference is that the heuristic forces R1 to zero at infinite field, while the model forces R1 to asymptote in the extreme narrowing condition. This difference might become important at fields above 7 T (Fig. 1). In this study, following Diakova et al. [12], the asymptote \({R}_{1,\infty }\) was fixed at 1/4.7 s−1, equal to the R1 of pure deoxygenated water at 310 K at high field [17]: a slightly higher value would be more appropriate if R1 values from liver water and pure water do not converge as illustrated in Fig. 1.

The relative magnitude of the major variance components was estimated. This estimate is very crude and, given the heterogeneity and variable quality of the raw data, should be considered a rough guide only. The within-study between-subject CoV reflects not only repeatability error (~ 1% of the variance), but also the expected between-subject variation (~ 25% of the variance). Between-study variation (~ 74% of the variance) also includes between-population variation, together with bias from interactions between each study’s measurement method and its livers’ variation in flow, motion, fat, oedema, collagen, glycogen and iron. R1 may also change after a meal [89], during the menstrual cycle [25] or with drug treatment [25].

The literature survey was not fully PRISMA-compliant [130] and is unlikely to be complete. Studies explicitly of liver R1 or T1 as a biomarker are readily retrieved, because appropriate keywords are generally used in the title and abstract. However, for studies where liver R1 or T1 measurement is incidental to another objective, for example extracellular volume, relaxivity, or dynamic contrast-enhanced studies, suitable keywords may not have been included.

There is no single “correct” value for any liver’s 1H R1. R1 may vary spatially across the liver [60, 119]. Water 1H R1 is multiexponential, particularly with sequences where macromolecule-associated fast-relaxing water contributes to the measurement. Other substances in the liver may also contribute to the 1H signal, such as glycogen [87] or triglyceride [76, 131]. Inflowing blood [110, 132], physiologic motion [71], magnetization transfer, and iron affect the measured R1 in ways which depend both on the sequence and on the analysis employed. There may be systematic differences in R1 between fat-suppressed and non-fat-suppressed acquisitions; between 2D acquisitions, which are more vulnerable to inflow effects, and 3D acquisitions; between breath-hold or gated acquisitions and free-breathing acquisitions; and so on. Some investigators advocate the use of a “corrected” T1 to avoid bias caused by the relaxivity of iron-containing substances [65]. Because of these biases in the literature, studies which deviate from these survey data should not immediately be considered “incorrect”, but if large deviations are observed, then an explanation on methodological or physiological grounds should be sought.

There are some other limitations. While some publications reported carefully designed and conducted biomarker validation studies, in other publications, the precise value of T1 was only of incidental interest and possibly acquired with less care. However, in this survey, the study design and objectives were not incorporated into the weightings. Most studies did not report validation of their liver R1 by means of a phantom, so accuracy is unknown. It was difficult to explore the effect of methodology on R1, because some studies used methodology which was poorly described or did not appear robust, and because of correlation between field strength and methodology (old studies used old methodology and lower fields). Likewise, there was correlation between field strength and species (humans at low-medium fields, rats at medium–high fields and mice at high fields), so it was difficult to compare between species.

Conclusion

Quantitative relaxometry requires validation with phantoms and analysis of propagation of errors. However, it is also good scientific practice to compare one’s own findings with prior literature. An investigator who finds their average liver R1 in normal liver to be within 8% of the fit to Eq. 3, with between-subject CoV < 8%, can conclude that their measurements are in agreement with the majority of the literature: for measurements far outside these limits, a physiological or methodological explanation should be sought.
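The rule of thumb above can be expressed as a simple check. In this minimal Python sketch, `predicted_r1` is the Eq. 3 prediction for the investigator's field strength and must be supplied by the caller; the function name, argument names, and example numbers are hypothetical.

```python
def agrees_with_literature(measured_r1, predicted_r1, between_subject_cov,
                           max_deviation=0.08, max_cov=0.08):
    """Apply the survey's rule of thumb.

    measured_r1:          study-average liver R1 in normal liver, s^-1
    predicted_r1:         Eq. 3 prediction at the same field strength, s^-1
    between_subject_cov:  between-subject CoV as a fraction (0.07 = 7%)
    """
    deviation = abs(measured_r1 - predicted_r1) / predicted_r1
    return deviation <= max_deviation and between_subject_cov < max_cov


# Hypothetical study: mean R1 of 1.60 s^-1 against a prediction of
# 1.66 s^-1, with 7% between-subject CoV
print(agrees_with_literature(1.60, 1.66, 0.07))
```

A result of False does not by itself indicate an error in the measurement; it indicates that a physiological or methodological explanation is worth looking for.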