Introduction

Diffusion-weighted imaging (DWI) is a form of magnetic resonance (MR) imaging widely used to non-invasively evaluate the brain by measuring the displacement of water molecules in biological tissues in vivo [1, 2]. Among the available DWI techniques, diffusion tensor imaging (DTI) [3] is most commonly used to observe brain microstructural changes in neurological abnormalities [4, 5]. Although commonly used in neurological research, the use of DTI is limited in multiple ways. First, DTI is insufficient for modeling non-Gaussian diffusion scatter patterns in biological structures [6, 7]. Second, despite its sensitivity, DTI metrics are not tissue-specific. For example, a decrease in fractional anisotropy (FA) may be attributed to either or both of these: (A) loss of structural integrity (such as axonal loss or demyelination) and (B) increase in the complexity of tissue structure (such as increase in axon size and packing density and change in the degree of axonal dispersion). Finally, DTI is not the preferred method for evaluation of gray matter (GM) (particularly the cortex) because it cannot thoroughly describe microstructural abnormalities in GM due to isotropic water diffusion [8].

Neurite orientation dispersion and density imaging (NODDI) was introduced by Zhang et al. [9] to overcome the limitations of DTI. NODDI is a multi-compartment diffusion imaging model that measures microstructural metrics from multi-shell diffusion MRI data acquired using a clinical scanner within a clinically feasible period [9]. NODDI assumes a three-compartment biophysical tissue model, including intracellular (restricted diffusion; modeled by sticks), extracellular (hindered diffusion; modeled by parallel and perpendicular diffusion in an anisotropic tensor), and cerebrospinal fluid compartments (free diffusion; modeled by an isotropic tensor) within a single voxel based on the orientation-dispersed cylinder model in accordance with Watson distribution [9]. Intracellular volume fraction (ICVF) and orientation dispersion index (ODI) are the two main output metrics for NODDI that reflect neurite density and neurite orientation and dispersion, respectively, thereby disentangling the two facets of FA [9].
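For orientation, the three-compartment signal model described above can be written compactly. The notation below follows the formulation commonly given for NODDI in Zhang et al. [9]; the symbols themselves do not appear elsewhere in this article and are introduced here only for illustration. Here, A is the normalized diffusion signal; A_ic, A_ec, and A_iso are the signals of the intracellular, extracellular, and cerebrospinal fluid compartments; ν_ic corresponds to ICVF; ν_iso corresponds to the isotropic volume fraction; and κ is the concentration parameter of the Watson distribution.

$$ A=\left(1-{\nu}_{\mathrm{iso}}\right)\left({\nu}_{\mathrm{ic}}{A}_{\mathrm{ic}}+\left(1-{\nu}_{\mathrm{ic}}\right){A}_{\mathrm{ec}}\right)+{\nu}_{\mathrm{iso}}{A}_{\mathrm{iso}} $$

$$ \mathrm{ODI}=\frac{2}{\pi}\arctan \left(\frac{1}{\kappa}\right) $$

Under this definition, ODI ranges from 0 (perfectly aligned neurites, large κ) to 1 (fully dispersed neurites, κ approaching 0).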

NODDI metrics have been identified as useful diagnostic biomarkers for revealing microstructural changes in the brains of patients with Alzheimer’s disease [10], Parkinson’s disease [7, 11, 12], stroke [13], and multiple sclerosis [14, 15]. Recently, NODDI has been used to differentiate brain tumors [16] and explore white matter (WM) microstructure in very preterm-born children [17]. In addition, NODDI has been used to demonstrate neurite properties in the human cerebral cortex that are correlated with the myeloarchitecture [18]. ICVF has exhibited good correlations with the histological measurements of hyperphosphorylated tau levels in the GM of a human tauopathy mouse model, unlike the traditional DTI metrics of mean diffusivity (MD) and FA, which failed to exhibit any correlation [19].

Chung et al. [6], Huber et al. [20], and McCunn et al. [21] attempted to assess the scan–rescan reproducibility of NODDI metrics in 8 human subjects using 1.5-T and 3-T MR scanners, a group of children (ages 7–12 years) using a 3-T MR scanner, and 10 adult Sprague–Dawley rats using a 9.4-T MR scanner, respectively; they demonstrated that ICVF and ODI are highly reproducible. However, to the best of our knowledge, no study has explored the reproducibility of NODDI metrics in the human brain across different MR scanners from different vendors to date. Therefore, in this study, we aimed to evaluate the scan–rescan and inter-vendor reproducibility of NODDI metrics in WM and GM of healthy subjects using two 3-T MR scanners from two different vendors.

Materials and methods

Study participants

A total of 10 healthy subjects (7 males and 3 females; mean age 30 ± 7 years, range 23–37 years) with no history of neurological, psychiatric, or other systemic diseases were included in the study. The Institutional Review Board of Juntendo University Hospital, Tokyo, Japan approved this study, and all subjects gave written informed consent prior to participation.

Imaging protocol

Each subject was scanned in two sessions scheduled at least 1 day apart using two 3-T MRI scanners (Vantage Galan ZGO, Canon Medical Systems, Otawara, Japan [scanner A] and MAGNETOM Prisma, Siemens Healthcare, Erlangen, Germany [scanner B]), both located at one site. All subjects were scanned twice in each session to assess the scan–rescan reproducibility. Each volunteer was removed from the scanner briefly following the first acquisition and repositioned for the second acquisition.

Whole-brain DWI was acquired using a 2D multiband spin-echo echo-planar imaging (EPI) sequence [22] with b-values of 1000 and 2000 s/mm2, each with 64 motion-probing directions. Each DWI acquisition was completed with one b0 image without diffusion gradients. Standard and reverse phase-encoded blipped images without diffusion weighting (blip up and blip down) were also acquired to correct for magnetic susceptibility-induced distortions related to EPI acquisitions [23]. We also obtained a 3D T1-weighted image using magnetization-prepared rapid gradient echo (MPRAGE) with a 180° radiofrequency pulse. The sequence parameters of each scanner are shown in Table 1.

Table 1 Acquisition parameters

Pre-processing for diffusion MRI

The diffusion MRI data were corrected for susceptibility-induced geometric distortions, eddy current distortions, and inter-volume subject motion using EDDY and TOPUP toolboxes [23]. Next, all diffusion MRI data were visually checked for 64 different directions in axial, sagittal, and coronal planes for both scanners. We confirmed that all data were free from severe artifacts, such as gross geometric distortion, signal dropout, and bulk motion.

The resulting images were fitted to the NODDI model [9] using the NODDI MATLAB Toolbox 5 (http://www.nitrc.org/projects/noddi_toolbox); then, ICVF, ODI, and isotropic volume fraction (ISO) maps were generated. The diffusion tensor was estimated using ordinary least squares applied to diffusion-weighted images with b-values of 0 and 1000 s/mm2. FA, MD, axial diffusivity (AD), and radial diffusivity (RD) maps were then generated for all subjects using the DTIFIT tool implemented in functional magnetic resonance imaging of the brain (FMRIB) Software Library 5.0.9 (FSL, Oxford Centre for Functional MRI of the Brain, UK; www.fmrib.ox.ac.uk/fsl) to fit the tensor model to each voxel of the DWI data [3].
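As a reminder of how the four DTI scalars relate to the fitted tensor, the sketch below derives FA, MD, AD, and RD from the three tensor eigenvalues using their standard definitions. It is an illustration only, not the DTIFIT implementation; the function name `dti_scalars` and the example eigenvalues are hypothetical.

```python
from math import sqrt


def dti_scalars(l1, l2, l3):
    """Return FA, MD, AD, and RD from three diffusion tensor eigenvalues.

    Standard definitions are used; the eigenvalues may be given in any order
    and are sorted so that the largest one is treated as the axial direction.
    """
    ev = sorted((l1, l2, l3), reverse=True)
    md = sum(ev) / 3.0                      # mean diffusivity
    ad = ev[0]                              # axial diffusivity (largest eigenvalue)
    rd = (ev[1] + ev[2]) / 2.0              # radial diffusivity
    norm_sq = sum(l * l for l in ev)
    if norm_sq == 0:
        fa = 0.0                            # degenerate tensor: define FA as 0
    else:
        fa = sqrt(1.5 * sum((l - md) ** 2 for l in ev) / norm_sq)
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Hypothetical eigenvalues (mm^2/s), chosen only to illustrate an anisotropic voxel
example = dti_scalars(1.7e-3, 0.3e-3, 0.2e-3)
```

An isotropic tensor (three equal eigenvalues) gives FA = 0, whereas a tensor with a single non-zero eigenvalue gives FA = 1.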

Signal-to-noise ratio calculation

Signal-to-noise ratio (SNR) was calculated for each scanner using the single region of interest (ROI) approach and two b0 images with Camino [24]. Manual ROIs were drawn in the genu and splenium of the corpus callosum on the b0 images in the sagittal plane obtained on the first scan for each scanner (Fig. 1a). First, σdiff was calculated as follows:

$$ {\sigma}_{\mathrm{diff}}=\frac{\mathrm{stddev}\left({S}_{i1}-{S}_{i2},\dots, {S}_{N1}-{S}_{N2}\right)}{\sqrt{2}} $$

where S{i1} is the signal from voxel i of image 1, whereas S{i2} is the signal from the same voxel in image 2, and N represents the number of voxels in an ROI. Then, SNR was calculated as the mean signal from the ROI divided by σdiff, as follows:

$$ {\mathrm{SNR}}_{\mathrm{diff}}=\frac{\mathrm{mean}\left({S}_{i1}+{S}_{i2}\right)}{2.0\times {\sigma}_{\mathrm{diff}}} $$
Fig. 1

a Placement of regions of interest on b0 images of the genu (red) and splenium (green) of the corpus callosum for the measurement of signal-to-noise ratio. b Boxplots of group mean signal-to-noise ratio for scanners A and B in the genu and splenium. *Wilcoxon signed-rank test significance at P < 0.05
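The two SNR equations above can be sketched in a few lines of Python. This is a minimal illustration, not the Camino implementation used in the study: the function name `snr_diff` and the input format (two equal-length lists of ROI voxel signals, one per b0 image) are assumptions, and the sample standard deviation (n - 1 denominator) is assumed because the text does not state which estimator was used.

```python
from math import sqrt
from statistics import mean, stdev


def snr_diff(roi_image1, roi_image2):
    """Difference-method SNR for one ROI from two repeated b0 images.

    roi_image1[i] and roi_image2[i] are the signals of the same voxel i in
    image 1 and image 2 (hypothetical input format).
    """
    if len(roi_image1) != len(roi_image2):
        raise ValueError("both ROI samples must cover the same voxels")
    if len(roi_image1) < 2:
        raise ValueError("at least two voxels are required")

    # sigma_diff: standard deviation of the voxel-wise differences over sqrt(2)
    diffs = [s1 - s2 for s1, s2 in zip(roi_image1, roi_image2)]
    sigma_diff = stdev(diffs) / sqrt(2)  # assumption: sample standard deviation
    if sigma_diff == 0:
        raise ValueError("difference image has zero variance in this ROI")

    # SNR_diff: mean of the voxel-wise sums over (2 * sigma_diff)
    sums = [s1 + s2 for s1, s2 in zip(roi_image1, roi_image2)]
    return mean(sums) / (2.0 * sigma_diff)
```

Subtracting the two images removes the anatomical signal shared by both, so the spread of the differences reflects noise alone; dividing by the square root of 2 accounts for the noise of two images being combined in the difference.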

ROI analysis

DTI and NODDI values were measured for the whole WM and GM, subcortical GM, localized WM, and cortical regions. Whole WM and GM, subcortical GM, and cortical segmentation were performed with the FreeSurfer pipeline (http://surfer.nmr.mgh.harvard.edu/fswiki) as previously described [25] using the 3D MPRAGE T1-weighted images.

Whole WM and GM, subcortical GM (caudate, putamen, pallidum, thalamus, hippocampus, amygdala, and accumbens), and cortical (frontal, temporal, parietal, occipital, and cingulate) regions were then labeled using the Desikan–Killiany atlas [26]. For localized WM areas, FA maps of all subjects were first aligned to the FA template of the Johns Hopkins University International Consortium for Brain Mapping (JHU ICBM) using FMRIB’s nonlinear image registration tool [27]. The corresponding MD, AD, RD, ICVF, ODI, and ISO maps were subsequently aligned using the transformation parameters obtained from the FA maps. Localized WM regions (genu, body, and splenium of the corpus callosum; corticospinal tract; anterior and posterior limbs of the internal capsule; anterior, superior, and posterior corona radiata; posterior thalamic radiation; sagittal stratum; external capsule; superior longitudinal fasciculus; superior fronto-occipital fasciculus; and uncinate fasciculus) were labeled with the JHU ICBM-DTI-81 WM labels [28]. Lastly, each diffusion metric was averaged over each region delineated by those atlases for all subjects.
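The final averaging step can be sketched as a label-wise mean. This is an illustration only, assuming the atlas labels and the metric map have already been brought into the same space and flattened into two equal-length sequences; the function name `roi_means` and the convention that label 0 marks background are hypothetical.

```python
from collections import defaultdict


def roi_means(labels, values, background=0):
    """Average a diffusion metric over each atlas region.

    labels[i] is the integer atlas label of voxel i and values[i] is the
    metric (e.g., FA or ICVF) at that voxel. Background voxels are skipped.
    """
    if len(labels) != len(values):
        raise ValueError("label map and metric map must have the same size")

    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, value in zip(labels, values):
        if label == background:
            continue
        sums[label] += value
        counts[label] += 1

    # One mean value per region that contains at least one voxel
    return {label: sums[label] / counts[label] for label in sums}
```

The same function can be applied to every metric map of a subject, because all maps share the transformation estimated from the FA map.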

Statistical analysis

All statistical analyses were performed using IBM SPSS Statistics for Windows (version 22.0; IBM Corporation, Armonk, NY, USA). The Shapiro–Wilk test was used to assess the normality of the SNR data. Not all data were normally distributed; therefore, differences in SNR measured in the genu and splenium of the corpus callosum between scanners A and B from the first scan were analyzed using the Wilcoxon signed-rank test. The threshold for statistical significance was set at P < 0.05.

Coefficient of variation (CoV) was determined to evaluate the scan–rescan and inter-vendor reproducibility using the following equation:

$$ CoV\ \left(\%\right)=\left( Standard\ deviation/ Mean\right)\times 100 $$

The inter-vendor CoV was calculated using the average values from each scanner. For each subject, the inter-vendor CoV was calculated using the data of the first scan from each scanner, and the per-subject values were then averaged into a single inter-vendor CoV value. Similarly, the scan–rescan CoVs were calculated for each subject and then averaged across all subjects.
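The per-subject CoV and its average across subjects can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study: the function names are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption because the text does not specify the estimator.

```python
from statistics import mean, stdev


def cov_percent(measurements):
    """Coefficient of variation (%) of repeated measurements of one metric."""
    m = mean(measurements)
    if m == 0:
        raise ValueError("CoV is undefined when the mean is zero")
    return stdev(measurements) / m * 100.0  # assumption: sample standard deviation


def mean_cov_percent(per_subject_measurements):
    """Average the per-subject CoVs into one group value.

    per_subject_measurements holds one sequence per subject, e.g. (scan, rescan)
    for scan-rescan CoV or (scanner A, scanner B) for inter-vendor CoV.
    """
    return mean([cov_percent(pair) for pair in per_subject_measurements])
```

With two measurements per subject, the sample standard deviation reduces to the absolute difference divided by the square root of 2, so the CoV depends only on the relative difference between the two values.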

In addition, we calculated the intraclass correlation coefficient (ICC) with a 95% confidence interval. ICC values less than 0.50 were indicative of poor reliability, values between 0.50 and 0.75 were indicative of moderate reliability, values between 0.75 and 0.90 were indicative of good reliability, and values greater than 0.90 were indicative of excellent reliability [29].
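The reliability categories can be expressed as a small lookup. The sketch below is illustrative: the function name `classify_icc` is hypothetical, the upper bound of the "good" band (0.90) follows the guideline cited as [29], and the assignment of values lying exactly on a threshold is an assumption because the text does not specify it.

```python
def classify_icc(icc):
    """Map an ICC value to a reliability category.

    Assumption: boundary values 0.50 and 0.75 fall in the higher band, and
    0.90 itself is still "good" (only values greater than 0.90 are "excellent").
    """
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"
```

For example, an ICC of 0.744 would be classed as moderate and an ICC of 0.995 as excellent.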

Results

For the genu and splenium, the SNR of scanner A was significantly lower than that of scanner B (Fig. 1b). The scan–rescan DTI and NODDI maps of one healthy participant for the two scanners are represented in Fig. 2. Figures 3 and 4 present the mean values of DTI and NODDI metrics, respectively, for WM and GM.

Fig. 2

Diffusion tensor imaging (fractional anisotropy [FA], mean diffusivity [MD], axial diffusivity [AD], and radial diffusivity [RD]) and neurite orientation dispersion and density imaging (intracellular volume fraction [ICVF], orientation dispersion index [ODI], and isotropic volume fraction [ISO]) maps of one healthy subject

Fig. 3

Means and standard deviations of diffusion tensor imaging (fractional anisotropy [FA], mean diffusivity [MD], axial diffusivity [AD], and radial diffusivity [RD]) metrics across all subjects. Abbreviations: ACR anterior corona radiata, ALIC anterior limb of internal capsule, CC corpus callosum, CST corticospinal tract, GM gray matter, PCR posterior corona radiata, PLIC posterior limb of internal capsule, PTR posterior thalamic radiation, SCR superior corona radiata, SFOF superior fronto-occipital fasciculus, SLF superior longitudinal fasciculus, UF uncinate fasciculus, WM white matter

Fig. 4

Means and standard deviations of neurite orientation dispersion and density imaging (intracellular volume fraction [ICVF], orientation dispersion index [ODI], and isotropic volume fraction [ISO]) metrics across all subjects. Abbreviations: ACR anterior corona radiata, ALIC anterior limb of internal capsule, CC corpus callosum, CST corticospinal tract, GM gray matter, PCR posterior corona radiata, PLIC posterior limb of internal capsule, PTR posterior thalamic radiation, SCR superior corona radiata, SFOF superior fronto-occipital fasciculus, SLF superior longitudinal fasciculus, UF uncinate fasciculus, WM white matter

Table 2 shows the scan–rescan CoVs of DTI and NODDI metrics. In WM, the highest scan–rescan CoVs were 1.9% (FA), 3.3% (MD), 2.5% (AD), and 4.1% (RD) and 1.4% (ICVF), 3.8% (ODI), and 18.5% (ISO) for DTI and NODDI metrics, respectively, in scanner A. On the other hand, in scanner B, the CoVs were 0.9% (FA), 1.7% (MD), 1.3% (AD), and 2.0% (RD) and 1.5% (ICVF), 2.6% (ODI), and 9.1% (ISO) for DTI and NODDI metrics, respectively. In GM, the highest scan–rescan CoVs were 2.0% (FA), 2.7% (MD), 2.1% (AD), and 3.1% (RD), and 0.8% (ICVF), 0.8% (ODI), and 4.8% (ISO) for DTI and NODDI metrics, respectively, in scanner A. On the other hand, in scanner B, the CoVs were 0.5% (FA), 0.8% (MD), 0.8% (AD), and 0.8% (RD) and 0.8% (ICVF), 0.5% (ODI), and 3.5% (ISO) for DTI and NODDI metrics, respectively.

Table 2 Scan–rescan coefficient of variation (CoV [%]) across all subjects

Table 3 shows the inter-vendor CoVs of DTI and NODDI metrics. For all metrics, the inter-vendor CoVs were generally higher than the scan–rescan CoVs. In WM, the highest inter-vendor CoVs for DTI metrics were 2.6% (FA), 3.9% (MD), 5.3% (AD), and 5.9% (RD), whereas those for NODDI metrics were 7.0% (ICVF), 14% (ODI), and 38.9% (ISO). In GM, the highest inter-vendor CoVs for DTI metrics were 7.1% (FA), 4.7% (MD), 5.1% (AD), and 5.0% (RD), whereas those for NODDI metrics were 2.3% (ICVF), 4.3% (ODI), and 13.9% (ISO).

Table 3 Inter-vendor coefficient of variation (CoV [%]) across all subjects

Table 4 shows the scan–rescan ICCs of DTI and NODDI metrics. In WM, scanner A and scanner B demonstrated poor to excellent scan–rescan reproducibility of DTI (scanner A: FA [ICC = 0.744–0.995], MD [ICC = 0.777–0.968], AD [ICC = 0.964–0.994], and RD [ICC = 0.884–0.980]; scanner B: FA [ICC = 0.918–0.996], MD [ICC = 0.905–0.987], AD [ICC = 0.912–0.997], and RD [ICC = 0.926–0.997]) and NODDI metrics (scanner A: ICVF [ICC = 0.773–0.989], ODI [ICC = 0.910–0.996], and ISO [ICC = 0.211–0.945]; scanner B: ICVF [ICC = 0.909–0.987], ODI [ICC = 0.789–0.998], and ISO [ICC = 0.133–0.997]). In GM, scanner A and B also demonstrated poor to excellent scan–rescan reproducibility of DTI (scanner A: FA [ICC = 0.383–0.890], MD [ICC = 0.731–0.980], AD [ICC = 0.769–0.985], and RD [ICC = 0.690–0.974]; scanner B: FA [ICC = 0.810–0.984], MD [ICC = 0.593–0.956], AD [ICC = 0.450–0.983], and RD [ICC = 0.703–0.984]) and NODDI metrics (scanner A: ICVF [ICC = 0.668–0.952], ODI [ICC = 0.729–0.926], and ISO [ICC = 0.396–0.975]; scanner B: ICVF [ICC = 0.812–0.963], ODI [ICC = 0.929–0.976], and ISO [ICC = 0.915–0.972]).

Table 4 Scan–rescan intraclass correlation coefficient across all subjects

Table 5 shows the inter-vendor ICCs of DTI and NODDI metrics. In WM, DTI and NODDI metrics showed poor to excellent inter-vendor reproducibility (DTI: FA [ICC = 0.538–0.973], MD [ICC = 0.214–0.890], AD [ICC = 0.119–0.929], and RD [ICC = 0.411–0.949]; NODDI: ICVF [ICC = 0.300–0.935], ODI [ICC = 0.181–0.962], and ISO [ICC = 0.013–0.545]). In GM, DTI metrics showed poor to moderate inter-vendor reproducibility (FA [ICC = 0.013–0.528], MD [ICC = 0.095–0.488], AD [ICC = 0.133–0.416], and RD [ICC = 0.084–0.596]) and NODDI metrics showed poor to excellent inter-vendor reproducibility (ICVF [ICC = 0.395–0.849], ODI [ICC = 0.043–0.580], and ISO [ICC = 0.092–0.903]).

Table 5 Inter-vendor intraclass correlation coefficient across all subjects

Discussion

This study explored the scan–rescan and inter-vendor reproducibility of NODDI metrics (ICVF, ODI, and ISO) obtained using two MR scanners from different vendors in a single-institution setting. In the CoV and ICC analyses, NODDI metrics (ICVF and ODI) in the WM and GM demonstrated scan–rescan and inter-vendor reproducibility comparable to that of DTI metrics (FA, MD, AD, and RD). In general, however, the reproducibility of ISO was lower than that of the other measured metrics. Also, the inter-vendor reproducibility of all metrics was lower than the scan–rescan reproducibility.

In contrast to a study by Chung et al. [6], who demonstrated higher scan–rescan CoVs for NODDI metrics than for DTI metrics in the human brain, our study showed that the scan–rescan reproducibility of NODDI metrics using both scanners is comparable with that of DTI metrics (NODDI: ICVF = 0.3–1.5%, ODI = 0.2–3.8% and DTI: FA = 0.2–2.0%, MD = 0.2–3.3%, AD = 0.1–2.5%, and RD = 0.2–4.1%). Overall, our study also found lower CoVs of NODDI than those reported by Chung et al. [6] (0.6–7.3%). These results possibly indicate that a pulse sequence with higher angular resolution (64 directions in this study vs. 20 directions in the previous study) provides more robust diffusion estimates [30]. Our results are consistent with those reported in recent studies that assessed the test–retest reproducibility of DTI metrics acquired for human subjects with 3-T MR scanners using 30 [2] and 64 gradient directions [31] and showed CoVs of < 7% and < 5%, respectively, in whole WM.

DTI and NODDI metrics had higher scan–rescan reproducibility than inter-vendor reproducibility, possibly reflecting cross-scanner differences in the absolute measures of diffusivity, but these results might also be secondary to biological variability [32]. Scan–rescan was done on the same day, but scans on different MR scanners were done on different days. This might have also contributed to the higher inter-vendor differences. Further, the scanners used different head coils (scanner A: 32 channels and scanner B: 64 channels). The differences between the coils affect unfolding methods and performance in parallel and multiband imaging [33]. A larger number of channels in a coil intrinsically leads to higher SNR, particularly in surface regions [34]. In addition, there may have been differences in imaging conditions and environments depending on settings for each vendor that may also explain the inter-vendor differences in terms of SNR. In fact, the SNRs of scanner B for the genu and splenium of the corpus callosum were significantly higher than those of scanner A. Lower SNR has been shown to cause bias in the measurement of diffusion measures [35]. Thus, the lower SNR of scanner A may also have contributed to lower inter-vendor reproducibility. Indeed, the scan–rescan CoVs of scanner A were relatively higher than those of scanner B.

Generally, in agreement with the study by Chung et al. [6] investigating scan–rescan reproducibility of NODDI at 1.5 T and 3 T, the reproducibility of NODDI metrics was lower in WM and higher in GM than that of DTI metrics. It has previously been speculated that NODDI metrics are noisier than DTI metrics for modeling WM because NODDI is a more complex model requiring higher b-values [6]. In addition, cardiac pulsation, which leads to intra-voxel dephasing and inaccurate estimation of anisotropy parameters and tensor orientation, possibly increased the variability of NODDI metrics, particularly ODI, in WM [6]. In contrast, NODDI seems to be more robust than DTI in the evaluation of GM, which may be because NODDI metrics serve as a more direct marker for the complex and heterogeneous neurobiological features of GM [9, 18].

In line with previous studies [6, 21], ISO in the WM and GM was found to have low scan–rescan and inter-vendor reproducibility compared with the other measured metrics, which may be because ISO is highly susceptible to noise [6, 21]. Improvements in SNR are therefore expected to increase the reproducibility of ISO. Indeed, in this study, the reproducibility of ISO was higher in scanner B than in scanner A.

Consistent with the observations reported by Zhang et al. [9] and Chung et al. [6], the NODDI maps in our study reflected a spatial pattern of tissue distribution (Figs. 3 and 4) consistent with the known brain anatomy. Values of ICVF, the index of neurite density, were lower for GM than for WM and, as expected, values of ODI, the index of orientation dispersion, were lower in WM and higher in GM (e.g., in WM, the highest ICVF value and lowest ODI value were found in the corpus callosum). Furthermore, FA has been shown to be highly influenced by orientation dispersion [36]. In our study, in line with previous studies [6, 9], ODI and FA exhibited regional variations that were inversely correlated with each other (Figs. 3 and 4).

A major limitation of the present study is the small number of participants who were scanned at a single institution using MRI scanners from only two vendors. To reduce the acquisition time in a clinically feasible manner, DWI data were obtained using a multiband EPI sequence. However, at our institution, the multiband EPI sequence is installed only in MRI scanners from vendors A and B; therefore, we could not expand the study to include additional vendors. Thus, a multi-site study with a larger sample size and more scanners might be needed to demonstrate the robustness of NODDI. In addition, this study was performed using b-values of 1000 and 2000 s/mm2, whereas the optimized NODDI protocols use b-values of 711 and 2855 s/mm2 (acquired in approximately 30 min). However, Zhang et al. [9] demonstrated that there is no significant loss in the accuracy of the metrics using the current protocol. Furthermore, our protocol can be performed in a shorter time (< 15 min), which makes it more feasible in the clinical setting.

Conclusion

In this study, NODDI demonstrated excellent scan–rescan reproducibility that was comparable with DTI. However, lower inter-vendor reproducibility of DTI and NODDI in some areas of the brain indicates that data acquired from different MRI scanners should be carefully interpreted.