Deformable MR-CT image registration using an unsupervised, dual-channel network for neurosurgical guidance
Introduction
Navigation relative to preoperative 3D imaging is prevalent in a wide spectrum of neurosurgical treatments, including tumor biopsy (Oppido et al., 2011), cyst resection (Sribnick et al., 2014), hydrocephalus (Spennato et al., 2007), and deep brain stimulation (Groiss et al., 2009; Laxton et al., 2010). Such surgeries are commonly performed through a cranial burr hole and/or endoscopically via the lateral or third ventricles for access to deep brain structures. Magnetic resonance (MR) imaging (commonly T1-weighted MR) offers clear delineation of white and gray matter, cerebrospinal fluid (CSF), and subcortical structures and is the basis for preoperative planning (e.g., segmentation of the target, eloquent brain, and vessels as well as definition of desired electrode trajectories). Intraoperative CT provides high-resolution visualization of bone and instrumentation during surgery but offers limited soft-tissue contrast.
Even with a minimally invasive approach, deep brain deformations induced by egress of CSF and introduction of instrumentation present a challenge to accurate navigation. Conventional neuro-navigation using stereotactic frames and rigid registration between preoperative MR and intraoperative CT does not address such nonrigid motion, and deformation of deep brain targets of up to 10 mm (Nowell et al., 2014) is associated with inaccurate targeting and device placement (Nabavi et al., 2001). Deformable registration solves for the nonlinear transformation that establishes anatomical correspondence between MR and CT images and carries preoperative planning into the intraoperative coordinates. A number of methods (Denis de Senneville et al., 2016; Han et al., 2018; Modat et al., 2010; Reaungamornrat et al., 2016; Rueckert et al., 2006) solve multi-modality deformable registration via iterative numerical optimization, but their high computational load tends to carry long runtimes that limit utility within the intraoperative workflow. Recent deep learning-based registration methods demonstrate robustness and fast runtime compared to conventional methods, making them important candidates for further development and translation to clinical application.
Deep learning-based deformable registration methods often use convolutional neural networks (CNNs) to predict either a set of deformation parameters or a full deformation field. Depending on the type of annotation available in training data, deep learning registration approaches can be broadly categorized as:
- (i)
Supervised learning. Supervised learning requires the training dataset to include ground-truth deformation fields. Since the performance of registration depends on the quality of the ground-truth definition, this approach can be limited by the accuracy of the conventional registration used to obtain ground truth (Cao et al., 2018b; Sokooti et al., 2017; Yang et al., 2017). Alternatively, ground-truth can be defined via simulated deformations (Eppenhof and Pluim, 2019; Mahapatra et al., 2018; Sun et al., 2018).
- (ii)
Weakly supervised learning. Weakly supervised learning methods perform optimization on image surrogates, such as segmentation maps or landmarks. For example, Hu et al. (2018) and Xu and Niethammer (2019) demonstrated networks trained to maximize the alignment between tissue labels. Alternatively, Blendowski et al. (2020) used a shape encoder-decoder network to extract cardiac shape representations as a basis for registration. The time-consuming nature of tissue labeling and the dependence of the resulting network's performance on the accuracy of those labels are well recognized.
- (iii)
Unsupervised learning. To overcome the limitations of supervised and weakly supervised learning, unsupervised learning methods have been developed that learn to minimize a loss between the fixed and registered images. Loss functions are often based on either similarity metrics such as sum of squared differences (SSD) and normalized cross-correlation (NCC) (Balakrishnan et al., 2018; Cao et al., 2018a; Dalca et al., 2018), or neural network-based “deep metrics” (Haskins et al., 2019; Niethammer et al., 2019).
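For intuition, the two classical similarity terms can be sketched in a few lines of NumPy. This is a minimal illustration of their global forms (VoxelMorph and related methods typically use a locally windowed NCC; the function names here are illustrative, not from the paper):

```python
import numpy as np

def ncc(fixed, warped, eps=1e-8):
    """Global normalized cross-correlation between two images.

    Returns a value in [-1, 1]; an unsupervised registration network
    would minimize the negative of this as its similarity loss.
    """
    f = fixed - fixed.mean()
    w = warped - warped.mean()
    return float((f * w).sum() / (np.sqrt((f * f).sum() * (w * w).sum()) + eps))

def ssd(fixed, warped):
    """Sum of squared differences -- the simplest mono-modality loss."""
    return float(((fixed - warped) ** 2).sum())

# a perfectly registered pair correlates perfectly
img = np.random.rand(8, 8, 8)
print(round(ncc(img, img), 4))  # -> 1.0
print(ssd(img, img))            # -> 0.0
```

Note that both metrics assume the two images share an intensity relationship, which is why they suit mono-modality registration but fail directly across MR and CT.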
While a considerable amount of previous work has focused on deep learning-based deformable registration within a single imaging modality (e.g., MR-to-MR registration), multi-modality registration presents a challenging problem. Multi-modality registration commonly relies on some degree of supervision, either ground-truth deformation fields or labeled landmarks/segmentations. Unsupervised, multi-modality deformable registration approaches have been demonstrated that optimize a multi-modality similarity metric, such as mutual information (MI) (Che et al., 2019; Guo, 2019). Such metrics, however, can be insensitive to local spatial information, which can diminish registration accuracy compared to mono-modality metrics.
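The insensitivity of MI to local spatial information follows from its construction: it is estimated from the global joint intensity histogram, so any rearrangement of voxels that preserves that histogram leaves MI unchanged. A hypothetical histogram-based sketch:

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information estimated from a joint intensity histogram.

    Robust across modalities, but computed from global intensity
    statistics only -- it carries no local spatial information.
    """
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)          # marginal of a
    py = pxy.sum(axis=0)          # marginal of b
    nz = pxy > 0                  # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(0)
mr = rng.random(10000)
ct = 1.0 - mr                     # deterministically related "modalities"
noise = rng.random(10000)         # unrelated intensities
print(mutual_information(mr, ct) > mutual_information(mr, noise))  # -> True
```

The functional, intensity-to-intensity relationship between the simulated "modalities" yields high MI despite the inverted contrast, which is exactly why MI is the default choice for multi-modality registration.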
To mitigate challenges associated with multi-modality similarity metrics, a popular approach is to convert multi-modality registration to mono-modality registration via image synthesis, allowing optimization according to a mono-modality metric. For example, Liu et al. (2019), Tanner et al. (2018), Wei et al. (2019), and Yang et al. (2020a, 2020b) used Generative Adversarial Networks (GANs) to generate synthetic CT from MR images and perform mono-modality CT registration. Similarly, Xu et al. (2020) further fused the multi-modality MR-CT and mono-modality CT registration into a single prediction. Such methods, however, use only MR-to-CT synthesis, omitting the inverse (CT-to-MR) direction. Alternatively, Qin et al. (2019) used disentangled networks to decouple images into shape and appearance representations, and mono-modality registration was performed on the resulting shape representations.
The method reported below extends previous work using image synthesis for unsupervised, multi-modality MR-CT registration. Inspired by multi-modality and mono-modality fusion (Xu et al., 2020) and multi-channel registration (Chen et al., 2017; Fan et al., 2019), this work utilizes MR-CT synthesis to reduce the problem to two mono-modality registrations in the MR and CT domains, which are subsequently fused for the final estimate of the deformation. Contributions of this work include:
- (i)
A novel unsupervised, deformable registration network is proposed for MR-CT registration to provide guidance in minimally invasive neurosurgery. The network contains two subnetworks: (1) an image synthesis subnetwork to generate synthetic MR/CT images from the input image pairs; and (2) a dual-channel registration subnetwork that predicts the deformations in MR and CT channels and fuses the two into a final diffeomorphic deformation field.
- (ii)
The image synthesis subnetwork implements a novel probabilistic CycleGAN that generates both the synthetic images and the associated uncertainty. Instead of global averaging of the dual-channel registration loss functions as in conventional dual-channel registration (Chen et al., 2017), the uncertainties are used to provide a principled, spatially varying weighting of the dual channels.
- (iii)
An end-to-end training strategy is employed to jointly optimize image synthesis and registration subnetworks, which guides the synthesis subnetwork in generating intermediate representations that are advantageous to the task of deformable registration.
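The spatially varying weighting in contribution (ii) can be sketched as inverse-variance fusion of the two channel residuals. This is an illustrative reimplementation under assumed conventions (inverse-variance weights, hypothetical array names), not the paper's exact loss:

```python
import numpy as np

def fused_loss(res_mr, res_ct, var_mr, var_ct):
    """Spatially varying fusion of MR- and CT-channel residuals.

    res_*: per-voxel registration residuals (e.g., squared differences).
    var_*: per-voxel synthesis uncertainty (aleatoric variance) maps.
    Voxels where a synthetic image is unreliable (high variance)
    contribute less in that channel -- unlike a global average,
    which weights both channels equally everywhere.
    """
    w_mr = 1.0 / (var_mr + 1e-6)
    w_ct = 1.0 / (var_ct + 1e-6)
    per_voxel = (w_mr * res_mr + w_ct * res_ct) / (w_mr + w_ct)
    return float(per_voxel.mean())

res_mr = np.full((4, 4, 4), 2.0)
res_ct = np.full((4, 4, 4), 6.0)
same = fused_loss(res_mr, res_ct, np.ones_like(res_mr), np.ones_like(res_ct))
ct_unreliable = fused_loss(res_mr, res_ct, np.ones_like(res_mr),
                           1e6 * np.ones_like(res_ct))
print(round(same, 3))           # -> 4.0 (equal weighting = global average)
print(round(ct_unreliable, 3))  # -> 2.0 (falls back to the MR channel)
```

With equal variances the fusion reduces to the conventional global average; as one channel's synthesis uncertainty grows, the loss degrades gracefully toward the reliable channel.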
The paper is organized as follows: in Section 2, the details of the proposed method are described along with an end-to-end training strategy; Sections 3 and 4 present the experimental methods, ablation studies (variations of the algorithm with and without dual-channel fusion and uncertainty weighting), and results comparing the proposed method to two baseline algorithms (symmetric normalization (Avants et al., 2008) and VoxelMorph (Balakrishnan et al., 2018)); and Section 5 demonstrates the effects of dual-channel fusion and end-to-end training. The proposed deformable registration method is tested on a spectrum of datasets, including datasets with a broad variety of simulated deformations, real deformations from longitudinal studies acquired over long time intervals, and real deformations induced by neurosurgical intervention.
Section snippets
Algorithmic methods
An unsupervised deformable registration framework is proposed for registering preoperative MR images to intraoperative CT images. Let m be the moving preoperative MR image and f be the fixed intraoperative CT image, defined over a 3D spatial domain Ω. The two images are first rigidly aligned (alternatively, affine registration if scaling or skew difference is observed) as a preprocessing initialization step, such that the network only learns the nonlinear local deformation – an essential
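The composition of a rigid initialization with a learned residual displacement can be sketched as follows (hypothetical shapes and names; nearest-neighbor resampling for brevity, where a real implementation would interpolate trilinearly):

```python
import numpy as np

def warp(moving, rigid, disp):
    """Resample `moving` through a rigid pre-alignment composed with a
    learned local displacement field.

    rigid: 3x4 matrix [R | t] from the preprocessing initialization.
    disp:  per-voxel displacement of shape (3, D, H, W); only this
           residual, nonlinear part has to be learned by the network.
    """
    shape = moving.shape
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in shape],
                                indexing="ij"), axis=0).astype(float)
    # rigid part: x' = R x + t, applied to every voxel coordinate
    coords = np.einsum("ij,jdhw->idhw", rigid[:, :3], grid) \
             + rigid[:, 3, None, None, None]
    coords = coords + disp                 # nonlinear residual on top
    idx = np.clip(np.rint(coords).astype(int), 0,
                  np.array(shape)[:, None, None, None] - 1)
    return moving[idx[0], idx[1], idx[2]]

# identity rigid transform + zero displacement leaves the image unchanged
vol = np.random.rand(5, 5, 5)
identity = np.hstack([np.eye(3), np.zeros((3, 1))])
out = warp(vol, identity, np.zeros((3, 5, 5, 5)))
print(np.allclose(out, vol))  # -> True
```

Factoring the transformation this way keeps the global pose out of the learning problem, so the network's output space is restricted to local deformations near the identity.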
Image datasets
Three datasets were used in training, validation, and testing of the proposed method. The first dataset contained 50 paired T1-weighted MR and CT images acquired on the same day without neuro-intervention or evidence of deformation. These images were used to create a large dataset with simulated deformations as detailed below. A second dataset consisted of 9 MR images with the same MR scan protocols as the first dataset along with 9 corresponding CT images with real deformations that were
MR-CT image synthesis
The performance of the intermediate MR-CT synthesis from probabilistic CycleGAN was first examined. A series of CycleGAN models were trained in this work according to Table 1, including sequential (SEQ) training and the end-to-end (E2E) training variations (E2E:CT, E2E:MR, and E2E:2CH+U). This section details the results from SEQ training, providing a baseline evaluation for the case in which probabilistic CycleGAN is trained on its own (separate from registration). Performance comparison of
Single-channel vs. dual-channel registration
The performance of the dual-channel registration with uncertainty weighting compared to the ablation variations (SEQ:CT, SEQ:MR, SEQ:2CH, and SEQ:2CH+U) demonstrated several findings with respect to the effects of MR, CT, and the combination of MR and CT on deformable image registration. First, using the MR channel alone (SEQ:MR) showed higher performance than using the CT channel alone (SEQ:CT) except for registration of the lateral ventricles, where comparable performance was achieved from
Conclusions
An unsupervised, dual-channel network for MR-CT deformable registration was reported. The method uses a probabilistic CycleGAN for MR-CT image synthesis and a dual-channel registration to predict and fuse the deformation field in both MR and CT channels. The image synthesis uncertainties, a representation of the aleatoric uncertainty, are used as spatially varying weights to balance the contributions of the MR and CT channel registration loss functions. In addition to a conventional sequential
CRediT authorship contribution statement
R. Han: Conceptualization, Methodology, Software, Investigation, Writing – original draft. C.K. Jones: Validation, Writing – review & editing. J. Lee: Data curation, Supervision, Validation, Writing – review & editing. P. Wu: Resources, Writing – review & editing. P. Vagdargi: Resources, Writing – review & editing. A. Uneri: Resources, Writing – review & editing. P.A. Helm: Supervision. M. Luciano: Supervision. W.S. Anderson: Validation, Supervision. J.H. Siewerdsen: Supervision, Writing –
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This research was supported by NIH Grant U01-NS-107133 and academic-industry partnership with Medtronic Inc. (Littleton, MA).
References (56)
- et al., Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain, Med. Image Anal. (2008)
- et al., Cross contrast multi-channel image registration using image synthesis for MR brain images, Med. Image Anal. (2017)
- et al., A deep learning framework for unsupervised affine and deformable image registration, Med. Image Anal. (2019)
- et al., BIRNet: brain image registration using dual-supervised fully convolutional networks, Med. Image Anal. (2019)
- et al., 3D Slicer as an image computing platform for the quantitative imaging network, Magn. Reson. Imaging (2012)
- et al., Weakly-supervised convolutional neural networks for multimodal image registration, Med. Image Anal. (2018)
- et al., Evaluation of 14 nonlinear deformation algorithms applied to human brain MRI registration, Neuroimage (2009)
- et al., Robust whole-brain segmentation: application to traumatic brain injury, Med. Image Anal. (2015)
- et al., Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation, Med. Image Anal. (2020)
- et al., Neuroendoscopic colloid cyst resection: a case cohort with follow-up and patient satisfaction, World Neurosurg. (2014)
- Quicksilver: fast predictive image registration – a deep learning approach, Neuroimage
- A log-Euclidean framework for statistics on diffeomorphisms, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- VoxelMorph: a learning framework for deformable medical image registration, IEEE Trans. Med. Imaging
- Multimodal 3D medical image registration guided by shape encoder–decoder networks, Int. J. Comput. Assist. Radiol. Surg.
- Deep learning based inter-modality image registration supervised by intra-modality similarity, MLMI 2018: Machine Learning in Medical Imaging
- Deformable image registration using a cue-aware deep regression network, IEEE Trans. Biomed. Eng.
- Deep group-wise registration for multi-spectral images from fundus images, IEEE Access
- Unsupervised learning for fast probabilistic diffeomorphic registration
- EVolution: an edge-based variational method for non-rigid multi-modal image registration, Phys. Med. Biol.
- Pulmonary CT registration through supervised learning with convolutional neural networks, IEEE Trans. Med. Imaging
- Deep brain stimulation in Parkinson's disease, Ther. Adv. Neurol. Disord.
- Multi-Modal Image Registration With Unsupervised Deep Learning
- Deformable MR-CT image registration using an unsupervised synthesis and registration network for neuro-endoscopic surgery, Medical Imaging 2021: Image-Guided Procedures
- Learning deep similarity metric for 3D MR–TRUS image registration, Int. J. Comput. Assist. Radiol. Surg.
- Difficulty-aware hierarchical convolutional neural networks for deformable registration of brain MR images, Med. Image Anal.
- Image-to-image translation with conditional adversarial networks, Proceedings – 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Cited by (22)
- Deformable registration of preoperative MR and intraoperative long-length tomosynthesis images for guidance of spine surgery via image synthesis, Computerized Medical Imaging and Graphics (2024)
- Real-time motion management in MRI-guided radiotherapy: current status and AI-enabled prospects, Radiotherapy and Oncology (2024)
- Few-shot multi-modal registration with mono-modal knowledge transfer, Biomedical Signal Processing and Control (2023)
- NCCT-CECT image synthesizers and their application to pulmonary vessel segmentation, Computer Methods and Programs in Biomedicine (2023)
- QACL: Quartet attention aware closed-loop learning for abdominal MR-to-CT synthesis via simultaneous registration, Medical Image Analysis (2023). Citation excerpt: "To this end, if sufficient paired abdominal MR-CT images are available, it would be more valuable to overcome the challenges in the supervised mode. Recently, some innovative studies have used the MR-to-CT synthesis to convert the MR-CT registration into pCT-CT registration (Han et al., 2022; Cao et al., 2017; Fu et al., 2020a; McKenzie et al., 2020; Wei et al., 2020) for neurosurgical guidance. For example, Wei et al. (2020) first used a CycleGAN model with a mutual information constraint to generate the pCT images, and then the registration of MR and intra-procedural CT images was carried out using a classical tool of the ANTS software and an unsupervised registration network."
- CDFRegNet: a cross-domain fusion registration network for CT-to-CBCT image registration, Computer Methods and Programs in Biomedicine (2022). Citation excerpt: "However, Liang et al. [34] only synthesized images in a single direction (CBCT-to-CT), and the inverse direction was omitted. Han et al. [36] proposed probabilistic CycleGAN for multi-modal diffeomorphic registration, in which bi-direction image synthesis and dual-channel registration network were employed. These domain-translation-based methods did reduce the difficulty of CT-CBCT registration by image synthesis, as the intensity of the synthetic CT is more similar to CT than CBCT."