Background/Introduction

The number of endovascularly treated patients suffering from stroke or aneurysms has steadily increased over the last few years [1]. In 2015, five large randomized stroke trials established the superiority of mechanical thrombectomy in combination with intravenous recombinant tissue plasminogen activator (i.v. rt-PA) over i.v. rt-PA alone [2,3,4,5,6]. Since then, the number of treated patients has risen continuously. Moreover, the time window has recently been broadened [7, 8], which will further increase the number of patients eligible for interventional treatment. Although mechanical thrombectomy is performed in more and more patients at centers with great expertise, demand is increasing for an on-call service around the clock at multiple smaller centers. Hence, a sufficient number of interventional neuroradiologists must be trained at a high level. Using high-fidelity simulators offers the possibility of structured training without endangering patients [9, 10]. Although the first prototypes of simulators specifically for neuroradiological applications were developed many years ago [11, 12], they are not yet widely used in neuroradiology. After proof of face and content validity based on expert opinion [13], the next crucial step in establishing such a new approach is the demonstration of its construct validity, i.e. beginners and advanced users can be distinguished via the parameters measured by the simulator. Only when this prerequisite is met should such a simulator be used for training interventionalists. Concerning interventional procedures, construct validity was previously shown for renal [14, 15] and cardiac procedures [16, 17] as well as for carotid stenting [18, 19].
Since the validity of a simulator of another make, the ANGIO Mentor Express (Simbionix, Cleveland, OH, USA), has already been proven for neuroangiographies [20], the present study aimed at establishing the construct validity of the Mentice VIST simulator. Construct validity, however, is not directly related to the effectiveness of a simulator. To be used as an effective training tool in practice, improvements in psychomotor skills should be measurable [21, 22]. Therefore, the second aim of the study was to quantify the improvements in different quality measures of a diagnostic neuroangiography that result from repeated simulator training. Of interest was, in addition to an already confirmed training effect for beginners and intermediates [10, 20, 23, 24], whether this impact could also be demonstrated for experts.

Material and Methods

Hardware

All neuroangiographic simulations were performed with the vascular intervention system trainer VIST C from Mentice (Mentice AB, Gothenburg, Sweden), integrated in the VIST LAB. This is a stationary unit whose surface is designed in the form of a patient silhouette (Fig. 1). The VIST LAB consists of a control screen with a touch screen function for selecting scenarios and materials, plus three additional monitors. On two of them, the examiner can follow the live image in two planes; on the third, the recorded series can be scrolled through. A laptop and the simulator itself are positioned under the surface. The only visible detail of the simulator is an insertion sheath at the height of the manikin’s groin, which serves as the access point to the simulator for all materials. Pushing, pulling, and rotating movements of the introduced wires and catheters are detected by three sensors, which can also generate resistance as force feedback. These movements are transferred into a virtual patient anatomy, consisting of arteries and bones, and displayed on the screens in real time. A foot switch with two pedals, a panel with several options for controlling the virtual C-arms and other settings (shutter, zoom, etc.), and a tube with a syringe for the injection of air, simulating the injection of contrast medium, are also connected to the simulator.

Fig. 1 Simulation set-up with VIST LAB and eye-tracking camera (heart rate belt covered by shirt)

Parameters stored by the device are total duration of the procedure, total fluoroscopy time, number of series, total series time, amount of contrast agent, and number of material changes.

Endovascular Devices

Real angiographic materials, namely a hydrophilic wire (0.035″ Glidewire, Terumo, Somerset, NJ, USA) and a 5F vertebral diagnostic catheter (VER, Cordis, Santa Clara, CA, USA), were used. All terminal curvatures had been cut off as required by the manufacturer of the simulator.

Software

The source data of magnetic resonance (MR) angiographies of real patients with different anatomical types of the aortic arch were segmented semi-automatically using IntelliSpace Portal (Philips, Best, The Netherlands). Each resulting 3D model was imported into the simulator as a stereolithography (STL) file. The integrated Case-it software module (Mentice AB, Gothenburg, Sweden) connected the model of the aortic arch and the supra-aortic branches to a template of the descending aorta and the iliac arteries down to the superficial femoral artery.

Accessories

An eye-tracking camera (EyeSeeCam Sci, EyeSeeTec GmbH, München, Germany) was used to record the viewing direction of the test persons. The pulse rate was recorded with a heart rate belt (Zephyr™ BioModule™ Sensor, BioHarness™ 3, Medtronic, Dublin, Ireland). Free MATLAB®-based (MathWorks®, Natick, MA, USA) ARTiiFACT software [25] was used to evaluate the mean heart rate. The whole set-up is presented in Fig. 1.

Test Persons

Participants were recruited from the radiology and the neuroradiology departments and divided into two groups. The expert group consisted of five physicians with advanced angiography experience of more than 100 cerebral angiographies each. One of them completed only the first five cases and then left the study. This expert's data could therefore be included in the analysis of validity, but not in the evaluation of the training effect. The beginner group consisted of one medical student without any experience in angiography, three neuroradiological residents with little neuroangiographic experience (<30 cerebral angiographies) and two radiological residents with exclusively peripheral angiographic experience.

The local ethics committee approved the study (172/14), and each participant provided informed consent before participating.

Study Design

Each participant had to complete ten cases. Every case consisted of a complete cerebral angiography in one patient, i.e., the selective probing and visualization of the internal carotid artery, external carotid artery, and vertebral artery with their respective dependent branches on both sides. The ten cases were split into two study parts: the first part included five unknown cases, and the second part included two new cases and three repeated cases from the first part. In detail, the first case was repeated as case six, the fourth case as case nine and the fifth case as case ten. All cases were classified into five levels of difficulty by a neuroradiologist not participating in the study. The classification was based on the following criteria: level 1 was easy to perform with a VER catheter without using a roadmap due to standard anatomy; level 2 was also feasible with a VER catheter, but a roadmap was required; level 3 included a normal arterial variant, i.e., the left vertebral artery originated directly from the aortic arch; level 4 required a sidewinder catheter to solve the task; and level 5, the highest, required advanced catheter maneuvers and changes, plus precise knowledge of the vessel anatomy, variants, and possible pathologies. Each part of the study contained one case of each difficulty level. The participants were informed neither about the level nor about whether the presented case was a known or unknown one.
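The case schedule described above can be written down as a small lookup table. The following is a minimal sketch for illustration only: the names `REPEAT_OF`, `is_known_case` and `study_part` are hypothetical, and only the repeat mapping (case 6 repeats case 1, case 9 repeats case 4, case 10 repeats case 5) is taken from the text. How cases were grouped for the later known/unknown comparison is not specified in this section.

```python
# Repeat mapping as stated in the study design:
# key = repeated case in part 2, value = original case in part 1.
REPEAT_OF = {6: 1, 9: 4, 10: 5}


def is_known_case(case_number: int) -> bool:
    """Return True if the case repeats an earlier one (a 'known' case)."""
    if not 1 <= case_number <= 10:
        raise ValueError("case number must be between 1 and 10")
    return case_number in REPEAT_OF


def study_part(case_number: int) -> int:
    """Return 1 for cases 1-5 and 2 for cases 6-10."""
    if not 1 <= case_number <= 10:
        raise ValueError("case number must be between 1 and 10")
    return 1 if case_number <= 5 else 2


known_cases = [c for c in range(1, 11) if is_known_case(c)]
new_cases_in_part_2 = [
    c for c in range(1, 11) if study_part(c) == 2 and not is_known_case(c)
]
print("known (repeated) cases:", known_cases)
print("new cases in part 2:", new_cases_in_part_2)
```

Under this mapping, part 2 contains three repeated cases and two new ones, matching the description.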

Before the first session, a technical introduction to the simulator was given, together with a shortened exercise session on a case that was not part of the study. The participant was equipped with a heart rate belt and a head-mounted camera. The individual simulation sessions were limited to a maximum of 2 h to avoid any possible fatigue effects. The operation of the simulator beyond the catheter guidance was the responsibility of a student. Each angiography was started with a hydrophilic guidewire (35° curvature) and VER as standard, but the material could be changed at any time on request. At the end of each case, the participants filled out a questionnaire to record the subjectively perceived workload (NASA-TLX, National Aeronautics and Space Administration Task Load Index, German translation, [26]) as ratings of mental demands, physical demands, temporal demands, satisfaction with performance, effort, and frustration. Each rating scale ranged from 0 to 20.

Statistics

Statistical analysis was performed using descriptive and exploratory statistics. Ordinal data (number of series, number of material changes, results of NASA-TLX) and metric data (total duration, total fluoroscopy time, total series time, amount of contrast agent, mean heart rate and percentage of gaze direction lower than 30°) were compared by using a median test or a Mann-Whitney U-test, respectively. Ordinal data are presented as median (interquartile range, IQR), metric data as mean (± standard deviation, SD) if not stated differently. A p-value < 0.05 was considered statistically significant. All analyses were performed using IBM SPSS Statistics, Version 23.0 (IBM, Armonk, NY, USA).
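For readers unfamiliar with the Mann-Whitney U-test, the following is a minimal sketch of a two-sided test using midranks and the tie-corrected normal approximation. It is not the SPSS procedure used in the study: statistical packages may report exact p-values for small samples, so results can differ. The function name `mann_whitney_u` and the sample values are hypothetical and were made up purely for illustration. The median test is not shown.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p


# Made-up total durations in minutes (NOT study data).
beginners = [35, 40, 28, 45, 38]
experts = [20, 22, 25, 18, 27]
u, p = mann_whitney_u(beginners, experts)
print(f"U = {u}, p = {p:.3f}")
```

In the made-up data every beginner value exceeds every expert value, so U takes its maximum value of n1 × n2.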

Results

Proof of Validation

Comparative results of the experts and beginners for all 10 cases are listed in Table 1. Total duration, total fluoroscopy time and amount of contrast agent differed significantly. With respect to the subjective task requirements, significant differences were observed in the assessment of satisfaction and effort: beginners were more dissatisfied and had to make greater efforts than the experts. Heart rate and viewing direction, however, did not differ between experts and beginners.

Table 1 Comparison of experts and beginners in a summary of all 10 simulations (one expert only performed simulations 1–5)

Proof of Training Effectiveness

In a comparison of the first five cases with the last five cases, participants improved significantly in terms of total duration, fluoroscopy time and perceived effort (total duration 31.94 min (SD 16.41 min) to 21.15 min (SD 7.18 min), p < 0.001; fluoroscopy time 18.15 min (SD 8.55 min) to 13.42 min (SD 4.87 min), p = 0.003; effort [median] 15 (IQR 7) to 13 (IQR 7), p = 0.031); however, when the results were separated for experts and beginners (Table 2), this effect remained significant only in beginners. Fig. 2 visualizes the decrease in fluoroscopy time over the course of the training, although the degree of difficulty increased from case 1 to 5 and from case 6 to 10, respectively. In addition, the beginners became more satisfied with their performance during the course of training. Similar differences were found when comparing unknown cases with known cases, where total duration and fluoroscopy time differed significantly and participants were more satisfied with their performance (total duration 35.91 min (SD 16.03 min) to 22.73 min (SD 6.37 min), p < 0.001; fluoroscopy time 20.66 min (SD 7.90 min) to 14.39 min (SD 5.19 min), p = 0.003; satisfaction with performance [median] 15.5 (IQR 5) to 16 (IQR 3), p = 0.022).
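To put the reported means into perspective, the relative reductions can be computed directly from the values given above. This is a purely descriptive sketch based on the group means stated in the text; it is not a per-participant analysis, and the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` in percent of `before`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100.0


# Group means in minutes, as reported in the text.
first_vs_last = {
    "total duration": (31.94, 21.15),
    "fluoroscopy time": (18.15, 13.42),
}
unknown_vs_known = {
    "total duration": (35.91, 22.73),
    "fluoroscopy time": (20.66, 14.39),
}

for label, comparison in (
    ("cases 1-5 vs 6-10", first_vs_last),
    ("unknown vs known", unknown_vs_known),
):
    for name, (before, after) in comparison.items():
        print(f"{label}, {name}: {relative_reduction(before, after):.1f} % shorter")
```

On these means, total duration fell by roughly one third and fluoroscopy time by roughly one quarter between the first and the last five cases.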

Table 2 Comparison of experts and beginners of simulations 1–5 and 6–10 separately

Fig. 2 Fluoroscopy time of beginners (red) and experts (blue) with corresponding regression line during all 10 simulations. Cases 1, 4 and 5 are repeated (in same order) in cases 6, 9 and 10 respectively
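A regression line of the kind shown in Fig. 2 can be obtained with an ordinary least-squares fit of fluoroscopy time against simulation number. The sketch below assumes a simple linear fit, since the fitting method used for the figure is not stated here; the function name `fit_line` is hypothetical and the data points are made up for illustration, not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up fluoroscopy times in minutes for simulations 1..10 (NOT study data).
simulations = list(range(1, 11))
fluoro_minutes = [30 - 2 * s for s in simulations]
slope, intercept = fit_line(simulations, fluoro_minutes)
print(f"slope = {slope:.2f} min per simulation, intercept = {intercept:.2f} min")
```

A negative slope corresponds to a decrease in fluoroscopy time over the course of the training.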

Again, these differences remained significant only for beginners (Table 3). Neither heart rate nor viewing direction differed between known and unknown cases in either group.

Table 3 Comparison of experts and beginners regarding unknown and known cases

Discussion

In this study it was possible to prove the construct validity of a neuroangiographic simulator by demonstrating significant performance differences between beginners and experts. We were furthermore able to show the effectiveness of such a simulator by directly measuring significant improvements in psychomotor skills of beginners in cases derived from real patient anatomy.

This is in line with previous studies on different scenarios, such as carotid artery stenting [13, 18] and several infra-aortic applications [9, 17] and addresses one major concern regarding neuroangiographic simulations: In 2008, Carroll and Messenger stated that “medical simulation has made the transition from an experimental technology to the clinical world”, and that “perhaps the most pressing issue […] regarding medical simulation is validation” [27].

To prove construct validity, we used clearly separated groups: beginners, who had performed a maximum of 30 cerebral angiographies, and experts, who had performed at least 100 procedures. The simulator data for all 10 procedures clearly reflected this subdivision. As previously confirmed for cardiac angiography [28], beginners need more time to find and examine the target vessels. Thus, total duration and fluoroscopy time differed significantly. Beginners more often produced roadmaps; accordingly, the total amount of contrast agent was significantly higher for beginners. In contrast, the duration of the series did not differ between experts and beginners, possibly because the simulator only shows the contrast filling of the arteries and displays neither a parenchymal nor a venous phase. Overall, these results contradict the findings of Nguyen et al., who were only able to identify the amount of contrast agent as a distinguishing feature for the experience level [20]. A possible reason could be that they compared only 2 tasks, whereas our study design comprised 10 procedures.

Observations of gaze direction, blink frequency, pupil size and dwell time are recognized means of examining attention and cognitive stress [29]. Based on such data Richstone et al. were able to distinguish unequivocally between beginners and experts in surgical laparoscopy [30], but for endovascular cardiac interventions, Currie et al. found hardly any differences [31]. The assumption that experts would turn their gaze less often away from the X‑ray screen could therefore not be confirmed by them. The present study focused on the direction of each subject’s gaze; no differences between the two groups could be observed.

Heart rate was recorded, as it was assumed that it would differ between various levels of workload and thus between beginners and experts according to the effort made. In previous work on anesthesiologists, Martin et al. as well as Weinger et al. found significant differences in several heart rate parameters, including mean heart rate, during different phases of an anesthetic procedure [32, 42]. In a later work of the first group on pre-hospital emergency medicine, heart rate variability discriminated better between different levels of workload compared to the mean heart rate [33]. Currie et al. also measured numerous parameters such as heart rate variability, electrodermal activity and skin temperature in physicians undertaking cardiac endovascular procedures but could not find differences between experience levels [31]. Hence, mean heart rate seems to be an unreliable predictor, matching the results of our study, where it was not suitable to distinguish between beginners and experts.

The NASA-TLX is a frequently used tool to assess subjective workload, especially in anesthesia and the field of emergency care [34], but also in other areas, e.g. radiotherapy [35] or flight simulation [36]. In comparison with the previously mentioned complex technical tools, the NASA-TLX offers an easy and fast method for recording the workload. The modified version Raw-TLX (RTLX), without the weighting process of the subscales, is even easier to use [37]. In a study on surgical robotics, differences in NASA RTLX were found between examiners with different experience levels [38], just as in our participants. To our knowledge, there are no data on whether NASA RTLX changes differently through training in beginners compared to experts. The satisfaction of our participants only increased when the other measurable parameters also improved, which was only true for beginners. We therefore assume that the simulator is well suited to give adequate feedback on one’s own performance.
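Because the RTLX omits the weighting step, an overall score reduces to the unweighted mean of the six subscale ratings. The sketch below illustrates this on the 0 to 20 scale used here. It is an illustration only: the study compares the subscales individually, the names `SUBSCALES` and `raw_tlx` are hypothetical, the example ratings are made up, and the code assumes that all six ratings are oriented so that higher values mean higher load (a satisfaction rating in which higher means better would have to be inverted first).

```python
from statistics import mean

# The six NASA-TLX subscales (key names are hypothetical labels).
SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings: dict) -> float:
    """Unweighted mean of the six subscale ratings (each 0..20).

    Assumes every rating is oriented so that higher means more load.
    """
    missing = [s for s in SUBSCALES if s not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 20:
            raise ValueError(f"{name} must be between 0 and 20")
    return float(mean(ratings[name] for name in SUBSCALES))


# Made-up ratings for one simulated case (NOT study data).
example = {
    "mental_demand": 12,
    "physical_demand": 6,
    "temporal_demand": 9,
    "performance": 15,
    "effort": 13,
    "frustration": 5,
}
print(f"raw TLX score: {raw_tlx(example):.1f} of 20")
```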

An increase in performance with respect to the two essential parameters of total duration and fluoroscopy time was demonstrated in our study for beginners, confirming the results of Spiotta et al. and Zaika et al. [10, 24]. Among the experts, however, this effect could not be observed. Thus, not every experience level benefits from simulator training that focuses exclusively on diagnostic angiography. This supports the assumption that, in the best case, the skills learned are ones that an expert already possesses. For beginners, the training effect can be objectively read from the metrics of the simulator, as known for peripheral endovascular procedures [14, 39]. In addition, we have shown that repeated exercises of the same cases help beginners not only to improve their metrics, but also to increase their own satisfaction. Spiotta et al. also noted an increase in confidence in the acquisition of skills, both in terms of knowledge of the anatomy and the technique of vessel selection [23]. Not least for this reason, several centers have now begun to implement a structured milestone-based curriculum and propose to integrate simulation training into formal neuroendovascular training [23, 40].

When demonstrating a training effect, it is always difficult to distinguish the extent to which a real gain in specific psychomotor skills overlaps with increasing familiarity with the operation of the simulator [39]. An improvement merely by habituation would be recognizable in all subjects; however, since only the group of beginners improved their performance relevantly, this effect seems to be negligible here.

Notably, in the RTLX evaluation, the degree of frustration of the experts slightly increased during training, whereas it decreased among the beginners (both not significant). Comments from the participants showed that individual experts were disturbed by the differences between simulation and reality, whereas beginners were not. The participants of a simulator training should therefore be informed in advance with respect to the differences that can be expected in the behavior of the simulator in relation to reality. Otherwise, they approach the training with a variety of ideas and demands, and frustration and anger can easily arise [1].

Limitations

Construct validity should distinguish not only between beginners and experts, but ideally also between various levels of experience [16]. In our study, the small number of test persons prevented further division of study participants. Also, the number of cases, in particular of repeated cases, was limited in this study. A higher number of cases might also show differences in expert neuroradiologists.

Conclusion

Construct validity of a high-tech simulator could be demonstrated for diagnostic neuroangiography, and beginners in particular showed a measurable training effect through repeated practice. Further studies should demonstrate the benefit of such simulation training for the patient.