1 Introduction

A virtual reality (VR) simulation uses computer-generated elements to create specific environments, which can be used in medical education. A VR system consists of output devices (visual, auditory, tactile and force feedback), input devices (mouse, tracker, gloves, etc.), a graphics system that generates the virtual environment, and supporting software. In a virtual environment, all the features of an activity, such as its duration, intensity and type of feedback, can be adapted to the type of treatment and the individual’s abilities (Weiss et al. 2003; Rizzo and Kim 2005).

In addition, individuals can see the results of their motor actions and correct them if necessary. In 2010, the American Board of Internal Medicine (ABIM) advocated the use of simulation tools before performing invasive hemodynamic monitoring, mechanical ventilation and standardized educational intervention (ABIM 2010). The European Commission also recommended VR support in education in its 2017 report “10 trends transforming education” (The European Political Strategy Center 2017). VR simulators have numerous advantages over traditional training simulators and other educational methodologies. Perhaps the most significant is the ability to objectively monitor and assess a trainee, providing a basis for formative and summative metrics (Wong et al. 2014; Zirkle et al. 2007; Reddy-Kolanu et al. 2011; Wiet et al. 2012; Francis et al. 2012; Khemani et al. 2012; Kerwin et al. 2012).

VR systems have an important application in surgery. They can improve educational outcomes compared with traditional methods (Rafiq et al. 2008) while providing a more flexible way of teaching. VR helps to determine a trainee’s level of competence before real surgery, and it offers repetitive practice in a controlled environment, self-directed learning and proven construct validity. An additional benefit is the possibility of performing VR surgery under an experienced surgeon’s supervision (Zhao et al. 2011a, b). VR training also has other advantages, such as a reduction in both training frequency (Eldred-Evans et al. 2013; Johnston et al. 2013; Khan et al. 2014; Mulla et al. 2012; Oestergaard et al. 2012; Pahuta et al. 2012; Schroedl et al. 2012) and the duration of surgery in real-world environments (Johnston et al. 2013; Nickel et al. 2014, 2015). Furthermore, it has a positive psychological effect on learners (Johnston et al. 2013).

Among the many medical fields where VR systems have proven effective, temporal bone and middle ear surgery are worth emphasizing (Kashikar et al. 2019; Ioannou et al. 2017). The temporal bone constitutes a vital part of the lateral skull base, and microsurgery in this area is difficult and requires competencies in several domains. Thorough knowledge of the complex anatomy of this area is indispensable because it contains many crucial structures, especially the facial nerve, the dura, the sigmoid sinus, the carotid artery, the chorda tympani and the ossicles. The surgical technical skills are extremely demanding, most notably the handling of multiple instruments (Andersen et al. 2015) such as an operating microscope, a drill and suction/irrigation. Traditionally, cadaveric dissection was an important part of temporal bone surgery training; however, access to temporal bone specimens is now significantly limited. Therefore, in many departments (including our clinic), knowledge of middle ear surgery is based mainly on theoretical courses and on assisting during surgery in the operating room, and access to cadaveric dissection of the temporal bone is insufficient. Temporal bone VR simulator training can play an important role in filling this gap. Numerous VR simulators (commercial and research prototypes) for this medical field are now available (Morris et al. 2006; Zirkle et al. 2007; O’Leary et al. 2008; Wiet et al. 2009; Sorensen et al. 2009). VR may be an alternative to cadaveric temporal bone dissection courses or may be used in addition to them (Andersen et al. 2015).

An important part of residency training in temporal bone surgery is learning to avoid drilling holes into and violating vital structures in the microarea of the skull base; thus, the following skills must be acquired: (1) adequate sharpening and (2) complete removal of cells in the sinodural angle; (3) sufficient exposure of the tegmen tympani; (4) not drilling into the ossicles; (5) not drilling holes in the external auditory canal wall; and (6) identifying the vertical part of the facial nerve (Andersen et al. 2015).

We chose to focus on antromastoidectomy training because it is one of the basic procedures that ENT residents have to learn and practice in order to gain confidence. This direction of education is also underlined by others (Piromchai 2014). The questions we asked were whether VR training plays an important role in the medical education of young doctors and whether such training should be introduced into the national ENT specialization program. Therefore, the aim of the study was to assess a VR temporal bone surgery simulator (based on the Geomagic Touch haptic device from 3D Systems) in an antromastoidectomy simulation.

2 Material and methods

2.1 Study design

The research was designed as a prospective study with three sessions of VR simulation training. A group of four ENT specialists experienced in head and neck oncology or rhinology but inexperienced in otosurgery, and 11 otorhinolaryngology residents with no previous experience in temporal bone surgery, performed a series of virtual dissections of a VR temporal bone model. The training sessions took place after an introductory demonstration session with the VR system. The average age of all the participants (ENT specialists and residents) was 27.5 years (range 22–42 years). Additionally, two experienced otosurgeons (with more than 10 years of experience and more than 1000 middle ear surgeries performed) participated in the study as supervisors for all the participants.

2.2 VR system

A VR temporal bone surgery simulator, composed of the Geomagic Touch haptic device from 3D Systems, a MIDI controller from KORG, NVIDIA 3D glasses and a PC with software created by the team from the University of Melbourne, was given to ear surgeons as a training platform (Fig. 1). The system has been validated for face and content validity (O’Leary et al. 2008; Zhao et al. 2011a, b). Using the simulator, a surgeon can practice ear operations such as antromastoidectomy and more complex middle ear surgery, and can prepare the approach for cochlear implantation. The major anatomical structures that must be identified without injury during surgery, such as the facial nerve, the chorda tympani, the sigmoid sinus, the dura and the ossicles, are represented in the virtual temporal bone. Moreover, the simulator can signal damage to these critical elements.

Fig. 1

VR temporal bone surgery simulator. a 3D glasses, b haptic device, c MIDI controller

The temporal bone data were derived from microcomputed tomography with a voxel resolution of 96 μm. Anatomical structures were segmented and rendered in 3D. The following anatomical structures were represented: the facial nerve, the chorda tympani, the ossicles, the cochlea, the semicircular canals, the dura mater, the stapedius tendon, the round window membrane and the sigmoid sinus.

The virtual temporal bone is displayed on a computer screen, and the surgeon interacts with it through a haptic device, which is represented on the screen as a surgical drill. When the virtual drill interacts with the operating space, the device provides force feedback to the user in three dimensions (Wijewickrema et al. 2018). A 3D emitter and 3D glasses give the user a stereoscopic view of the virtual scene. A MIDI controller serves as an input device that allows the surgeon to change settings such as the magnification level and the burr size and type (diamond/cutting). A keyboard and a mouse also act as input devices for the simulator. The user can choose among many variants of treatment. In this study, antromastoidectomy without other disabilities was selected in all the simulations.

2.3 General study scheme

As the first step, all the participants were trained on how to use the system in a demonstration session with an experienced clinical technician. Then, every participant performed three virtual antromastoidectomies under the experts’ supervision, separated by 4–5-week breaks. This schedule was based on the observations of Andersen et al. (2016), who found that mastoidectomy skills acquired under time-distributed practice conditions were retained better than skills acquired under massed practice conditions. At the end of each session, both experts assessed the correctness of the simulated surgery, assigning positive points for each correctly performed step and negative points for each mistake. The performance was then discussed with the participant.

To verify that the supervisors’ assessments did not differ significantly, we checked inter-rater agreement. Agreement between the supervisors was significant for all the sessions (two-way random effects model, p < 0.001).
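As an illustration of how agreement under a two-way random effects model can be quantified, the sketch below computes an intraclass correlation coefficient from a table of ratings. The text does not state which ICC form was used, so the choice of ICC(2,1) (absolute agreement, single rater) is an assumption, as are the function name and the example ratings, which are invented for demonstration only. The F test behind the reported p value is not reproduced here.

```python
def icc_two_way_random(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per rated simulation, one column per supervisor.
    The ICC form is an assumption; the paper only names the model family.
    """
    n = len(ratings)      # number of rated simulations
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores given by two supervisors to five simulations
example = [[20, 22], [15, 14], [30, 29], [8, 10], [25, 25]]
print(round(icc_two_way_random(example), 3))
```

Values close to 1 indicate that the two supervisors assigned nearly identical scores, which is the property the agreement check is meant to establish before their ratings are averaged.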

After each session, all the participants were also asked to complete a questionnaire about their impressions of the VR simulation.

2.4 The evaluation of performance—scoring system

The evaluation of every simulation (total score) was based on the duration of a VR session, the quality of performance (positive points) and the number of mistakes (negative points). Every simulation was assessed by both supervisors, and a mean of two ratings was calculated. The scoring system was prepared by the experienced surgeons supervising the study.

The duration of performance was limited to 40 min. Two additional points were awarded for every 5 min by which the procedure was shorter than this limit, up to a maximum of 12 points for a procedure completed within 10 min.

The quality of performance was scored with positive points (0–5) for each of the following parameters: the shape of antromastoidectomy, the thickness of the posterior bony wall of the external auditory canal, the visualization of the middle fossa dura, the visualization of the sigmoid sinus, the visualization of the lateral semicircular canal and the visualization of the incus.

Mistakes were scored with negative points in the following way: (1) damage to the posterior wall of the external auditory canal, 5 points; (2) damage to the chorda tympani, 5 points; (3) damage to the middle cranial fossa dura, 10 points; (4) damage to the sigmoid sinus, 10 points; (5) damage to the inner ear (a hole in the labyrinth), 20 points; (6) damage to the facial nerve, 20 points.
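The scoring rules above can be summarized in a short sketch. The point values are taken from the text; everything else is an assumption made for illustration: the identifier names, the rule that only complete 5-min blocks earn a time bonus, the rule that each recorded mistake is penalized once, and the combination of the three components as bonus plus positive points minus negative points.

```python
TIME_LIMIT_MIN = 40
MAX_TIME_BONUS = 12

# Six quality parameters, each rated 0-5 (names are hypothetical labels)
QUALITY_ITEMS = (
    "antromastoidectomy_shape",
    "posterior_canal_wall_thickness",
    "middle_fossa_dura",
    "sigmoid_sinus",
    "lateral_semicircular_canal",
    "incus",
)

# Negative points per type of damage, as listed in the text
PENALTIES = {
    "posterior_canal_wall": 5,
    "chorda_tympani": 5,
    "middle_fossa_dura": 10,
    "sigmoid_sinus": 10,
    "inner_ear": 20,
    "facial_nerve": 20,
}


def time_bonus(duration_min):
    """2 points per full 5 min under the 40-min limit, capped at 12."""
    saved = TIME_LIMIT_MIN - duration_min
    if saved <= 0:
        return 0
    full_blocks = int(saved // 5)  # assumption: partial blocks earn nothing
    return min(2 * full_blocks, MAX_TIME_BONUS)


def rater_score(duration_min, quality, mistakes):
    """Score from one supervisor: bonus + positive points - penalties."""
    for item in QUALITY_ITEMS:
        if not 0 <= quality[item] <= 5:
            raise ValueError(f"{item} must be rated from 0 to 5")
    positive = sum(quality[item] for item in QUALITY_ITEMS)
    negative = sum(PENALTIES[m] for m in mistakes)
    return time_bonus(duration_min) + positive - negative


def total_score(duration_min, ratings):
    """Mean of the supervisors' scores; ratings = [(quality, mistakes), ...]."""
    scores = [rater_score(duration_min, q, m) for q, m in ratings]
    return sum(scores) / len(scores)


# Hypothetical session: 25 min, both supervisors give 3 points per item,
# and one of them records damage to the chorda tympani.
quality = {item: 3 for item in QUALITY_ITEMS}
print(total_score(25, [(quality, []), (quality, ["chorda_tympani"])]))
```

Weighting the penalties by clinical consequence means that a single facial nerve or inner ear injury can outweigh a fast, otherwise well-executed procedure, which a binary error count would not capture.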

2.5 The evaluation of VR system—questionnaire

The questionnaire consisted of 12 specific questions focusing on users’ individual impression. The questionnaire was created by clinical technicians in cooperation with the experienced surgeons supervising the study. Specific questions included issues such as the intuitive operation of the device, the ease of drill manipulation and participants’ self-confidence (in the context of surgical skills) after each simulation. There were three types of questions:

  • Type A (scale questions): the participants assessed selected aspects of the system on a scale from 0 to 10 points.

  • Type B (closed-ended questions): yes/no questions or questions with a limited number of answers concerning the evaluation of the damage-signaling system, the assessment of training during subsequent sessions, the trainees’ subjective assessment of their self-confidence after consecutive sessions, and their subjective opinion about the inclusion of VR training in the routine ENT training program.

  • Type C (comments): additional space for the participants’ own remarks.

2.6 Statistical evaluation

Statistical analysis was carried out with the Statistica software, version 13 (StatSoft Poland). Normality of the data was assessed with the Shapiro–Wilk test and by visual inspection of histograms. Because the data were not normally distributed, the time to complete the tasks and the supervisors’ scores were compared between sessions with the Friedman ANOVA. To identify which sessions differed significantly, a post hoc test for the Friedman test was used (a macro available with the Statistica software that performs nonparametric multiple comparisons; the test compares the absolute value of the differences for all pairs with a critical value determined by normal approximation, with alpha adjusted for multiple comparisons). The median questionnaire scores (scale questions) were compared between the groups of participants with the Mann–Whitney U test. All tests used a significance level of α = 0.05.
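The repeated-measures comparison described above can be sketched in plain Python. This is not the Statistica macro: the post hoc step below assumes the rank-sum comparison with a normal-approximation critical value and alpha divided by k(k-1), which matches the verbal description but may differ in detail from the software. Function names and the example times are hypothetical, and the p value uses the closed form of the chi-square distribution with 2 degrees of freedom, so the sketch is limited to three sessions.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_test(data):
    """data: one row per participant, one column per session (exactly 3)."""
    n, k = len(data), len(data[0])
    if k != 3:
        raise ValueError("this sketch computes the p value for 3 sessions only")
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in data:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for v in set(row):
            t = row.count(v)
            tie_term += t ** 3 - t
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    q /= 1.0 - tie_term / (n * k * (k * k - 1))  # correction for ties
    p = math.exp(-q / 2.0)  # chi-square survival function for df = 2
    return q, p, rank_sums


def posthoc_pairs(rank_sums, n, alpha=0.05):
    """Flag session pairs whose rank sums differ by at least the critical value."""
    k = len(rank_sums)
    z = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    critical = z * math.sqrt(n * k * (k + 1) / 6.0)
    return {
        (a + 1, b + 1): abs(rank_sums[a] - rank_sums[b]) >= critical
        for a in range(k)
        for b in range(a + 1, k)
    }


# Hypothetical completion times (min) for five participants in three sessions
times = [[35, 28, 18], [30, 25, 15], [38, 22, 20], [33, 27, 16], [29, 24, 14]]
q, p, sums = friedman_test(times)
print(q, p, posthoc_pairs(sums, n=len(times)))
```

Ranking within each participant removes between-participant differences in baseline speed, which is why this test suits non-normal repeated measurements such as session times and scores.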

Because of the limited number of participants, the results are presented for the whole group. Results are presented separately for the two groups of participants only where the difference was statistically significant (scale questions).

3 Results

3.1 The performance of virtual antromastoidectomy

3.1.1 Time

The average time to complete the tasks decreased from 33 min during the first session to 16.7 min during the last session.

Descriptive statistics for the duration of consecutive training sessions are presented in Table 1. The Friedman ANOVA confirmed that the session durations differed (p = 0.0005). The post hoc test for the Friedman analysis showed that the difference between sessions 1 and 3 was statistically significant at α = 0.05.

Table 1 The summary of the duration of consecutive trainings

3.1.2 The quality of performance (positive points)

The median number of positive points collected by the study participants for each step of antromastoidectomy in each session is presented in Fig. 2. The Friedman ANOVA and post hoc tests confirmed statistically significant differences between sessions 1 and 3 for two steps: the correct visualization of the middle fossa dura (without exposing the dura; p = 0.0130) and the correct visualization of the lateral semicircular canal (without touching it with a drill; p = 0.0024).

Fig. 2

The median value (with interquartile range; Q1–Q3) of positive points collected by all the participants for particular steps of antromastoidectomy in particular sessions

3.1.3 Mistakes (negative points)

The total number of mistakes in each session is presented in Fig. 3. There is a clear decreasing trend, consistent with a learning curve. However, some crucial mistakes were still observed in the third session, which suggests that a VR training program should comprise more sessions.

Fig. 3

The total number and types of failures for all the participants in particular sessions

3.1.4 Total score

The total scores for each session are presented as a learning curve in Fig. 4, which shows the mean value with the minimum and maximum. The difference between sessions 1 and 3 was significant (p = 0.00320; Friedman ANOVA and post hoc analysis).

Fig. 4

Learning curve for all the participants of the study, measured by the value of the points they had collected in consecutive sessions

3.2 The evaluation of VR system

3.2.1 Scale questions

The median values for the scale questions in both groups (ENT specialists and residents) are presented in Fig. 5.

Fig. 5

The median values for the scale questions in both groups. A statistically significant difference (p < 0.05) between the groups was found only in the assessment of the accuracy of the temporal bone projection (p = 0.0202; Mann–Whitney U test)

3.2.2 Closed-ended questions

Sixty percent of participants rated the signaling of damage to the critical elements as good (40% as sufficient), and 67% of trainees judged that they had made progress over consecutive sessions. After the first session of virtual antromastoidectomy, 50% of participants reported feeling more confident about their skills than before training. After the third session, 100% of participants indicated higher self-confidence in their own surgical skills. In addition, all the participants indicated that VR training should be included in the routine educational program for medical students and young doctors.

3.2.3 Participants’ comments

Among the comments and remarks after all the sessions, the residents indicated:

  • problems with putting a drill in the expected place—this impression disappeared during consecutive sessions,

  • problems with a sense of depth in the presented 3D model—this issue was indicated only after the first session of a simulation.

The ENT specialists, in turn, pointed out that using CT scans of real patients in a simulation before surgery would be very valuable not only for trainees but also for experienced surgeons.

4 Discussion

The novelty of the presented study is its twofold approach to the assessment of the VR model. The first part assessed the benefit of training in terms of improvement in the participants’ skills. All the participants were trained on how to use the VR system and performed three virtual antromastoidectomies under the experts’ supervision, separated by 4–5-week breaks. The duration of each session, the quality of performance, the mistakes and the subjective assessment of the VR system were the main outcome measures in our research. The experts assessed the correctness of the simulated surgery, assigning positive points for each correctly performed step and negative points for each mistake. The second part was the assessment of the VR system itself. After each session, all the participants were asked to complete a questionnaire about their impressions of the VR simulation.

4.1 Virtual performance

Although time should not be a crucial factor in surgery, a shorter procedure may reflect fluency and skills acquisition. Therefore, we used time as a parameter to assess virtual antromastoidectomy. Our results show that the time needed to perform the virtual antromastoidectomy decreased by approximately 50% from the first to the last session (from 33 to 16.7 min). These results correspond with those of Piromchai (2014), who reported that a virtual reality group performed significantly better and took less time to complete endoscopic sinus surgery training than a conventional training group.

Taking into account particular steps of antromastoidectomy procedure, our results show that the total number of points was significantly higher after the third session, in comparison with the first one. The trainees achieved better results in the third session, due to experience gained during the previous two VR simulations of antromastoidectomy.

One of the first studies assessing the efficacy of VR temporal bone simulation in the training of otolaryngology residents was performed in 2011 (Al-Noury 2012), and although the authors used different VR devices, the results confirmed that VR was a very helpful tool in surgical education. Other authors emphasize important benefits of VR such as fewer mistakes, more successful surgeries (Nickel et al. 2015), better learning of anatomical positions and better understanding of the exterior and interior spatial relationships between the organs (Pahuta et al. 2012). Nevertheless, one of the latest studies, presented in 2019 and using the VOXEL-MAN Tempo® surgical simulator, reported ambiguous findings: performance on radiological testing increased significantly after VR training, but surgical results on cadaveric specimens were not correlated with surgical simulation parameters. The authors concluded that trainees should integrate a VR tool into their learning of the radiological and surgical anatomy of the temporal bone (Rogister et al. 2019).

4.2 System evaluation

In the presented research, the VR system assessment was based on 12 specific questions in the questionnaire. Positive answers and opinions constituted the majority in both groups of study participants; however, one statistically significant difference between the ENT specialists and the residents was observed, namely in the perceived accuracy of the temporal bone projection. The specialists perceived the temporal bone mapping presented in the VR system as accurate, while the residents perceived it as significantly less accurate. This difference may be related to the participants’ age and level of experience. In the ENT specialists’ opinion, the accuracy of the temporal bone presentation in the VR system was good enough to perform antromastoidectomy and did not differ from the real temporal bone.

We also observed a difference between the residents and the specialists in their assessment of the signaling of damage to crucial structures. The median rating of damage signaling given by the specialists was between “sufficient” and “good,” while the median rating given by the residents was “good.” This suggests a more critical approach to the assessment of damage signaling on the part of the specialists.

Some studies indicated an improvement of teamwork in medical teams (Fernandez et al. 2013) and an increase in the self-confidence of learners using VR compared with other groups (Johnston et al. 2013). In the present study, all the participants declared greater self-confidence after a series of VR sessions. The participants also mentioned that it would be very valuable to use models of real patients in a simulation before surgery. We did not use a self-assessment strategy to evaluate the virtual simulation of antromastoidectomy, although this type of strategy has been used by others (Andersen et al. 2019). They reported, and we agree, that structured self-assessment was not sufficient in itself to reach the learning curve plateau and that additional support for deliberate practice was needed for continued skills development.

4.3 Advantages and limitations

The use of the same virtual model for all the participants, both residents and specialists, and in all the sessions is both a novelty and an advantage of the study design. This approach was chosen intentionally to allow the most effective possible assessment of the improvement in participants’ skills. We also propose a new scale with “positive” and “negative” points, in which the penalty depends on the severity of an error. Other known scales were usually based on a binary system (an error either occurs or does not). Our scale is based on the medical consequences of each complication for the patient. Damage to the facial nerve and damage to the inner ear are extremely serious complications of temporal bone surgery, much more serious than damage to the posterior wall of the external auditory canal. The consequences for the patient are crucial in surgery, so we decided that facial nerve damage and inner ear damage should have the most negative impact on our scoring among all the adverse events.

We are also aware that our study is limited by the small number of sessions; it should be extended in order to observe in which session errors no longer occur.

5 Conclusion

VR training allowed the participants to significantly improve their virtual antromastoidectomy performance, as shown by the decreased duration of surgery and number of mistakes and by the increased number of positive points received for the quality of performance. VR training provides a structured, safe and supportive environment in which to become familiar with the complex anatomy of the ear and to practice surgical skills, and according to all the participants it is also an expected form of education.