Introduction

The literature is divided on whether, and to what extent, labial gingival recessions (LGR) can be considered a result of orthodontic treatment (Tx). Since the 1970s, it has been debated whether LGR develop as a consequence of labial tooth movement, which might predispose the teeth to bone dehiscences and periodontal attachment loss [1, 2]. To date, however, no consensus has been reached.

The available systematic reviews report either little to no clinically relevant effect [3, 4] or small negative effects [5] of orthodontic Tx on periodontal health. Similarly, various investigations have found a higher prevalence of LGR after orthodontic Tx than in untreated controls [6, 7]. Proclination of the lower incisors in particular has been regarded as a risk factor [8,9,10]. Recent studies, on the other hand, could not confirm this presumption [11, 12].

Herbst appliance Tx is known to often result in lower incisor proclination [13,14,15]. Alveolar bone loss on the buccal surface of the lower incisors during class II Herbst Tx, as measured on CBCT, amounts to an average of ≤ 0.2 mm [16] and is unpredictable on the individual level, even when skeletal anchorage is added [17]. Nevertheless, neither a clinically significant adverse short- or long-term impact of Herbst appliance Tx on periodontal health [13, 18,19,20] nor an association between the amount of proclination and the prevalence/incidence of LGR [15] has been found so far.

Unfortunately, most of the available studies looked only at the lower incisors and evaluated rather selected patient cohorts that fulfilled specific, fairly strict inclusion criteria. While some more representative data exist for class II:1 Tx [18], no investigation has specifically looked at class II:2 malocclusions so far.

Aim

The objective of the present investigation was to assess the prevalence, incidence, and magnitude of LGR on all permanent teeth after Herbst-Multibracket appliance (MBA) Tx in a large class II:2 cohort of consecutive patients, unselected in terms of Tx outcome.

Material and methods

The archive of the Department of Orthodontics, University of Giessen, Germany was screened for all patients who had undergone Herbst-MBA Tx since this Tx approach was established in 1986. The records were evaluated against the following inclusion criteria, and all patients fulfilling them were included consecutively:

  • Class II:2

  • Herbst-MBA Tx completed

  • Study casts from before Tx (T0) and/or ≥ 24 months after Herbst-MBA Tx and retention (T1) available

The Tx protocol included a Herbst phase using a cast-splint Herbst appliance (Dentaurum GmbH, Germany) and a subsequent MBA phase where two different types of labial straight-wire MBAs including class II elastics were applied. In approximately one-third of the patients, an initial short phase of fixed appliance Tx for upper incisor proclination was undertaken to enable Herbst appliance insertion and adjustment in an incisal edge-to-edge relationship. In addition, approximately one-tenth of the patients had received maxillary transverse expansion using fixed or removable appliances before starting Herbst Tx; in the remaining patients, Tx was started directly with the insertion of the Herbst appliance.

All study casts were visually inspected for accuracy and excluded if the gingival conditions looked “altered” in a way that prevented reliable measurements, for example through marked swelling, air bubbles or other artefacts. The study casts from T0 and T1 were evaluated for LGR on all fully erupted teeth except the wisdom teeth. The distance between the cemento-enamel junction and the deepest point of the gingival margin was assessed and, if positive, defined as LGR. These measurements were performed using a manual calliper (HSL247–52, Karl Hammacher GmbH, Solingen, Germany) and were rounded to the nearest 0.5 mm. The mean value and standard deviation as well as the minimum, maximum and median values were assessed separately for each tooth to allow the most comprehensive comparison with the literature.
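The rounding and per-tooth summary described above can be illustrated with a minimal Python sketch. The readings are hypothetical, and rounding ties upwards is an assumption, since the text does not say how values exactly halfway between two steps were handled.

```python
import math
import statistics


def round_to_half_mm(value_mm):
    """Round a calliper reading to the nearest 0.5 mm (ties round up: assumption)."""
    return math.floor(value_mm * 2 + 0.5) / 2


def describe(values_mm):
    """Mean, standard deviation, median, minimum and maximum for one tooth type."""
    return {
        "mean": statistics.mean(values_mm),
        "sd": statistics.stdev(values_mm),
        "median": statistics.median(values_mm),
        "min": min(values_mm),
        "max": max(values_mm),
    }


# Hypothetical raw readings (mm) for one tooth type across six patients
raw = [0.0, 0.3, 0.0, 1.2, 0.0, 0.6]
rounded = [round_to_half_mm(v) for v in raw]
summary = describe(rounded)
print(rounded, summary)
```

The sample standard deviation (`statistics.stdev`) is used here; whether the reported values are sample or population standard deviations is not stated in the text.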

One single operator (--) performed all measurements. To determine observer reliability, the study casts of 20 consecutive patients were assessed twice with an interval of at least 2 weeks between the two measurements. The mean method error calculated with the Dahlberg formula was 0.07 ± 0.08, and Kendall’s Tau correlation coefficient was 0.844, indicating high consistency [21].
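For orientation, the Dahlberg formula gives the method error as the square root of the summed squared differences between double measurements divided by twice the number of pairs. The following Python sketch applies it to hypothetical repeated readings; how the per-tooth errors were pooled into the reported mean is not described in the text and is not reproduced here.

```python
import math


def dahlberg_error(first, second):
    """Dahlberg method error: sqrt(sum(d_i^2) / (2 * n)) for n double measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty measurement series")
    squared_diffs = [(a - b) ** 2 for a, b in zip(first, second)]
    return math.sqrt(sum(squared_diffs) / (2 * len(first)))


# Hypothetical first and second readings (mm) of the same four teeth
first = [0.0, 0.5, 1.0, 0.0]
second = [0.0, 0.5, 0.5, 0.0]
me = dahlberg_error(first, second)
print(round(me, 3))
```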

LGR prevalence (%) and LGR magnitude (mm) were assessed for the complete patient sample at T0 and T1; LGR incidence (%) during T0-T1 was determined and statistically analysed exclusively for patients with “complete” records, i.e. study casts available from T0 and T1.
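As an illustration of how these two rates differ, the Python sketch below computes prevalence at one time point and incidence between two time points from hypothetical paired measurements. The incidence definition used here (teeth below the threshold at T0 that reach it at T1, as a percentage of all teeth rated at both time points) is an assumption, because the text does not specify the denominator.

```python
def prevalence(lgr_mm, threshold_mm=0.5):
    """Percentage of teeth with a recession of at least threshold_mm."""
    affected = sum(1 for v in lgr_mm if v >= threshold_mm)
    return 100.0 * affected / len(lgr_mm)


def incidence(t0_mm, t1_mm, threshold_mm=0.5):
    """Percentage of paired teeth that newly reach threshold_mm between T0 and T1.

    Assumed definition: the denominator is every tooth rated at both time points.
    """
    if len(t0_mm) != len(t1_mm):
        raise ValueError("T0 and T1 must contain the same teeth")
    new_cases = sum(
        1 for a, b in zip(t0_mm, t1_mm) if a < threshold_mm <= b
    )
    return 100.0 * new_cases / len(t0_mm)


# Hypothetical recession values (mm) for the same ten teeth at T0 and T1
t0 = [0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
t1 = [0.0, 0.5, 0.5, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(prevalence(t0), prevalence(t1), incidence(t0, t1))
```

Under this definition, incidence equals the difference between the T1 and T0 prevalence only when no recession falls back below the threshold during the observation period.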

IBM® SPSS® Statistics Version 23 (IBM Corporation, Armonk, NY, USA) software as well as Microsoft Excel 2010 were used for the statistical analyses. No sample size calculation was performed because of the exploratory character of the study. However, to identify possible trends, the pre-Tx and post-retention data of patients with “complete” records were compared separately for each tooth (T0 vs. T1) regarding LGR prevalence (McNemar test) and LGR magnitude (Wilcoxon signed-rank test). The significance level was p < 0.05.
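The paired comparison of prevalence can be sketched as follows. This is a standard-library Python version of the exact (binomial) McNemar test applied to hypothetical data for one tooth type; it is not the SPSS procedure itself, and whether the exact or the chi-square form was used in the analysis is not stated. The Wilcoxon signed-rank test is not covered by this sketch.

```python
from math import comb


def mcnemar_exact(t0_affected, t1_affected):
    """Two-sided exact McNemar test for paired yes/no observations.

    Returns (b, c, p) where b counts teeth affected only at T0 and
    c counts teeth affected only at T1 (the discordant pairs).
    """
    pairs = list(zip(t0_affected, t1_affected))
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of change
    # Binomial tail probability with success probability 0.5
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical data for one tooth type in 94 patients:
# 2 affected at both time points, 9 newly affected at T1, 83 never affected
t0 = [True] * 2 + [False] * 92
t1 = [True] * 2 + [True] * 9 + [False] * 83
b, c, p = mcnemar_exact(t0, t1)
print(b, c, p)
```

Only the discordant pairs enter the test, which is why teeth that were unaffected or affected at both time points do not influence the p value.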

Results

The total cohort of class II:2 patients who had completed Herbst-MBA Tx since 1986 comprised 177 patients (Fig. 1). As no significant difference existed between the patients with “complete” records (ratable study casts from T0 and T1; n = 94) and those with “incomplete” records (ratable study casts from T0 only/T1 only; n = 79/n = 0), the cohort with “complete” records was considered representative of the entire sample (Supplementary Tables 1 and 2). Therefore, only the data of this group are presented in detail. The mean active Tx duration was 22.6 ± 7.2 months and the mean post-Tx observation period 29.1 ± 7.5 months. From T0 to T1, the overbite changed from 5.3 ± 1.6 to 1.6 ± 0.8 mm and the molar relationship from 0.9 ± 0.3 cusp widths class II to 0.0 ± 0.2 (class I). For retention, upper and/or lower bonded canine-to-canine retainers, removable upper and/or lower retention plates or a combination of both were used, supplemented by an activator in about one-third of the patients. At follow-up, a lower bonded retainer was still worn by 90% of the patients while an upper bonded/removable retainer was worn by 64%/19%.

Fig. 1 Patient flow chart. The numbers of class II:2 patients who started/completed Herbst-MBA Tx and a follow-up observation period of ≥ 24 months are given, as well as the numbers of ratable/included pre- and post-retention study casts

Prevalence and magnitude of LGR (n = 94)

Before Tx (T0), the overall prevalence of LGR with a magnitude ≥ 0.5 mm was 1.4% of the 2601 included teeth (Table 1); the median magnitude was 0.0 mm, and the maximum was 1.5 mm (Table 2). The upper first premolars showed the highest prevalence value of 5.3%; however, no tooth presented LGR with a magnitude ≥ 2.0 mm (Fig. 2a, b; Table 1).

Table 1 Prevalence (%) of labial gingival recession for the teeth 17–47 before Tx (T0) and after Herbst-MBA Tx plus a retention period of ≥ 24 months (T1) in 94 individuals. Labial gingival recession categorized by magnitude: none (< 0.5 mm), 0.5–< 1.0 mm, ≥ 1.0–< 2.0 mm, ≥ 2.0 mm. In addition, the p value of the statistical comparison (T0 vs. T1) is shown for the category none (< 0.5 mm)
Table 2 Magnitude (mm) of labial gingival recession for the teeth 17–47 before Tx (T0) and after Herbst-MBA Tx plus a retention period of ≥ 24 months (T1) in 94 individuals. The mean value and standard deviation as well as the median, minimum and maximum values are given. In addition, the p value of the statistical comparison (T0 vs. T1) is given for each type of tooth
Fig. 2 Prevalence (%) of labial gingival recession for the teeth 17–47 before Tx (T0) and after Herbst-MBA Tx and a retention period of ≥ 24 months (T1) for LGR with a magnitude ≥ 0.5 mm (a)/2.0 mm (b) in 94 individuals

After Tx plus a post-Tx retention period averaging 29 months (T1), 6.7% of all 2601 assessed teeth exhibited LGR with a magnitude ≥ 0.5 mm (Table 1). The median magnitude was 0.0 mm, and the maximum was 3.0 mm (Table 2). The most frequently affected teeth were the upper first and the right second premolars as well as the lower central incisors with a prevalence of 12.1–20.2%; however, only 1.0–2.0% of the premolars and none of the incisors exhibited LGR ≥ 2.0 mm (Fig. 2a, b; Table 1).

Incidence of LGR (T0 and T1: n = 94)

From pre-Tx to post-retention (T0-T1), and thus over a total observation period of approximately 4.5 years, an overall mean LGR incidence for magnitude ≥ 0.5 mm of 5.3% was determined (Table 1). The respective incidence value for LGR ≥ 2.0 mm was 0.4%.

The highest LGR incidence values for magnitude ≥ 0.5 mm were seen for the upper right premolars and the lower central incisors: 9.9–14.9% (Fig. 1; Table 1). Comparing the pre-Tx (T0) and post-retention (T1) data, the prevalence changes were significant (p < 0.05) for 10 of the 28 different teeth (Table 1) and the magnitude changes for 14 of the 28 different teeth (Table 2). The respective post-retention median/mean magnitude was 0.00/0.06 mm (Table 2).

Discussion

The present study is the first to assess the prevalence, incidence and magnitude of LGR in all teeth 17–47 during class II:2 Herbst-MBA Tx and retention.

Subjects

Study casts of all class II:2 patients who underwent Herbst-MBA Tx at a single study centre during a period of 27 years were investigated. As the study design was retrospective, it was not possible to control all variables that might have influenced the multifactorial development of LGR. For example, the amount of mandibular advancement, periodontal morphology/susceptibility to LGR and patient compliance could not be analysed. Still, the sample was homogeneous in terms of the underlying malocclusion (class II:2) and the general non-extraction Tx approach. The fact that Tx had been performed by different practitioners using two different types of straight-wire MBAs, which might have affected torque, should not interfere with the objective of evaluating the effect of Herbst-MBA Tx on the prevalence, incidence and magnitude of LGR. Because severe gingival swelling/hyperplasia is often present upon debonding, the study casts from that occasion were not used and the measurements were confined to the post-retention study casts. In any case, patients were included irrespective of Tx outcome.

Method

The distance between the cemento-enamel junction and the deepest point of the gingival margin/recession was determined on all fully erupted teeth. All these linear measurements were performed by a single investigator with high consistency (Kendall’s Tau = 0.84), and the method error of 0.07 ± 0.08 was rather low. Therefore, the data can be regarded as objective.

Measurements of gingival recessions performed on study casts have been shown to correlate highly with those from clinical assessment [22]. Nevertheless, gingival swelling and artefacts arising during study cast preparation might affect the accuracy of the measurements. On the other hand, an intraobserver reliability of 0.80–1.00 and an interobserver agreement of 0.67–1.00 were determined in a similar investigation in which pre- and post-Tx study casts were evaluated, indicating good reliability [23].

Results—prevalence

The data on LGR prevalence in adolescents available in the literature are limited. The pre-Tx overall LGR prevalence of 1.4% determined from 2601 teeth is in concordance with the pre-Tx prevalence of 1.7% described after assessing study casts of 302 similarly aged orthodontic (mainly class II) patients [23]. A value of 1.1% was determined in a class II:1 sample with a total of 12,573 teeth before Herbst-MBA Tx [18]. All these values are lower than the prevalence of 5.6% determined from 100 non-orthodontic 12-year-old Finns after a clinical examination [24]. The reason for this difference is unknown, but both the different assessment methods and differences between the analysed populations might be contributing factors.

After approximately 4.5 years of Tx plus retention at age 20.2 ± 5.6 years, an overall prevalence value of 6.7% was found for LGR with a magnitude ≥ 0.5 mm in the present sample with 2601 teeth. In the literature, a rate of 20.2% was described for a sample of 302 similarly aged orthodontic (mainly class II) patients after a similar observation period [23]. For class II:1 Herbst-MBA Tx, the respective value was 5.3% after 5 years of Tx and retention [18]. Overall, LGR prevalence values in the literature range between 1.6 and 13.8% for mainly untreated samples of similar age (Table 3) [7, 24,25,26,27,28,29].

Table 3 Labial gingival recession prevalence data available in the literature for adolescents/young adults. The reference number, sample characteristics, and prevalence values (%) of comparable samples (age) are given

The finding of particularly high LGR prevalence values of 12.2–13.1% for the lower central incisors is at least partially in accordance with the literature when looking at data of orthodontically treated class II:1 samples (Herbst-MBA Tx 12–17%, mean age 19 years [18]; only Herbst Tx 15–22%, mean age 14 years [15]). Specific reports for other appliances or other Tx protocols do not exist. Other samples of similar age (mean 18–29 years), which according to the references were mainly untreated, exhibit values between 2 and 9% [7, 27, 29]. Distinctly higher values of up to 33% were determined in a Brazilian urban population aged 14 to 29 years without information given on the history of orthodontic Tx [26]. Thus, the values determined for the lower incisors in the present sample after Herbst-MBA Tx are not distinctly higher than those for treated and untreated samples in the literature. It can therefore be assumed that Herbst-MBA Tx is not a risk factor for the development of LGR in class II:2 malocclusions.

The upper first premolars exhibited notably high LGR prevalence values of 13.1–20.2% as well. While these values are slightly higher than most available data in the literature for upper first premolars in orthodontically treated and untreated subjects of similar age ranging between 6.5 and 15.0% [7, 23, 27], a much higher prevalence value of 32.6% obtained from a sample of dental students has also been published [29]. For class II:1 patients with the identical Tx approach, the respective prevalence values of 8.0–8.5% were much lower [18]. Therefore, the cause for this high LGR prevalence in class II:2 seems not to lie in the Herbst-MBA Tx protocol but in the morphologic difference between class II:1 and class II:2 malocclusions. Class II:2 malocclusions typically feature a large maxillary apical base especially in the transverse dimension and relative to the lower jaw [30, 31]. In addition, class II:2 malocclusions often present small teeth in comparison to the well-developed jaw [32]. As a consequence, establishing a normal transverse upper to lower occlusal relationship and closing all spaces in the upper arch results in slightly palatally inclined premolars and molars, which in turn predisposes them to the development of LGR.

Results—incidence

Regarding the incidence of LGR with a magnitude ≥ 0.5 mm, an overall rate of 5.3% was found for the total observation period of approximately 4.5 years. Previous data in the literature report LGR incidence rates of 3–10% [15, 18, 33] for orthodontically treated and 8% [24] for orthodontically untreated samples.

Looking specifically at the lower central incisors, an overall incidence of 4.0–11.1% for LGR with a magnitude ≥ 0.5 mm was determined. This rate corresponds to data in the literature: 3.0% during class II:1 Herbst Tx [15], 10.4–11.4% during class II:1 Herbst-MBA Tx [18], and 7.0–10.0% during class I/II non-extraction Tx in adults [33].

Even though some articles in the literature conclude that orthodontic tooth movement might increase the risk for LGR development [6, 7, 34,35,36], the data of the current study and their comparison with the literature show that Herbst-MBA Tx cannot generally be considered a clinically relevant risk factor for LGR development. Above-average lesions may of course emerge in individual patients, but this is probably true of any kind of orthodontic Tx, especially as LGR are induced by more than a single factor [25, 37,38,39,40].

Results—magnitude

Post-retention, the mean LGR magnitude of the present sample was 0.06 ± 0.25 mm and therefore similar to that in other orthodontic patients (0.1 ± 0.1 mm, n = 222, 2.7 years post-Tx at age 14–19 years [18]; 0.1 ± 0.3 mm, n = 64, 4.6 years post-Tx at age 18–26 years [34]) and even smaller than in untreated populations (1.2/2.0 mm, untreated 20–21-year-old Norwegians/18–19-year-old Sri Lankans [25]).

Looking specifically at lower incisors, very little has been published so far. Nevertheless, the present magnitude (mean 0.0–0.1 mm, maximum 1.0 mm) is similar to or lower than that in class II:1 patients after Herbst-MBA Tx (mean 0.1–0.2 mm, maximum 4.0 mm [18]) and in other samples of orthodontically treated patients observed for 4–9 years, for which the values in the literature are ≈ 0.6 mm [41], 0.6–1.1 mm [12] and 0.9–1.0 mm [42]. The same holds for the corresponding values determined from an orthodontically untreated sample (mean 1.0–1.2 mm, maximum 3.0 mm [29]).

Limitations

The reduced number of complete sets of study casts when compared to the pre-Tx patient sample is certainly a limitation. In addition, the retrospective study design, which includes the fact that only study casts and no data on oral hygiene during Tx were assessed, limits the reliability of the data. And finally, the study design also comes with a high risk of reporting and performance bias, which should not be neglected even if the overall LGR incidence and magnitude values are low.

Conclusion

During class II:2 correction, the mean prevalence of teeth with LGR ≥ 0.5 mm increased from 1.4% before Tx to 6.7% after approximately 23 months of Herbst-MBA Tx plus 29 months of retention (≈ 4.5 years). The highest incidence was seen for lower central incisors and upper right premolars (11.1/14.9%). However, as the overall mean magnitude after Herbst-MBA Tx plus retention was 0.06 mm, the clinical relevance can be considered low to insignificant.