Introduction

Cognitive behavioral therapy (CBT) is a multiple-component approach that has the most empirical support of any treatment for childhood anxiety disorders (CADs; Kendall 1994; Kendall et al. 1997; Wang et al. 2017). Based on this research, efforts have been made to extend the reach of CBT through community-based effectiveness studies (Southam-Gerow et al. 2010), dissemination to practitioners (Beidas and Kendall 2010), shortened protocols (Beidas et al. 2013), technology-based delivery (Khanna 2014), and identifying outcome moderators (Nilsen et al. 2013). Unfortunately, many of these efforts have met with limited success (Higa-McMillan et al. 2017; Reid et al. 2018; Southam-Gerow et al. 2010; Whiteside et al. 2016a). One obstacle to successfully extending CBT from the laboratory to practice may be the multi-component nature of the intervention in the absence of understanding which components are necessary and sufficient for symptom improvement (Becker-Haimes et al. 2017; Weersing et al. 2009).

The format of CBT for CADs generally combines cognitive strategies (e.g., cognitive restructuring and problem solving), somatic techniques (e.g., emotion identification and relaxation exercises), and behavioral approaches (e.g., exposure and reinforcement; Higa-McMillan et al. 2016) delivered during face-to-face child appointments with some degree of parent involvement. The balance of these components varies greatly in terms of when and how exposure is initiated, what type of cognitive strategies are included, whether relaxation is included, and whether other modules are added, such as social skills training (Ale et al. 2015). A common format of CBT for CADs (Kendall 1994; Kendall et al. 1997), and the protocol included in the largest randomized controlled trial (Walkup et al. 2008), begins with six to eight sessions of anxiety management strategies (AMS; e.g., emotion identification, relaxation training, cognitive strategies) followed by six to eight sessions of exposure to feared stimuli. In spite of CBT’s dominance in the child anxiety treatment literature, there is surprisingly little consensus on what components or sequencing of components is optimal for CBT to be effective (Higa-McMillan et al. 2016).

The variability in content among CBT protocols likely results from the lack of evidence regarding which components are necessary and sufficient for treatment success. Exposure has the most expert (e.g., in survey responses, Stewart et al. 2016) and empirical (Peris et al. 2015, 2017) support as an active ingredient. In contrast, the value of other components has not been established. The sequencing of AMS followed by exposure is based upon the assumption that children require AMS to change maladaptive cognitions (Crawley et al. 2013; Kendall 1985; Kendall et al. 1997) or to tolerate exposure (Hirshfeld-Becker et al. 2010a; Manassis et al. 2010). However, this approach runs counter to current theories of exposure’s mechanism of action (i.e., inhibitory learning theory; Craske et al. 2014), which propose that learning is maximized when the patient is most anxious and strongly expects a negative outcome. Moreover, while cognitive restructuring is associated with acceleration in symptom improvement during treatment, relaxation is not (Peris et al. 2015). Similarly, a small randomized controlled trial (Whiteside et al. 2015) and the success of a modularized treatment (Chorpita et al. 2004) suggest that exposure without cognitive restructuring or relaxation strategies is not only feasible, but has the potential to be more effective and efficient than multi-component treatment.

Another issue regarding CBT for CADs is the limited success in extending its support beyond the initial efficacy studies. The evidence base of CBT lies primarily in its unequivocal superiority to no-treatment control (Wang et al. 2017) and at least one demonstration of superiority to pill placebo (Walkup et al. 2008). However, remission rates are low (below 50%; Ginsburg et al. 2011) and CBT has not been able to improve upon treatment as usual (TAU; Barrington et al. 2005; Ginsburg et al. 2012; Southam-Gerow et al. 2010). Similarly, CBT for CADs has not been found to reliably outperform active control conditions in meta-analyses (Ale et al. 2015; James et al. 2013), although there have been some promising studies (Beidel et al. 2007; Khanna and Kendall 2010; Silk et al. 2018; Wang et al. 2017). In addition, CBT has been found to be merely equivalent to medication, insufficient as a monotherapy for severe anxiety symptoms, and inferior to the combination of medication and CBT (Taylor et al. 2018; Walkup et al. 2008; see Wang et al. 2017a for modest advantages of CBT over medication). In the general absence of evidence supporting superiority over other active interventions, it is perhaps understandable that very few clinicians in non-research settings use CBT as described in the most common manuals for CADs (Whiteside et al. 2016a, b), although other factors, such as clinician resistance, likely affect adoption as well (Whiteside et al. 2016a).

The level of empirical support behind similar treatments for related disorders suggests that the incomplete success of CBT for CADs is not inevitable and may stem from the uncertainty regarding which components are essential. For instance, perhaps due to a greater focus on exposure versus relaxation, CBT for pediatric obsessive compulsive disorder (OCD) is significantly more effective than CBT for CADs, more effective than medication, and not necessarily enhanced by the addition of medication (Abramowitz et al. 2005; Ale et al. 2015; Storch et al. 2013). Moreover, evidence-based psychotherapies for youth across diagnoses (Weisz et al. 2013) and CBT for adult anxiety disorders (Stewart and Chambless 2009) are superior to TAU when administered in clinical settings, and the latter also outperforms other credible treatments (Clark et al. 2003; Gould et al. 1995; Simpson et al. 2008). The relative underperformance of CBT for CADs may stem from the vagaries of a multi-component format. For instance, when research-trained therapists (Chu et al. 2015; Southam-Gerow et al. 2010) and community clinicians (Higa-McMillan et al. 2017; Whiteside et al. 2016a, b) deliver CBT, they typically focus on relaxation and cognitive strategies and rely less on exposure, all of which are components in evidence-based manuals.

Given the unfulfilled potential of CBT for CADs, identifying the necessary and sufficient treatment components is arguably the most important question facing the child anxiety treatment field. Without knowing which components to emphasize and refine, CBT will be less effective for children in need (Weersing et al. 2009) and also continue to be too lengthy to be applied in clinical settings (Whiteside et al. 2016). Although there have been repeated calls for dismantling studies (e.g., Higa-McMillan et al. 2016; James et al. 2013; Kendall et al. 1997) which are considered the gold standard for identifying the contributions of various components, there has been limited interest in pursuing such studies. Instead, funding agencies have focused on establishing effectiveness in clinical settings, comparing different approaches to combining CBT with various medications, and examining digital applications of CBT (NIMH 2018; PCORI 2018). Although these studies, and those focusing on the development of abbreviated treatments (Beidas et al. 2013), are important lines of research, they are based on the assumption that efficacious treatments have been developed and necessary and sufficient components have been identified. Because CBT for CADs has not been firmly established as more effective than non-specific therapy, nor is it clear which components need to be included and emphasized, at this time, the field lacks the foundation to support successful extension studies.

To empirically encourage and inform the design of dismantling and other studies to develop more effective and efficient treatments for CADs, the present study examines the content of CBT protocols and the relation of the most common treatment components—exposure and AMS (i.e., relaxation and cognitive strategies)—to symptom improvement. On the basis of a previous analysis (Ale et al. 2015), we began with two a priori hypotheses. First, exposure and cognitive strategies will be included in at least 95% of protocols, whereas relaxation strategies will be included in less than 75%. Second, greater use of in-session exposure will be related to greater symptom reduction, while inclusion of relaxation at any point during treatment will be related to less symptom reduction compared to the absence of relaxation. We also examined other characteristics of treatment protocols, such as length, inclusion of parents, as well as use of groups, and explored their relation to symptom improvement without a priori hypotheses.

Methods

The present manuscript is part of an Agency for Healthcare Research and Quality (AHRQ) funded study, Anxiety in Children, that demonstrated the effectiveness of CBT and selective serotonin reuptake inhibitors for the treatment of a variety of anxiety disorders in youth. The American Psychological Association Meta-Analysis Reporting Standards guidelines were used for the study. The results of the initial meta-analysis have been published previously (Wang et al. 2017), as has a detailed report of the methods, study protocol, and analytic plan pre-approved by AHRQ (Wang et al. 2017b).

Search Strategy

The studies were all identified and included in a recent systematic review and meta-analysis of treatment for CADs (Wang et al. 2017a, b). Eight databases were searched including Ovid MEDLINE(R) In-Process & Other Non-Indexed Citations, Ovid MEDLINE(R), EMBASE, PsycINFO, Cochrane Central Register of Controlled Trials, Ovid Cochrane Database of Systematic Reviews, and SciVerse Scopus from database inception to February 1, 2017. Relevant systematic reviews and meta-analyses, conference proceedings, as well as reference mining of relevant publications, were used to identify additional existing and new literature. Pairs of reviewers screened titles and abstracts of all citations. Full texts of eligible studies were further screened for inclusion.

Study Selection Criteria

Studies from the previous meta-analysis were selected if they included (1) participants aged 3 to 18 diagnosed with an anxiety disorder including panic disorder, social anxiety disorder (or avoidant disorder), generalized anxiety disorder (or overanxious disorder), and separation anxiety (studies of solely specific phobia were excluded due to predominance of single session protocols); (2) assignment to a face-to-face CBT condition or comparator (the comparator could be a second CBT condition) within a randomized controlled trial; (3) a measure of anxiety symptoms; and (4) sufficient data reported to calculate an effect size. Studies were excluded if they (1) focused on comorbid anxiety (e.g., patients with autism and anxiety), (2) examined secondary outcomes of an earlier publication, (3) did not include direct contact with the child (e.g., parent-only interventions, online treatment), (4) examined virtual reality, or (5) combined CBT with medication or pill placebo (however, CBT-monotherapy conditions from combined studies were included, i.e., Walkup et al. 2008). Cognitive behavioral therapy was defined as attempts to change cognition and behavior consisting of some combination of cognitive restructuring, relaxation training, and exposure therapy.

Data Extraction

Data extraction for the initial meta-analysis followed a pilot-tested standardized data extraction form including the following information: author, study design, inclusion and exclusion criteria, patient characteristics, interventions, comparisons, outcomes, and related items for assessing study quality. Data extraction and quality assessment were completed by pairs of independent reviewers. Data from the original manuscript were used for the treatment outcome measure of primary anxiety symptoms, defined as standardized measures of child anxiety symptoms completed by the child, parent, or an independent evaluator (IE).

For the current analyses, treatment protocols were coded for the following characteristics: participants (child only if the parent participation was limited to check-ins, parent and child together, parent and child in separate sessions); format (group vs. individual vs. combined); number of sessions; length of sessions; the presence of relaxation strategies (activities to engage the child in techniques such as diaphragmatic breathing or progressive muscle relaxation) during at least one appointment; the presence of stand-alone (i.e., implemented separately from exposure) cognitive strategies (e.g., problem solving, restructuring, or other thought-based strategies) during at least one appointment; the presence of exposure (e.g., in vivo or imaginal, as well as behavioral experiments) either in-session or assigned outside of session during at least one appointment; and the number of sessions (and proportion of total sessions) that included in-session exposure (i.e., exposure conducted during the session, such as therapist-assisted exposure; assigning exposure as homework was not counted as in-session exposure). Relaxation and stand-alone cognitive strategies represent anxiety management strategies (AMS). Information was gathered initially from the source article. When insufficient information was presented in the article, additional information was gathered from treatment manuals, referenced articles, and/or from the original author. Because the design of protocols is of interest here and information on fidelity is rarely published (Higa-McMillan et al. 2016), the coding reflects the components prescribed in the protocol, as opposed to how the treatment was delivered.

The treatment components central to the hypotheses (i.e., relaxation, stand-alone cognitive strategies, exposure) were coded independently by two child psychologists (the first and second authors), blind to each study’s effect sizes and to the other’s coding. Discrepancies between the two coders were resolved by discussion and review of the relevant materials. Items requiring resolution through consensus were then coded by a third child psychologist (fifth author) who was blind to the initial coding and had been independent of the project up to that point. Intercoder agreement was good for items with sufficient variance to assess reliability: relaxation (kappa = .89 between the initial two coders; 100% agreement between consensus and the third coder) and amount of in-session exposure (intraclass correlation coefficient based on a mean-rating, absolute-agreement, two-way random-effects model = .81 between the initial two coders, and .95 for consensus and the third coder).

Because the vast majority of protocols included any exposure (in-session or assigned), in-session exposure (conducted in the session), and cognitive strategies, these variables had low variance and thus kappa would likely underestimate reliability (Feinstein and Cicchetti 1990; Hallgren 2012; Viera and Garrett 2005). For these variables, agreement on the presence and absence of the components was evaluated separately using the proportions of positive and negative agreement, Ppos and Pneg (Cicchetti and Feinstein 1990). For the presence of any exposure, the agreement between the initial two coders on the presence was high (Ppos = .99), while agreement on the absence was low (Pneg = .40). The three items of disagreement were resolved through discussion and the third coder had 100% agreement with the consensus decision. For the presence of any cognitive strategies, the agreement on the presence was high (Ppos = .99), while agreement on the absence was moderate (Pneg = .67). The three items of disagreement were resolved through discussion and then a second round of discussion with the third coder. For the presence of in-session exposure, the agreement on the presence was high (Ppos = .88), while agreement on the absence was low (Pneg = .45). Of the 22 rating disagreements, 10 were clerical errors, 11 were resolved after consulting supportive material, and 1 involved deciding that behavioral experiments constituted exposure. Agreement between the consensus rating and third coder on the consensus items was moderate (Ppos = .56, Pneg = .73). The remaining eight discrepancies were resolved by consensus of all three coders.
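To illustrate why kappa can understate reliability when a component is present in nearly every protocol, the following minimal sketch computes kappa, Ppos, and Pneg from a 2 × 2 agreement table. The function name and the cell counts are hypothetical; the counts were chosen only to resemble the pattern described above and are not the study's coding data.

```python
def agreement_indices(a, b, c, d):
    """Agreement statistics for two coders rating presence/absence.

    a: both coders rated the component present
    b: coder 1 present, coder 2 absent
    c: coder 1 absent, coder 2 present
    d: both coders rated the component absent
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement implied by each coder's marginal proportions.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    # Proportions of specific positive and negative agreement.
    p_pos = 2 * a / (2 * a + b + c)
    p_neg = 2 * d / (2 * d + b + c)
    return {"kappa": kappa, "p_pos": p_pos, "p_neg": p_neg}


# Hypothetical counts (assumption, not study data): 111 protocols,
# three disagreements, and one protocol both coders rated as absent.
result = agreement_indices(a=107, b=2, c=1, d=1)
print({name: round(value, 2) for name, value in result.items()})
```

With these illustrative counts the coders agree on 108 of 111 protocols, yet kappa is only about .39 because chance agreement is very high, whereas Ppos and Pneg show that the disagreement is concentrated in the rare "absent" category.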

Risk of Bias Assessment

We evaluated the risk of bias of the included studies using the Cochrane Risk of Bias tool (Higgins and Green 2011).

Data Synthesis and Analyses

We first used descriptive analyses to examine the protocol duration (number of sessions multiplied by length of sessions), format, and treatment components. Correlations, analyses of variance (ANOVAs) with post hoc least significant differences (LSD) comparisons, and independent samples t tests were conducted to examine the relation between characteristics and time of publication (in order to accurately document the current state of the treatment literature), as well as between characteristics and treatment duration. To examine the association between treatment characteristics and outcome, we selected the continuous measures of primary anxiety symptoms from the original meta-analysis (Wang et al. 2017) as the outcome measure because such measures are more ubiquitous in source RCTs and maximize variability, and thus power, compared to dichotomous measures, such as diagnostic status. We calculated between-group effect sizes measured by the difference in postintervention scores between CBT groups and no-treatment control groups using standardized mean difference (SMD) based on Cohen’s d and related standard error. Effect sizes were computed separately for child, parent, and independent evaluator report. Negative coefficients indicate that the presence of the component (a positive value) is associated with greater decrease in symptoms (a negative value).
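As a rough illustration of the between-group effect size described above, the sketch below computes Cohen's d from postintervention means with a pooled standard deviation, together with a commonly used large-sample approximation of its standard error. The function name and the input values are hypothetical, and the exact variance formula used in the source meta-analysis is not stated here, so this should be read as one standard formulation rather than the authors' computation.

```python
import math


def cohens_d(mean_cbt, sd_cbt, n_cbt, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference (CBT minus control) and its SE.

    Negative d means lower postintervention symptom scores in the
    CBT group than in the control group.
    """
    pooled_var = (
        (n_cbt - 1) * sd_cbt ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    ) / (n_cbt + n_ctrl - 2)
    d = (mean_cbt - mean_ctrl) / math.sqrt(pooled_var)
    # Large-sample approximation of the sampling variance of d.
    var_d = (n_cbt + n_ctrl) / (n_cbt * n_ctrl) + d ** 2 / (2 * (n_cbt + n_ctrl))
    return d, math.sqrt(var_d)


# Hypothetical trial (assumption): symptom score 20 vs. 25, SD 10, n = 30 per arm.
d, se = cohens_d(20.0, 10.0, 30, 25.0, 10.0, 30)
print(round(d, 2), round(se, 3))
```

Under this sign convention a more negative d reflects a larger advantage for CBT, which is why a negative meta-regression coefficient for a component corresponds to greater symptom reduction.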

Simple random-effects meta-regressions were used to examine the relation between treatment characteristics and effect size using Stata Version 14.1 software. Multivariate meta-regression was not used due to the small sample sizes. Significance was considered at P < .05. One-tailed tests were used for a priori hypotheses regarding the association of exposure and relaxation with outcome; two-tailed tests were used for exploratory analyses with no a priori hypotheses. Because fewer than one-third of the studies included a no-treatment control group, the sample size and power were limited, raising the risk of Type II error. Therefore, we repeated the analyses within the entire sample of source RCTs with effect sizes calculated on the change from pre- to postintervention, using correlations from previous literature to account for the lack of independence between time points. Despite inherent statistical limitations, pre- to post-treatment effect sizes can be useful when the limitations are taken into consideration (Cuijpers et al. 2017). We did not examine between-group effect sizes with active control groups due to the small number of studies and heterogeneity of content.
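The logic of a simple (single-moderator) random-effects meta-regression can be sketched as follows. This is not the Stata procedure used in the study: the method-of-moments estimate of between-study variance, the normal approximation for the one-tailed test, the function names, and the example data are all assumptions made for illustration.

```python
import math


def _wls(y, x, w):
    """Weighted least squares for y = b0 + b1 * x; returns fit and sums."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    b0 = (swxx * swy - swx * swxy) / det
    b1 = (sw * swxy - swx * swy) / det
    return b0, b1, det, sw, swx, swxx


def meta_regression(y, v, x):
    """Random-effects meta-regression of effect sizes y (variances v) on x."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    b0, b1, det, sw, swx, swxx = _wls(y, x, w)
    # Residual heterogeneity statistic from the fixed-effect fit.
    q_resid = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, yi, xi in zip(w, y, x))
    s2 = sum(wi ** 2 for wi in w)
    s2x = sum(wi ** 2 * xi for wi, xi in zip(w, x))
    s2xx = sum(wi ** 2 * xi * xi for wi, xi in zip(w, x))
    trace_term = (swxx * s2 - 2 * swx * s2x + sw * s2xx) / det
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q_resid - (k - 2)) / (sw - trace_term))
    # Refit with random-effects weights 1 / (v_i + tau^2).
    w_re = [1.0 / (vi + tau2) for vi in v]
    b0, b1, det, sw, swx, swxx = _wls(y, x, w_re)
    se_b1 = math.sqrt(sw / det)
    z = b1 / se_b1
    # One-tailed P for the directional hypothesis b1 < 0 (normal approximation).
    p_lower = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return {"b0": b0, "b1": b1, "se_b1": se_b1, "tau2": tau2, "p_one_tailed": p_lower}


# Hypothetical data (assumption): x = 1 if the protocol includes the component.
example = meta_regression(
    y=[-0.2, -0.2, -0.8, -0.8], v=[0.04, 0.05, 0.04, 0.05], x=[0, 0, 1, 1]
)
print({name: round(value, 3) for name, value in example.items()})
```

In this toy example the slope is negative, meaning that protocols with the component show larger symptom reductions. With a binary moderator the slope is simply the difference between the two pooled subgroup effects.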

Results

The original meta-analysis search identified 27,250 potential studies (after removal of duplicates), of which 24,141 were excluded based on title and abstract review. For the remaining 3109 studies, the full text was reviewed and 3014 were removed because they did not meet the inclusion criteria. This review resulted in the 95 comparative effectiveness studies included in the original meta-analysis (Wang et al. 2017). From these, 20 were removed because they did not meet the additional inclusion criteria for the current study (e.g., medication trials). This final step resulted in 75 RCT studies including 111 conditions published between 1994 and 2016 that included youth with an anxiety disorder, a CBT condition, and a comparison condition. These studies included 5412 patients with an average age of 11.27 years (range of average age from 5.8 to 15.8) who were 51.16% female and 77.22% Caucasian. Sample sizes ranged from 11 to 488. Fifty-one studies included patients with separation anxiety disorder, 55 with generalized anxiety disorder, 57 social anxiety disorder, 41 specific phobia, and 27 panic disorder. A table in the appendix (Table 5) presents all the included studies with references and coded characteristics. We found that the risk of bias of the included studies was moderate to high due to lack of blinding of outcome assessors. Risk of bias was unrelated to protocol components, P’s > .16.

Treatment Characteristics

Duration

Descriptive information of the CBT protocols is presented in Table 1. The protocols consisted most commonly of approximately 12, 1-h sessions. The quartile distribution for number of sessions was 11 or fewer sessions (15.32%), 12 sessions (31.53%), 13 to 17 sessions (27.03%), and 18 to 32 sessions (25.23%). For total time duration, the quartile distribution was 4 to 13.5 h (25.45%), 14 to 17 h (25.45%), 18 to 20 h (28.18%), and 21 to 49.93 h (20.91%). There were no significant correlations between publication year and number of sessions, length of sessions, or total duration (r < 0.15, P > .12).

Table 1 Characteristics of CBT protocols from RCTs: length and participants

Participants

Children were most commonly treated with minimal parent inclusion, with approximately one-quarter of the protocols working with children and parents separately, and the remaining working with parents and children together (Table 1). The proportion of protocols that worked primarily with children alone was not related to publication year (r = − 0.23, P = .33), with every year except 1999 and 2015 having at least 50% primarily child-alone. An ANOVA with post hoc LSD comparisons indicated that the average number of total sessions differed by participants, F (2, 107) = 11.89, P < .001, with protocols using separate child and parent sessions having more appointments, 18.46 sessions (SD = 4.98) than both child-only (13.71 sessions; SD = 4.29) and combined parent and child (13.62 sessions; SD = 4.23; all P < .01) protocols, while the latter two did not differ significantly. The same pattern was present for the total treatment duration, F (2, 106) = 3.55, P < .05, with protocols using separate child and parent sessions requiring more total treatment time, 21.36 h (SD = 7.35) than both child-only (17.32 h; SD = 8.08) and combined parent and child (15.83 h; SD = 4.21; all P < .05) protocols.

Slightly more than one-half of the protocols were delivered in a group setting, with the others delivered to individuals and a small number with a combined format. The proportion of protocols delivered in individual versus group format (combined protocol was excluded because of small numbers) was not significantly related to publication year (r = − 0.20, P = .39). A series of independent samples t tests comparing the length of individual versus group treatment protocols indicated that individual-based protocols had shorter average session length, 56.98 min (SD = 10.172) versus 90.40 (SD = 51.15), t (101) = 4.22, P < .001, and total treatment time, 14.06 h (SD = 3.80) versus 19.34 h (SD = 7.30), t for variances not assumed to be equal (93.34) = 4.76, P < .001. Individual and group settings did not differ significantly in number of sessions, 14.98 (SD = 4.03) versus 13.87 (SD = 4.55).

Components

Information on protocol treatment components is presented in Table 2. All but two protocols used exposure (either in-session or assigned outside of session), with most including in-session exposure (e.g., completed with therapist in the session) and stand-alone cognitive strategies, and one-half including relaxation. Approximately 6% of protocols did not include any stand-alone cognitive strategies or relaxation (i.e., no AMS). The proportion of sessions within protocols that included in-session exposure varied widely and averaged approximately one-third. The quartile distribution for amount of in-session exposures was none (23.85%), 6% to 41.67% (25.69%), 42% to 59% (24.77%), and 64% to 86% (25.69%). The inclusion of treatment components over time is presented in Fig. 1. Publication year was not significantly related to the proportion of sessions including in-session exposure (r = − 0.11, P = .26) or the number of in-session exposure sessions (r = − 0.18, P = .068; not included in figure). The proportion of protocols including stand-alone cognitive strategies was not significantly related to year (r = − 0.35, P = .12). A significant decrease over time was observed in proportion of protocols including relaxation at any time (r = − 0.71, P < .001). However, seven of 22 studies, approximately one-third, published over the final 5 years included relaxation without a non-relaxation comparator.

Table 2 Therapeutic components of CBT protocols from RCTs
Fig. 1 Proportion of sessions including in-session exposure and proportion of protocols including stand-alone cognitive strategies or relaxation by year of publication

Protocols that included relaxation at any time had a similar number of appointments with in-session exposure as those that did not include relaxation, 4.66 (SD = 3.63) versus 5.47 (SD = 3.60), respectively. Protocols that included stand-alone cognitive strategies had significantly fewer appointments with in-session exposure than protocols that did not include stand-alone cognitive strategies, 4.78 (SD = 3.52) versus 9.29 (SD = 2.36), respectively, t for variances not assumed to be equal (7.96) = 4.70, P < .01. An ANOVA with post hoc LSD comparisons indicated that the number of appointments that included in-session exposure differed by inclusion of parents, F (2, 106) = 19.04, P < .001, with protocols having separate parent sessions, 2.07 (SD = 3.28), including fewer in-session exposure sessions than those including children alone, 6.41 (SD = 2.98), or combined sessions, 4.68 (SD = 3.56), all P’s < .05, while the latter two did not differ significantly.

Treatment Outcome

The associations of treatment characteristics with between-group and within-CBT-group pre- to post-treatment effect sizes are presented in Tables 3 and 4, respectively. The primary analyses indicated that protocols with any in-session exposure (e.g., completed with therapist during a session) were significantly associated with greater between-group effect sizes compared to protocols without any in-session exposure based on child and parent report. Similarly, the coefficients examining the amount of in-session exposure sessions indicated that more in-session exposure was significantly related to larger between-group effect sizes for all three reporters. However, within the secondary analysis, the presence and amount of exposure were significantly related to larger effect sizes only for child report. In contrast, the primary analyses indicated that the presence of relaxation strategies in protocols was not significantly related to between-group effect sizes. However, within the secondary analysis, the presence of any relaxation strategies within a protocol was significantly associated with smaller pre- to post-treatment effect sizes for all three reporters compared to protocols without relaxation. The coefficients for cognitive strategies were non-significant across the primary and secondary analyses. Additional findings that were not hypothesis driven were as follows: year of publication was not significantly associated with effect size; longer treatment duration was significantly associated with larger pre- to post-treatment effect sizes for parent report; group treatment was significantly associated with greater between-group effect sizes for child and IE report, as well as with pre- to post-treatment effect sizes for child report; and inclusion of parents was significantly associated with smaller pre- to post-treatment effect sizes for child report.

Table 3 Primary analyses: treatment characteristics and outcome for between-group effect sizes (CBT vs. No-treatment control)
Table 4 Secondary analyses: treatment characteristics and outcome for pre- to post-treatment effect sizes (All studies)

Discussion

The present meta-analysis examined characteristics and components of CBT for CADs and the association of these components with treatment outcome. In general, CBT consisted of 12 to 18 one-hour sessions in either a group or individual format involving the child with minimal parent involvement. Treatment components almost always consisted of exposure (when defined as either in session or assigned outside of session), stand-alone cognitive strategies (i.e., delivered apart from exposure), and frequently relaxation. Over the past quarter century, the inclusion of these components in CBT for CADs has stayed stable, although the inclusion of relaxation has declined. Within this homogeneity, there is considerable variability in the emphasis on and application of these components; some protocols did not include relaxation, some protocols did not include the completion of in-session exposure and others included in-session exposure in the majority of appointments. Given our findings that these primary treatment components are differentially associated with symptom improvement, this inconsistency is likely to have significant consequences for the overall efficacy of CBT for CADs, as well as studies of effectiveness, dissemination, and technology-based delivery.

Overall, this study supported the hypothesis that more in-session exposure (e.g., therapist-assisted exposure) was associated with larger treatment effects. Of note, the benefit of more in-session exposure was found within the context of almost all protocols including exposure (either in-session or as homework) and emphasizes the importance of exposure dosage. This potential benefit of in-session exposure is consistent with expert consensus and a growing body of empirical work suggesting that exposure is the active treatment ingredient of CBT for CADs (Peris et al. 2017; Stewart et al. 2016). Beyond this prior work, the current findings provide the most extensive empirical support to date for the importance of exposure. In spite of the importance of in-session exposure, this component was delivered in only a third of sessions, with almost one-fourth of protocols not including it at all, and the amount of in-session exposure has not changed over time. These findings suggest that the importance of in-session exposure has not been adequately translated into treatment design. Moreover, there is little to no information about the quality of exposure delivery, which is likely to vary between therapists using the same protocols (e.g., POTS 2004). Variability in quality and intensity of exposure delivery, or introduction of increased heterogeneity between studies from using pre-post effect sizes, may have contributed to the limited statistical significance in the secondary analyses.

In contrast to the support for in-session exposure, the present data did not support the benefit of including relaxation exercises. In particular, 50% of protocols did not include any relaxation, which contradicts the argument that youth require coping skills to be able to tolerate exposure (Hirshfeld-Becker et al. 2010; Manassis et al. 2010). Perhaps more concerning is that the presence of relaxation within protocols was associated with less effective treatment, a finding consistent with previous research (Ale et al. 2015; Whiteside et al. 2015). Although this finding only reached statistical significance in the secondary analyses, the fact that the primary effect sizes (i.e., between group) were of similar magnitude and that two of three could be defined as trending toward significance suggests that the lack of statistical significance in the primary analyses more likely resulted from lack of power than from lack of effect. The mechanism for a potential detrimental effect is unclear, but could result from protocols that include relaxation prescribing less in-session exposure, although this association was not found here. Alternatively, including relaxation in a protocol could lead therapists and patients to use this strategy instead of more demanding exposure (Becker-Haimes et al. 2017). Finally, relaxation may reduce the effectiveness of exposure by interfering with learning to accept anxious feelings (Craske et al. 2014). Despite the fact that such processes are speculative and the current analyses cannot demonstrate a causal relationship, the present study observed no evidence supporting the benefit of adding relaxation to treatment protocols.

Unlike in-session exposure and relaxation, we were not able to identify a discernible effect of stand-alone cognitive strategies on treatment outcome. The ubiquity with which protocols include stand-alone cognitive strategies and our broad operationalization of this component likely reduced our ability to detect significant differences. In addition, the wide patient age-range may have obscured developmental differences in the application of, and response to, cognitive interventions. However, the lack of clear benefit coupled with the fact that 5% of protocols did not include stand-alone cognitive strategies challenges the assumption that this component is a necessary precursor for children to learn from exposure (Crawley et al. 2013; Kendall 1985; Kendall et al. 1997). Moreover, none of the three studies that directly compared CBT with and without cognitive strategies found additive benefit of this component (Rosa-Alcazar et al. 2013a; Sanchez-Garcia and Olivares 2009; Whiteside et al. 2015). Mechanistically, inclusion of stand-alone cognitive strategies was associated with fewer in-session exposures, which could in turn reduce effectiveness. Although the effect was not observed with cognitive strategies, the potential for a component to crowd out exposure was supported by the observation that inclusion of parents was associated with fewer in-session exposures and smaller treatment effect, at least in one analysis.

Limitations

The results of the current study must be interpreted in the context of the following limitations. The coding of components was limited by the information that could be identified in the original articles and available source materials. This process was complicated by the often brief descriptions of protocols in the source RCTs, changes in how the same protocol was administered in different RCTs, Spanish-language materials, and difficulties collecting supporting materials for studies that had been conducted many years ago. Despite these limitations, agreement between coders was generally high and the coders were able to reach consensus on all decisions. Moreover, as opposed to subjective ratings (e.g., competency implementing procedures), the present ratings reflect the more objective content of the original protocol, and the coding decisions presented in the appendix can be reviewed for accuracy. The component categories were also a limitation because each included multiple techniques that may have different levels of efficacy. This is particularly true for exposure, which likely included exercises of limited potency (e.g., imaginal exposures in preparation for in vivo exposures), and cognitive strategies, which combined a wide variety of techniques, some of which may have differing efficacy based on the child’s developmental level. Moreover, the current study was limited to the most common treatment components (Higa-McMillan et al. 2016), and future research should examine others, such as behavior management, as well as the efficacy of specific techniques within broader categories. Relatedly, because the source studies did not consistently measure fidelity (Higa-McMillan et al. 2016), component presence reflects what was prescribed in the manual, rather than what was implemented. As such, information about the quality of in-session exposure application was not available.

The statistical analyses, both the primary and secondary, also limit the conclusions drawn from the current study. By requiring the presence of a waitlist control group, the primary analyses suffer from limited sample size and power. In contrast, by not including data from the control group, the secondary analyses do not control for differences between studies. The fact that no variable was significantly associated with treatment outcome across all three reporters in both analyses likely reflects those limitations. However, the high degree of consistency in effect size direction across reporters and analyses for in-session exposure and relaxation supports the current interpretations. More broadly, the study design and statistical analyses allow for examination of correlational associations, but cannot establish causality. Therefore, although the results provide guidance regarding what changes to CBT protocols are likely to meaningfully improve outcomes, these data-driven hypotheses need to be examined directly through dismantling studies.
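The trade-off between the two kinds of effect size can be illustrated with a minimal sketch. It assumes common textbook forms of the standardized mean difference (a pooled-SD between-group difference at post-treatment and a pre-post change scaled by the pre-treatment SD); these are not necessarily the exact estimators used in the present analyses, and every number below is hypothetical and chosen only for illustration.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))


def between_group_d(treat_post, treat_sd, treat_n, ctrl_post, ctrl_sd, ctrl_n):
    """Post-treatment difference between waitlist and treatment, in pooled-SD units.

    Positive values mean lower (better) anxiety scores in the treatment group.
    """
    return (ctrl_post - treat_post) / pooled_sd(treat_sd, treat_n, ctrl_sd, ctrl_n)


def pre_post_d(pre_mean, post_mean, pre_sd):
    """Within-group change from pre to post, in pre-treatment SD units."""
    return (pre_mean - post_mean) / pre_sd


# Hypothetical anxiety scores (lower is better); not data from any source RCT.
d_between = between_group_d(treat_post=18.0, treat_sd=8.0, treat_n=25,
                            ctrl_post=26.0, ctrl_sd=8.0, ctrl_n=25)
d_treat_prepost = pre_post_d(pre_mean=30.0, post_mean=18.0, pre_sd=8.0)
d_wait_prepost = pre_post_d(pre_mean=30.0, post_mean=26.0, pre_sd=8.0)

print(f"between-group d:      {d_between:.2f}")
print(f"treatment pre-post d: {d_treat_prepost:.2f}")
print(f"waitlist pre-post d:  {d_wait_prepost:.2f}")
```

In this hypothetical case the pre-post effect in the treatment arm is larger than the between-group effect because it also absorbs the improvement seen on the waitlist. Pre-post effect sizes can be computed for every study, which increases the number of usable studies, but they cannot separate treatment-related change from change that would have occurred anyway.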

The source RCTs also present some limitations. As is standard in the field of child anxiety disorders, the included studies covered multiple anxiety disorders in different combinations, and thus it is possible that certain protocols were more effective because they were implemented with disorders that are more responsive to treatment. Moreover, the results are limited by the available data, which in this case involves few studies designed to directly compare the relative contributions of components, as well as variability in outcome measures. As such, determining which components are necessary and sufficient requires dismantling studies, and future studies should examine the relation between components and functional outcomes as well as diagnostic status. Finally, the current study does not include studies that may have been published most recently. However, the original literature search was one of the most thorough and inclusive to date, and relying on this search reduces the risk of bias when selecting source RCTs to include in a meta-analysis.

Despite these limitations, the current study has implications for interpreting the current literature. To begin with, there is significant and likely meaningful variability in the content of treatment referred to as CBT for CADs. Given the apparent differential efficacy of exposure and relaxation, if CBT as a term includes interventions consisting of relaxation without in-session exposure and interventions consisting of predominantly in-session exposure and no relaxation, the label has lost much of its meaning. If the current results supporting the relative importance of exposure are replicated in direct investigations, perhaps the term “exposure-based therapy” more clearly communicates interventions most likely to be successful. The relevance of this shift in terminology is illustrated by the fact that the amount of in-session exposure in protocols has not increased over the past 20 years despite longstanding emphasis on this technique from experts (Kendall et al. 2005).

Second, conclusions based upon existing research employing the traditional CBT protocol for CADs, i.e., AMS (relaxation and cognitive strategies) followed by exposure, should be examined more skeptically. Because this format of CBT includes relaxation and underemphasizes in-session exposure, it may be underpowered. As such, estimates of the magnitude of CBT’s effect on symptoms and diagnostic remission, ability to improve upon TAU, comparative effectiveness to medication, and adequacy as a monotherapy for severe anxiety may improve with re-examination using a more focused exposure-based therapy. In fact, protocols that emphasize implementing in-session exposure without relaxation have already demonstrated superiority to medication (Beidel et al. 2007). Moreover, when evaluating the effect of treatment format (e.g., group vs. individual or child vs. parent; see Manassis et al. 2014), attention should be paid to the effect on other components (e.g., parent anxiety management at the expense of in-session exposure). Finally, findings regarding therapists’ willingness and ability to adopt evidence-based treatment into community practice should be interpreted in light of the fact that the research field has neither decided, nor clearly communicated, what must be included in evidence-based treatment (Becker-Haimes et al. 2017).

To address these challenges, the implications of this study need to be tested through direct experimental manipulation. The current study strongly points to the need for dismantling studies to guide the development of CBT beyond the predominant AMS followed by exposure model. Specifically, the current study provides an empirical basis to inform the design of dismantling studies to maximize the likelihood of success (e.g., exposure only vs. exposure after relaxation and cognitive restructuring). Moreover, the current data suggest that simply lengthening treatment by adding more components or changing format without considering the effect on other components (e.g., including parents at the expense of exposure) is unlikely to improve outcomes. If the current results are confirmed, enhancing therapy for CADs by eliminating relaxation and increasing in-session exposure not only has the potential to address the comparative effectiveness questions mentioned above, but can also facilitate the development of technology-based delivery, abbreviated treatments, and dissemination efforts. Specifically, rather than replicating the modest success of multi-component CBT in new platforms, studying exposure-based therapy has the potential to improve upon existing treatment options. If preferable, dismantling designs can be incorporated into extension research, such as comparing shortened treatments that are exposure-focused versus multiple-component protocols. Moreover, future research can focus on improving the delivery of exposure to maximize its effectiveness (Benito et al. 2018; Deacon et al. 2013). Finally, there should be further efforts to determine whether disseminating exposure as an evidence-based principle, including during clinical training programs, is more effective than the current efforts to disseminate multi-component CBT.

The current study marks an important addition to the field of CADs treatment. The literature to date has been sufficiently broad and varied to empirically examine which components of CBT are most likely to enhance or diminish symptom improvement. If the current results are confirmed upon direct examination, exposure-based therapy for CADs has the potential to be more effective, efficient, and amenable to dissemination than previous multi-component protocols. Perhaps if researchers begin to focus on methods for improving the delivery of in-session exposure without moderation from other components, psychotherapy can become the clear first-line monotherapy option for CADs.