FormalPara Key Points

We found controlled studies for only nine out of 87 medications of interest; 21 randomized controlled trials provided data on 1768 participants.

Second-generation antipsychotics, anticonvulsants, and antidepressants were not able to consistently reduce the severity of borderline personality disorder.

Low- and very-low-certainty evidence indicates that anticonvulsants can improve anger, aggression, and affective lability; however, the evidence is mostly limited to single studies.

1 Introduction

Borderline personality disorder (BPD) is a debilitating psychiatric disorder, characterized by a long-term pattern of instability of interpersonal relationships, distorted self-image, marked impulsivity, and affective instability. Individuals with BPD have significant functional impairment, high rates of comorbid mental disorders, substance use, deliberate self-harm, and suicidal ideation and behavior [1, 2]. By Diagnostic and Statistical Manual of Mental Disorders (DSM-5) definition, BPD has an onset in adolescence or early adulthood, with enduring patterns of inner experience and behavior that deviate markedly from societal and cultural norms, and are stable and inflexible [3]. The International Statistical Classification of Diseases and Related Health Problems (ICD-10) [4] refers to BPD as emotionally unstable personality disorder but has similar diagnostic criteria to the DSM-5.

The exact etiology of BPD is still unclear and is likely multifactorial and heterogeneous; current explanations assume the stress-diathesis model, with an interaction between the experience of traumatic events during childhood (e.g., sexual abuse, neglect) and genetic factors [5]. Symptoms of BPD often first appear during adolescence [6]. Although the majority of individuals with BPD experience a decline of symptoms during adulthood and about 85% reach diagnostic remission within 10 years after diagnosis [7], specific symptoms, such as fear of abandonment, impulsivity, intense anger, and an unstable self-image, can persist over a lifetime and affect social functioning. Individuals with BPD commonly have other mental disorders, such as depression, anxiety, post-traumatic stress disorder, substance use disorder, and eating disorders. They frequently face social stigma, have poor social and occupational outcomes [8], and have a substantial risk for premature death through suicide [9].

The estimated prevalence of BPD in the general population in Western countries ranges between 0.4 and 3.9% [10]. Women are more frequently diagnosed with BPD than men, but it is unclear whether BPD is actually more common in women than men. In clinical psychiatric populations, the prevalence of BPD is high and estimated at 10% for outpatients and 15–25% for inpatients [11, 12]. Individuals with BPD are also frequent users of general primary care. The lifetime prevalence of BPD among primary care patients is about four times higher than in the general population [13]. Consequently, the societal costs of BPD are substantial; the annual direct healthcare costs and indirect costs in terms of lost productivity are >16 times higher among patients with BPD compared with matched controls without BPD [14].

According to a 10-year follow-up study in the United States, an estimated three-quarters of patients with BPD seek help from professional mental healthcare services [15]. Clinical practice guidelines recommend psychotherapies as first-line treatments for BPD [16,17,18,19], in particular, dialectical behavior therapy (DBT), a structured and manualized therapy.

Currently, no medications have been approved by regulatory agencies for the treatment of BPD. Nevertheless, up to 96% of patients with BPD who seek treatment receive at least one psychotropic medication [20] and polypharmacy for BPD is common [21, 22]. Almost 19% of patients with BPD report taking four or more psychotropic medications [23]. Recommendations of clinical practice guidelines regarding pharmacotherapy vary. The National Institute for Health and Clinical Excellence (NICE) in the United Kingdom [24] and the Australian National Health and Medical Research Council [25] recommend avoiding pharmacotherapies as first-line treatments except in acute crisis. Other professional societies or consensus statements view pharmacotherapies as adjunctive treatments, mainly to target symptoms of BPD, such as anger, aggression, and impulsiveness, or symptoms and comorbidities that are commonly associated with BPD, such as anxiety or depression [26,27,28]. Table 1 summarizes medication classes commonly used to treat symptoms associated with BPD.

Table 1 Medication classes used to treat symptoms of borderline personality disorder [26, 27]

The last systematic assessment of the efficacy and risk of harms of pharmacotherapy for the treatment of BPD was a Cochrane review in 2010 [29]. It concluded that second-generation antipsychotics and anticonvulsants have beneficial effects on individual symptoms of BPD, although the evidence was mostly based on single studies [29]. In 2017 and 2020, journal publications of focused updates of the Cochrane review did not formally assess the risk of bias of new studies and the certainty of the evidence [30, 31].

The objective of this systematic review was to support the American Psychiatric Association (APA) in developing clinical practice guidelines on the appropriate use of pharmacological and nonpharmacological treatments for patients with BPD. This manuscript summarizes the general efficacy and the comparative effectiveness of different pharmacological treatments for BPD patients.

2 Methods

The methods for this systematic review followed the Agency for Healthcare Research and Quality (AHRQ) Methods Guide for Effectiveness and Comparative Effectiveness Reviews (available at http://www.effectivehealthcare.ahrq.gov/methodsguide.cfm) and the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist [32]. We registered the protocol of this review on PROSPERO (Registration #: CRD42020194098).

Our review addressed the following key questions:

  • In patients with borderline personality disorder, what is the efficacy, comparative effectiveness, and risk of harms of various pharmacological therapies?

  • Are there differences in efficacy, comparative effectiveness, or risk of harms regarding different subgroups based on age, gender, race/ethnicity, or genotypes?

Figure 1 presents the analytic framework for our key questions.

Fig. 1 Analytic framework

2.1 Literature Searches

We built our search strategy on an earlier search commissioned by APA to identify studies through December 2017 (Supplementary Table 1, see electronic supplementary material [ESM]). To maintain recall, we confirmed that our search strategy still detected all studies that met the inclusion criteria of the original search. We searched MEDLINE, EMBASE, the Cochrane Library, and PsycINFO from January 1, 2018, to April 6, 2021, using a variety of terms, medical subject headings (MeSH), and major headings limited to English language and human-only studies (Supplementary Table 1, see ESM).

To minimize retrieval bias, we manually searched reference lists of landmark studies and background articles on this topic for relevant citations that electronic searches might have missed.

2.2 Criteria for Inclusion/Exclusion of Studies in the Review

Our population of interest was patients 13 years or older from a country with a very high human development index, with a diagnosis of BPD based on the DSM, versions IV or V [3], or the ICD-10 [4]. As interventions, we included commonly used drug classes for the treatment of BPD, such as anticonvulsive medications, antidepressants, antipsychotic medications, benzodiazepines, melatonin, opioid agonists or antagonists, and sedative or hypnotic medications with a treatment duration of at least 8 weeks. Overall, these drug classes included 87 different pharmacotherapies. Outcomes of interest included severity of BPD, improvement of symptoms associated with BPD (e.g., aggression, anger, self-harm), general psychiatric symptoms, functioning, and adverse events. Supplementary Table 2 provides a detailed presentation of inclusion and exclusion criteria (see ESM).

2.3 Literature Review, Data Abstraction, and Data Management

We used DistillerSR to screen the literature (DistillerSR, Evidence Partners, Ottawa, Canada). Two reviewers independently reviewed all titles, abstracts, and full-text articles. Discrepancies were resolved by consensus or by involving a third, senior reviewer. All studies identified as meeting inclusion criteria through the earlier APA search were screened again and included in our review if they met inclusion criteria. Supplementary Table 3 presents the list of studies excluded (with reasons) at the full-text level (see ESM).

For data extraction, we designed, pilot tested, and used a structured data form in DistillerSR to ensure consistency of data extraction. One reviewer extracted data, and a second team member verified the extracted study data for accuracy and completeness.

2.4 Assessment of Risk of Bias of Individual Studies

To assess the risk of bias of eligible studies, two independent reviewers used the Cochrane Risk of Bias tool 2.0 [33]. They rated risk of bias at an outcome level if methodological limitations affected different outcomes in a different way. We assigned a ‘high risk of bias’ rating to studies that had very serious limitations in design or conduct that might invalidate findings regarding all or individual outcomes. We resolved disagreements by discussion and consensus or by consulting a third member of the team.

2.5 Data Synthesis

In general, we summarized included studies in narrative form. We considered meta-analysis if studies were similar in population, interventions, comparators, and outcomes. For all analyses, we used random-effects models (restricted maximum likelihood estimation) to estimate pooled effects. To determine whether quantitative analyses were appropriate, we assessed the clinical and methodological heterogeneity of the studies under consideration following established guidance [34]. We assessed statistical heterogeneity in effects between studies by calculating the chi-squared statistic and the I² statistic (the proportion of variation in study estimates attributable to heterogeneity). We examined potential sources of heterogeneity using sensitivity analyses. When quantitative analyses were not appropriate (e.g., due to heterogeneity, insufficient numbers of similar studies, or insufficiency or variation in outcome reporting), we synthesized the data narratively. For statistical analyses, we used Stata, version 16.1 (Stata Corporation, College Station, Texas, USA).
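To illustrate the heterogeneity statistics described above, the following minimal Python sketch computes the chi-squared statistic (Cochran's Q), the I² statistic, and a random-effects pooled estimate. It is not the Stata code used for this review. The function name and the input values are hypothetical, and for brevity the sketch uses the closed-form DerSimonian-Laird estimator of between-study variance in place of restricted maximum likelihood.

```python
import math

def random_effects_summary(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Illustrative only: the review used REML in Stata; DerSimonian-Laird
    is swapped in here because it has a closed form.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # I2: percentage of variation attributable to heterogeneity (floored at 0)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (floored at 0)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "Q": q, "df": df, "I2": i2, "tau2": tau2}

# Hypothetical effect estimates and variances for three studies
result = random_effects_summary([0.0, 0.3, 0.8], [0.02, 0.03, 0.04])
print(result)
```

Under Q's null distribution (chi-squared with k − 1 degrees of freedom), a Q value well above its degrees of freedom suggests heterogeneity beyond chance, which the random-effects weights then absorb by adding the between-study variance to each study's own variance.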

2.6 Grading the Certainty of Evidence for Major Comparisons and Outcomes

We graded the certainty of evidence of relevant outcomes based on current GRADE (Grading of Recommendations Assessment, Development and Evaluation) guidance [35]. Developed to grade the overall certainty of a body of evidence, this approach incorporates five key domains: (1) risk of bias, (2) inconsistency, (3) indirectness, (4) imprecision of the evidence, and (5) reporting bias. It also considers other optional domains that may be relevant for some scenarios. These included plausible confounding that would decrease the observed effect and strength of association (i.e., magnitude of effect) or factors that would increase the strength of association (i.e., dose–response effect). Two reviewers assessed each domain for each selected outcome and resolved differences by consensus discussion. We documented all decisions regarding up- or down-grading the certainty of evidence to ensure transparency. We used GradePro (https://gradepro.org) to develop summary of findings tables.

2.7 Role of the Funding Source

This review was funded by APA. The APA Clinical Guidelines Committee assisted in the development of key questions, study inclusion criteria, and outcome measures of interest but was not involved in data collection, analysis, or manuscript preparation.

3 Results

Of 14,797 unique records, we included 21 randomized controlled trials (RCTs) [36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56]. We did not find any eligible observational studies. Figure 2 presents the literature search and selection.

Fig. 2 Literature search and selection. APA American Psychiatric Association

Nineteen RCTs compared pharmacotherapies with placebo [36,37,38,39,40,41,42, 44,45,46,47,48,49,50,51,52,53,54, 56] and two RCTs provided evidence on head-to-head comparisons of pharmacotherapies [43, 55]. Studies were limited to second-generation antipsychotics, anticonvulsants, and second-generation antidepressants, assessing nine individual drugs, out of 87 drugs of interest. We did not find any eligible evidence on benzodiazepines, melatonin, opioid agonists or antagonists, and sedative-hypnotic medications.

We rated two studies as low risk of bias [36, 37], seven studies as moderate risk of bias [38,39,40,41,42,43,44], and 12 as high risk of bias [45,46,47,48,49,50,51,52,53,54,55,56]. The main reason for ratings of high risk of bias was high attrition. Supplementary Fig. 1 provides a detailed presentation of risk of bias ratings (see ESM).

Overall, trials included data on 1768 participants. The majority were female (79%) and white (76% when reported). Study durations ranged from 8 to 52 weeks. Studies, in general, excluded patients with psychiatric comorbidities, such as schizophrenia, major depressive disorder, alcohol or substance use disorder, or bipolar disorder. Most studies classified enrolled participants as ‘moderately ill’, but few provided baseline assessments of participants’ functioning.

Table 2 summarizes the main characteristics of included RCTs. Supplementary Table 4 presents more detailed information on study characteristics and treatment effects; Supplementary Table 5 summarizes certainty-of-evidence ratings (see ESM).

Table 2 Characteristics and risk of bias ratings of included studies

In the following sections, we first present evidence on the general efficacy and risk of harms of treatments for BPD, followed by the comparative effectiveness and risk of harms. For each intervention, we summarize findings for four outcome domains when available: (i) severity of BPD, (ii) severity of symptoms associated with BPD, (iii) general psychopathology and functioning, and (iv) incidence of adverse events, serious adverse events, and withdrawal due to adverse events. Effect measures for efficacy and effectiveness presented in the following sections are mean changes from baseline on clinical assessment scales for each treatment group. Supplementary Table 6 summarizes characteristics of commonly used scales for the clinical assessment of patients with BPD (see ESM).

3.1 Second-Generation Antipsychotics versus Placebo

Nine double-blinded RCTs evaluated the efficacy of second-generation antipsychotics [38, 41, 44,45,46, 48, 50, 52, 56]. Overall, these studies provided data on 1124 participants. We rated two studies as moderate [38, 41] and seven as high risk of bias [44,45,46, 48, 50, 52, 56]. The majority of trials assessed olanzapine [44,45,46, 48, 52, 56]; single RCTs determined the efficacy of aripiprazole [38], quetiapine extended release (ER) [41], and ziprasidone [50]. Follow-up durations ranged from 8 weeks to 6 months. All trials, except one [38], were funded by the pharmaceutical industry.

3.1.1 Severity of Borderline Personality Disorder

Two multinational, high risk of bias RCTs reported mixed results regarding the efficacy of olanzapine to reduce the severity of BPD after 12 weeks of treatment [44, 48]. In the flexible-dose arm (n = 148) of a three-armed trial (N = 451), participants on olanzapine (5–10 mg/day) showed significantly greater improvements on the Zanarini Rating Scale for BPD than those treated with placebo (8.5 vs 6.8; p = 0.01) [44]. By contrast, the second trial (N = 314) [48] and the fixed-dose arm of the three-armed trial [44] found no significant differences between olanzapine (5–20 mg/day or 2.5 mg/day) and placebo on the Zanarini Rating Scale for BPD.

A fixed-dose trial assessing quetiapine extended release (ER) (N = 95), rated moderate risk of bias, reported significant improvements on the Zanarini Rating Scale for BPD for low-dose (150 mg/day) but not moderate-dose (300 mg/day) quetiapine ER compared with placebo after 8 weeks of treatment (p = 0.03; treatment effects not reported [NR]) [41].

We rated the certainty of evidence as low for no effect of assessed second-generation antipsychotics to reduce the severity of BPD.

3.1.2 Severity of Symptoms Associated with Borderline Personality Disorder

Included studies reported mixed results regarding improvements of depressive symptoms, anger, impulsiveness, aggression, and self-harm with second-generation antipsychotics. A random-effects meta-analysis with data on 497 participants numerically favored second-generation antipsychotics over placebo for the reduction of depressive symptoms, but the difference was not statistically significant (standardized mean difference 0.28, 95% confidence interval [CI] −0.05 to 0.60; Fig. 3) [38, 44, 46, 50, 52].

Fig. 3 Standardized mean differences of changes in depressive symptoms for second-generation antipsychotics versus placebo. CI confidence interval, N sample size, REML restricted maximum likelihood, SD standard deviation (Linehan, 2008 [52]; Nickel, 2006 [38]; Pascual, 2008 [50]; Soler, 2005 [46]; Zanarini, 2001 [45])
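Because the included trials measured depressive symptoms on different scales, the meta-analysis expresses each study's result as a standardized mean difference. The sketch below shows one common way to compute such a value for a single study from group summaries. The text does not state which variant the Stata analysis used, so Hedges' g with its small-sample correction is an assumption here, and the function name and all input numbers are hypothetical.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' g) with an approximate 95% CI."""
    # Pooled standard deviation across the two groups
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / sd_pooled             # Cohen's d
    j = 1.0 - 3.0 / (4.0 * (n_t + n_c) - 9.0)     # small-sample correction
    g = j * d
    # Large-sample approximation of the variance of g
    var_g = j ** 2 * ((n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c)))
    se = math.sqrt(var_g)
    return g, (g - 1.96 * se, g + 1.96 * se)

# Hypothetical mean improvement (change from baseline), SD, and group size
g, ci = hedges_g(mean_t=12.0, sd_t=4.0, n_t=50, mean_c=10.0, sd_c=4.0, n_c=50)
print(round(g, 3), tuple(round(x, 3) for x in ci))
```

A confidence interval that spans zero, as for the pooled estimate reported above, means the data are compatible with no difference between treatment and placebo.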

One study (N = 52), rated moderate risk of bias, reported significant improvements of anger for participants treated with aripiprazole (15 mg/day) compared with those on placebo (State-Trait Anger Expression Inventory: 13.6 vs 5.7; p < 0.001) [38]. Two RCTs (N = 95 and N = 60), one moderate risk of bias, the other high, detected no significant improvements in impulsiveness on the Barratt Impulsiveness Scale for quetiapine ER (150 and 300 mg/day) [41] or ziprasidone (40–200 mg/day) [50] compared with placebo. Likewise, two high risk of bias RCTs reported no improvements of self-harm with olanzapine (5–20 mg/day) [46, 52].

Regarding improvement of aggression, one moderate (N = 451) [44] and two high risk of bias RCTs (N = 40 and N = 24) [52, 56] reported no significant differences between olanzapine (2.5–20 mg/day) and placebo on the Modified Overt Aggression Scale. By contrast, an RCT (N = 95) rated moderate risk of bias detected significant improvements for quetiapine ER (150 and 300 mg/day) compared with placebo on the Modified Overt Aggression Scale (treatment effects NR; p = 0.01) [41].

We rated the certainty of evidence as low for a beneficial effect of aripiprazole to improve anger and for quetiapine ER to improve aggression. The certainty of evidence was low for no effect of other second-generation antipsychotics to improve depressive symptoms, impulsiveness, aggression, and self-harm.

3.1.3 General Psychopathology and Functioning

Six RCTs assessed effects of second-generation antipsychotics on global scales, such as the Symptom Checklist-90–Revised [38, 41, 44, 50, 56] or the Clinical Global Impression scale [38, 41, 44, 46, 50, 56]. Three moderate risk of bias RCTs with a total of 598 participants reported significantly greater improvements on the Symptom Checklist-90–Revised for participants treated with second-generation antipsychotics (aripiprazole 15 mg/day, olanzapine 5–10 mg/day, quetiapine ER 150 mg/day) compared with participants in the placebo groups [38, 41, 44]. Only one small trial (N = 52) reported differences in effect estimates on the Symptom Checklist-90–Revised (15.0 vs 4.9; p < 0.001) [38]. Three high risk of bias RCTs, two on olanzapine (2.5–20 mg/day; N = 40 and N = 60) [46, 56] and one on ziprasidone (40–200 mg/day; N = 60) [50], numerically favored second-generation antipsychotics, but the differences from placebo were not statistically significant.

Three trials, two moderate [41, 44] and one high risk of bias [56], with a total of 586 participants reported no significant differences in functional capacity comparing quetiapine ER or olanzapine with placebo.

We rated the certainty of evidence as moderate for a beneficial effect of assessed second-generation antipsychotics on general psychopathology and as moderate for no effect on functioning.

3.1.4 Incidence of Adverse Events, Serious Adverse Events, and Withdrawal Due to Adverse Events

The incidence of adverse events was generally higher in the groups that received second-generation antipsychotics compared with placebo groups [41, 44, 48, 50]. A random-effects meta-analysis showed a small but significantly higher risk of adverse events for participants treated with antipsychotics compared with placebo (67% vs 60%; risk ratio [RR] 1.10; 95% CI 1.00–1.21; Fig. 4). Common adverse events in participants treated with second-generation antipsychotics were dry mouth, constipation, dizziness, sedation, or weight gain.

Fig. 4 Random-effects meta-analysis of the incidence of adverse events comparing second-generation antipsychotics with placebo. CI confidence interval, REML restricted maximum likelihood (Black, 2014 [41]; Pascual, 2008 [50]; Schulz, 2008 [48]; Zanarini, 2011 [44])
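For readers less familiar with the risk ratio, the following sketch shows how a risk ratio and its 95% CI can be computed for a single two-arm study, with the interval built on the log scale. The function name and the event counts are hypothetical and do not come from any included trial. The pooled estimate reported above combines several such study-level estimates in a random-effects model.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio (treatment vs control) with a 95% CI built on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial samples
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, (lower, upper)

# Hypothetical counts: 67 of 100 treated vs 60 of 100 placebo participants
rr, ci = risk_ratio(67, 100, 60, 100)
print(round(rr, 2), tuple(round(x, 2) for x in ci))
```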

Withdrawals due to adverse events were numerically higher for participants on second-generation antipsychotics than placebo (9% vs 6%) [32, 44, 45, 48, 50, 52, 56]. A random-effects meta-analysis, however, did not show a statistically significant difference (Supplementary Fig. 2, see ESM).

The incidence of serious adverse events, when reported, was numerically lower for second-generation antipsychotics than placebo. Sample sizes, however, were too small to detect rare but serious adverse events reliably.

We rated the certainty of evidence as moderate for a higher risk of adverse events with second-generation antipsychotics. The certainty of evidence was low for similar risks for withdrawal due to adverse events.

3.2 Anticonvulsants Versus Placebo

Nine double-blinded RCTs evaluated the efficacy of three anticonvulsant medications (divalproex sodium, lamotrigine, topiramate) [36, 37, 39, 40, 42, 49, 51, 53, 54]. Overall, these studies provided data on 523 participants. We rated two studies as low [36, 37], three as moderate [39, 40, 42], and four as high risk of bias [49, 51, 53, 54]. Reasons for ratings of high risk of bias were lack of intention-to-treat analysis and high attrition. Follow-up durations ranged from 8 to 52 weeks. Four trials were funded by the pharmaceutical industry [49, 51, 53, 54]; the others reported no funding or were supported by government or university funding. Studies, in general, excluded patients with psychiatric comorbidities. An exception, however, was the trial by Frankenburg et al., which included participants with BPD and bipolar disorder [54].

3.2.1 Severity of Borderline Personality Disorder

The publicly funded LABILE (Lamotrigine and Borderline Personality Disorder: Investigating Long-Term Effects) trial (N = 276) [42], rated moderate risk of bias, and a small, high risk of bias, industry-funded RCT (N = 28) [49] reported no significant differences on the Zanarini Rating Scale for BPD between participants in the lamotrigine (200–400 mg/day) and the placebo groups after 12 weeks of treatment. The primary endpoint of the LABILE trial was at 52 weeks, which also yielded no significant difference between treatment groups [42].

Likewise, a small, high risk of bias RCT (N = 15) found no significant differences on the Borderline Evaluation of Severity Over Time scale between participants on divalproex sodium ER (dosage not reported) or placebo after 12 weeks of treatment [51].

We rated the certainty of evidence as moderate for no effect of lamotrigine and as very low for no effect of divalproex sodium to reduce the severity of BPD.

3.2.2 Severity of Symptoms Associated with Borderline Personality Disorder

The LABILE trial reported no significant differences in alcohol or other substance use, and self-harm between participants treated with lamotrigine or placebo [42]. The other trials reported mostly favorable findings regarding the efficacy of anticonvulsants to reduce anger, aggression, and affective lability. However, studies were small, with mostly high risk of bias, and chance findings are likely.

Four RCTs consistently reported significant reductions in anger (mostly measured on the State-Trait Anger Expression Inventory) for divalproex sodium (N = 30) [54], lamotrigine (N = 27) [36], and topiramate (N = 31 and 44) after 8–12 weeks of treatment [39, 40]. Of these four trials, only the one comparing divalproex sodium with placebo reported treatment effects using the subscale for anger and hostility of the Symptom Checklist-90-Revised (0.8 vs 0.6; p = 0.01) [54].

Likewise, divalproex sodium improved aggression in two small, high risk of bias trials (N = 30 and N = 16) [53, 54] after 10 and 24 weeks, but the difference reached significance in only one RCT (Modified Overt Aggression Scale: 3.0 vs 1.9; p = 0.03) [54].

A small RCT with 28 participants, rated high risk of bias, reported significantly greater reductions of affective lability for lamotrigine compared with placebo (Affective Lability Scale: 0.71 vs 0.4; p = 0.012) [49].

Another small, high risk of bias RCT (N = 15) reported no significant differences between participants on divalproex sodium ER or placebo to improve impulsiveness (Barratt Impulsiveness Scale) after 12 weeks of treatment [51].

We rated the certainty of evidence as low for divalproex sodium, lamotrigine, and topiramate to improve anger; as very low for divalproex sodium to reduce aggression; and as very low for lamotrigine to improve affective lability. The certainty of evidence was very low for no beneficial effect of divalproex sodium on impulsiveness.

3.2.3 General Psychopathology and Functioning

One RCT (N = 56), rated low risk of bias, assessed the efficacy of topiramate (titrated from 50 to 200 mg/day) in women with BPD [37]. After 10 weeks, participants in the topiramate group had significantly greater improvements on the Global Severity Index of the Symptom Checklist-90–Revised (7.4 vs 1.8; p < 0.001) [37].

The LABILE trial reported no significant difference between lamotrigine and placebo on the Social Functioning Questionnaire after 52 weeks [42]. Likewise, two small, high risk of bias RCTs (N = 16 and N = 15) reported no significant differences between divalproex sodium and placebo on the Symptom Checklist-90–Revised after 10 and 12 weeks of treatment [51, 53].

We rated the certainty of evidence as low for topiramate to improve general psychopathology, as moderate for no effect of lamotrigine, and as very low for no effect of divalproex sodium to improve social functioning.

3.2.4 Incidence of Adverse Events, Serious Adverse Events, and Withdrawal Due to Adverse Events

None of the trials assessing divalproex sodium or topiramate reported on the incidence of adverse events and serious adverse events. The incidence of adverse events and serious adverse events was similar between lamotrigine and placebo treatment groups [42, 49].

A meta-analysis of anticonvulsant medications as a class rendered no significant differences in withdrawals because of adverse events after 8–52 weeks of treatment (3% vs 5%; Supplementary Fig. 3, see ESM).

We rated the certainty of evidence as low for similar risks of adverse events and serious adverse events between lamotrigine and placebo, and as very low for similar risks for withdrawal due to adverse events.

3.3 Antidepressants Versus Placebo

One industry-funded, high risk of bias RCT (N = 25) assessed differences in efficacy between fluoxetine (20–40 mg/day) and placebo [47]. The study duration was 12 weeks. All trial participants were female and received individual dialectical behavioral therapy. The study did not assess changes in the severity of BPD or the incidence of adverse events. In addition, we located one unpublished RCT, which added fluoxetine (20–80 mg/day) or placebo to dialectical behavioral therapy or supportive psychotherapy for participants with BPD and suicidal behavior or self-mutilation [58] (N = 75). The study duration was 12 months. We did not formally include this study, because the methodological information provided on ClinicalTrials.gov was insufficient for risk of bias assessment.

3.3.1 Severity of Symptoms Associated with Borderline Personality Disorder

In the published RCT, treatment groups did not differ significantly in anger or aggression after 12 weeks [47]. For the unpublished trial, data on suicide attempts were available on ClinicalTrials.gov but without statistical analysis [58]. A chi-squared 2-by-k independence test revealed no statistically significant differences in suicide attempts between treatment groups with or without fluoxetine.
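The chi-squared 2-by-k independence test mentioned above can be sketched as follows. This is a minimal illustration, not the analysis performed for this review: the function names are hypothetical, and the counts are invented rather than taken from ClinicalTrials.gov. The upper-tail probability uses the closed-form expressions that exist for integer degrees of freedom, so no external library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2.0) / i
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: complementary error function plus a finite series
    term, total = 1.0, 0.0
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * total)

def chi2_independence(table):
    """Pearson chi-squared test of independence for a 2-by-k table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)

# Hypothetical counts: rows = attempt / no attempt, columns = k treatment groups
table = [[3, 4, 2, 5],
         [16, 15, 17, 13]]
stat, df, p = chi2_independence(table)
print(round(stat, 2), df, round(p, 3))
```

With expected cell counts this small, the chi-squared approximation can be unreliable, and an exact test may be preferable.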

We rated the certainty of evidence as very low for no effect of fluoxetine on anger and aggression.

3.3.2 General Psychopathology and Functioning

No differences in functioning could be detected between treatment groups after 12 weeks [47].

We rated the certainty of evidence as very low for no effect of fluoxetine on functioning.

3.4 Second-Generation Antipsychotics Versus Antidepressants

One industry-funded RCT (N = 45), rated moderate risk of bias, assessed differences in efficacy between olanzapine (2.5–7.5 mg/day), fluoxetine (10–30 mg/day), and a combination of fluoxetine and olanzapine [43]. The study duration was 8 weeks. All trial participants were female. The study did not report on the severity of BPD, on general psychiatric symptoms, or adverse events.

3.4.1 Severity of Symptoms Associated with Borderline Personality Disorder

After 8 weeks, participants treated with olanzapine or a combination of olanzapine and fluoxetine had significantly greater improvements in aggression (Modified Overt Aggression Scale: 19.7 vs 20.2 vs 15.4; p < 0.01) and depressive symptoms (Montgomery-Åsberg Depression Rating Scale: 13.6 vs 11.9 vs 8.2; p < 0.001 and p = 0.02) than participants treated with fluoxetine alone [43].

We rated the certainty of evidence as low for a greater effect of olanzapine or a combination of olanzapine and fluoxetine to reduce aggression and depressive symptoms than fluoxetine monotherapy.

3.4.2 Incidence of Adverse Events, Serious Adverse Events, and Withdrawal Due to Adverse Events

The study did not report data on the incidence of adverse or serious adverse events. Only two participants (one in the fluoxetine and one in the olanzapine plus fluoxetine group) withdrew because of adverse events.

We rated the certainty of evidence as very low for similar risks of withdrawals due to adverse events.

3.5 Second-Generation Antipsychotics versus Second-Generation Antipsychotics

One 12-week RCT (N = 51), rated high risk of bias, assessed differences in efficacy between asenapine (5–10 mg/day) and olanzapine (5–10 mg/day) [55]. The authors did not report any funding. The study did not report on general psychiatric symptoms and functioning.

3.5.1 Severity of Borderline Personality Disorder

After 12 weeks, there was no significant difference on the BPD Severity Index between the asenapine and olanzapine groups [55].

We rated the certainty of evidence as very low for similar effects of asenapine and olanzapine.

3.5.2 Severity of Symptoms Associated with Borderline Personality Disorder

After 12 weeks, there were no significant differences in aggression, impulsiveness, and self-harm between the asenapine and olanzapine groups [55].

We rated the certainty of evidence as very low for similar effects of asenapine and olanzapine.

3.5.3 Incidence of Adverse Events, Serious Adverse Events, and Withdrawal Due to Adverse Events

The incidence of adverse events and withdrawal because of adverse events were similar between treatment groups. The study did not report data on the incidence of serious adverse events [55].

We rated the certainty of evidence as very low for similar risks of adverse events and withdrawal due to adverse events between asenapine and olanzapine.

4 Discussion

Our study is the largest attempt to date to assess the general efficacy, comparative effectiveness, and risk of harms of pharmacotherapies for the treatment of patients with BPD. To our knowledge, no systematic review on this topic has been conducted over the past 10 years, apart from a focused update of an out-of-date Cochrane review [29, 30].

Overall, the available evidence indicates that the efficacy of pharmacotherapies for the treatment of BPD is limited. In clinical trials, second-generation antipsychotics, anticonvulsants, and antidepressants did not reduce the severity of BPD. Low- and very-low-certainty evidence indicates that anticonvulsants can improve anger, aggression, and affective lability; however, the evidence is mostly limited to single studies. Second-generation antipsychotics had little effect on the severity of specific symptoms that are commonly associated with BPD, but they improved general psychiatric symptoms. None of the pharmacotherapies had a positive effect on functioning. The evidence on comparative effectiveness and harms was limited to two small RCTs [43, 55]. Olanzapine appeared to be more effective than fluoxetine in improving aggression and depressive symptoms but did not differ significantly from asenapine. Given that most of these findings are based on evidence of low or very low certainty, they should be viewed cautiously.

Despite the limited evidence supporting the benefits of pharmacotherapies, clinical practice guidelines provide mixed recommendations [22]. Some professional societies cautiously recommend the off-label use of psychotropic agents as part of a multimodal approach [26, 27]. By contrast, the National Institute for Health and Clinical Excellence (NICE) in the United Kingdom and the Australian National Health and Medical Research Council [24, 25] recommend avoiding pharmacotherapies as first-line treatments except in acute crises.

This review and the underlying evidence base have several limitations. First and most importantly, our findings are characterized by a lack of evidence for most drugs and a scarcity of high-quality evidence for the remaining eligible medications. We found controlled studies for only nine out of 87 medications of interest. Olanzapine (total N = 881) and lamotrigine (total N = 304) had the largest evidence base. Available studies were often small and at high risk of bias. Only four of the 21 included RCTs enrolled more than 70 participants. In small studies, the risk for chance findings is high, particularly when investigators assess a large number of outcomes. Furthermore, most studies had only a short-term follow-up of 8–12 weeks. Whether longer treatment would lead to more beneficial effects remains unclear for most pharmacological interventions. As noted previously, the main reason for high risk-of-bias ratings was high attrition of the study population. Up to 68% of participants discontinued the trials. Although high attrition is typical for populations with difficult-to-treat psychiatric disorders such as BPD, it poses a serious methodological threat to the validity of results. Withdrawal of participants from a study is usually not random but is caused by underlying reasons that are linked to the course of the disease, a lack of efficacy of treatments, or the incidence of adverse events. Consequently, the certainty of evidence for most outcomes was low or very low, indicating that future studies might have a substantial impact on the estimates of effect. We were unable to find any controlled studies on benzodiazepines, melatonin, opioid agonists or antagonists, or sedative-hypnotic medications.

Second, trial populations were mostly female and white. Not a single trial enrolled adolescents with BPD or assessed differences in subgroups based on gender, age, race, or ethnicity. Studies usually excluded patients with Axis I comorbidities, such as mood, anxiety, or substance use disorders, which are common among patients with BPD. Therefore, we cannot gauge the generalizability of our findings to other populations. Third, we could not find any large, long-term observational studies that met our inclusion criteria. As RCTs have limitations when it comes to the assessment of rare but serious adverse events, it is conceivable that observational studies might uncover risks of harms that RCTs could not detect.

Methodological limitations of our systematic review include the restriction of eligibility to studies published in English and potential publication bias. Methods research [54] indicates that restricting eligibility to English-language publications can introduce language bias, although the impact on effect estimates and conclusions is generally small [55]. Publication bias and selective outcome reporting are potential limitations of any systematic review. Although we searched for unpublished literature, the extent and impact of publication and reporting bias in this body of evidence are impossible to ascertain. We also limited study populations to those diagnosed according to DSM-IV or later criteria to mitigate heterogeneity across studies. Consequently, we excluded some early trials from our systematic review.

5 Conclusions

Despite the common use of pharmacotherapies for patients with BPD, only low-quality evidence is available to guide clinicians. Overall, the efficacy of pharmacotherapies is limited to the improvement of individual symptoms rather than of BPD as a whole. Even for the improvement of symptoms, the certainty of evidence is low. Future research requires unbiased, adequately powered trials that take potential differences in subgroups into consideration and focus on patient-relevant health outcomes, such as social functioning or clinically important improvements of symptoms that matter most to patients with BPD.