1 Introduction

Ensuring service quality and patient safety in Emergency Departments (EDs) requires accurate capacity planning. Since emergency services are labor-intense processes (Sampson 2001), effectively managing ED capacity implies, first and foremost, developing well-devised staff rosters.

In general, solving a rostering problem requires creating a work schedule that: (1) allows meeting a time-dependent demand for service, (2) complies with regulatory constraints and work-place agreements, (3) attempts to satisfy individual staff constraints and preferences (Ernst et al. 2004a).

Dealing with rostering problems for ED staff is, in general, complex. It is even more so if the resource to be scheduled are physicians.

First, EDs operate 24/7 and working certain shifts is less desirable than others (e.g. weekend and night shifts). It creates the need to evenly distribute these undesirable shifts across staff members and to communicate largely in advance to the staff the “undesirable” shifts they are expected to cover (e.g. the weekends on duty) to allow them better organize their personal life.

Second, demand is highly variable and seasonal. Arrivals rates vary according to the season (fall/winter vs spring/summer months) with the day of the week (weekdays vs weekends) and with the hour of the day (morning, afternoon, night) (Visintin et al. 2019). This makes it complex to understand the number of people that should be on duty for each time-shift and, again, makes certain shifts less desirable than others (e.g. if arrival rates are higher in the afternoons than in the mornings, afternoon shifts will be perceived as more stressful).

Third, the labor force in ED is made of highly skilled professionals - nurses, radiologists, physicians—working in stressful conditions, under articulated work contracts. These contracts set strict constraints on the number of consecutive hours that can be worked, on the weekly and monthly workload, on the rest hours between shifts, etc. Physicians, in particular, have individual contractual agreements (contrary to nurses who generally work under collective contracts) setting additional constraints e.g. on the type of shifts they can cover (weekdays vs. weekend shifts, night vs. day shifts) or on the number of hours they are expected to work inside or outside ED.

Fourth, physicians (more than other staff-members) must coordinate their activities within the ED with other activities outside the ED (outpatient clinics, operating theatre, etc.). This results in the need for accommodating physicians’ requests to work or not to work certain shifts, which can be formulated on short notice. Also, given their prominence on the hospital hierarchy, physicians may express preferences (driven by work-related issues or personal needs), concerning the shift to cover and expect these preferences to be accommodated as much as possible.

Fifth, ED physicians’ overall capacity is usually scarce, making them bottleneck resources (Visintin et al. 2019).

In the literature, staff planning problems are generally subdivided in staff dimensioning and staff rostering problems (Ernst et al. 2004a). The former concerns the determination of the number and type of sta needed to meet the demand for each time-shift. The latter is based on the solution of the former and concerns the assignment of individual sta members to each shift.

In this study, we address the Emergency Department Physician Rostering Problem (EDPRP).

Hence, we assume that the number of physicians needed in each time-shift (morning, afternoon, pre-night and night) of a 6-month planning horizon is known, i.e., we assume that the staff dimensioning problem has already been solved. Thus, we address the problem of selecting the physicians to assign to each shift, with the objective of maximizing physicians’ satisfaction and perceived equity.

Specifically, we propose a two-phase approach to the EDPRP. To implement such an approach, we develop two Integer Linear Programming (ILP) models, one for each phase. The problems addressed in the two phases refer to different levels of decision-making, as such, they differ in terms of planning horizon, objective functions, and constraints. Yet, both of them are modeled as multicommodity ow problems and share the same network structure. In the first phase, we address the problem of determining the weekends that each physician will be on duty over a long term (6-months) planning horizon, with the objective of evenly distributing the worked weekends and the weekend night shifts across physicians. In the second phase, we address the problem of evenly distributing morning and afternoon shifts across physicians over a medium term planning horizon (1 month), while at the same time fixing upper bounds on the number of hours worked and on the number of night shifts covered, every month, by each physician. In addressing this problem, we take into account that weekend shifts have already been assigned in the first phase and treat them as constraints. The solution of the first phase is used as input for the second phase. The model in the second phase is iteratively run six times to cover the whole planning horizon. In both phases we assume that physicians can ask, in advance, to work or not to work certain shifts. These requests may be formulated in prescriptive terms (hard requests) or as desiderata (soft requests). In both models, hard requests are modeled as hard constraints and soft requests as soft constraints. The ultimate objective of this approach is to maximize shift equity while at the same time fulfilling all the hard requests and as many soft requests as possible. Shift equity and customization, in fact, are proven to drive employee satisfaction and motivation (Brunner et al. 2009; Stolletz and Brunner 2012; Gross et al. 2019) and, consequently, service quality.

The proposed two-phase approach carries several advantages: (1) physicians are informed, well in advance, about the weekends when they are supposed to be on duty; (2) it is possible to fulfill (hard and soft) physicians’ requests to work or not to work on certain shifts thereby helping them planning their activities outside the ED (e.g. the attendance of conferences and meetings); (3) it is possible to accommodate medium-term physicians’ requests, physiologically emerging every month, without incurring in schedule disruptions; (4) it is possible to obtain solutions that are well-balanced both in the long- and in the medium-term.

To handle the computational complexity arising in the first phase, when the planning horizon is longer and the resulting optimization problem is harder to solve, we have also developed a set of valid inequalities that allow strengthening the linear programming relaxation of the corresponding optimization model. These inequalities dramatically reduce the computational time, thereby allowing us to find feasible solutions also for instances that would not have otherwise been possible to solve within the time limit.

The presented approach has been tested on real data from a leading European Hospital and on benchmark instances from the literature. The solutions returned by the optimization models have been evaluated through six key performance indicators purposely defined. Results demonstrate that the presented models allow considering the complex nature of physicians rostering problems and obtaining well-balanced and thus equitable work schedules.

The manuscript is organized as follows: in Sect. 2, we provide an overview of the relevant literature and we identify the research gaps this study aims to fill-in; in Sect. 3, we present the optimization models; in Sect. 4, we present the numerical results of the application of the proposed approach in terms of computational performance (Sect. 4.1) and solution quality (Sect. 4.2). In Sect. 5, we discuss the application of the optimization models to benchmark instances, both in terms of computational performance and solution quality (Sect. 5.2). In Sect. 6, we draw some conclusions. Finally, two appendices conclude the work. Specifically, “Appendix A” provides a description of the KPIs used in Sect. 4, while “Appendix B” reports the complete descriptive statistics of the KPIs.

2 Literature overview

Rostering problems have been the object of a large number of contributions both in the scientific and practitioner-oriented literature (Van den Bergh et al. 2013). Applications contexts include call centers, transportation systems such as airlines and railways, emergency services such as police, ambulance and fire brigade, and, of course, emergency medical services (Ernst et al. 2004a, b).

In emergency medical services, staff rostering problems usually involve nurses and/or physicians. However, while nurse rostering has been extensively studied (Burke et al. 2004), physician rostering (Erhard et al. 2018) has received limited attention (Adams et al. 2019).

This is because physician rostering has some peculiarities that make it different and more complex than other personnel scheduling problems. As pointed out in the introduction, ED physicians are understaffed, highly trained, specialized, and thus difficult to replace (Erhard et al. 2018). They have personalized work contracts (Bruni and Detti 2014; Erhard 2021), they need to coordinate their activities across different departments within and outside the hospital (Fügener et al. 2015) and they tend to express, with short notice, preferences concerning the shift to work and not to work, and expect these preferences (Brunner et al. 2009) to be accommodated.

As for the objective to purse in defining the staff roster, the literature acknowledges that ensuring shift equity is crucial, as equity drives staff satisfaction and consequently service quality (Dewa et al. 2017). This is particularly true for burnout exposed personnel such as physicians (Adams et al. 2019; Stolletz and Brunner 2012; Zhong et al. 2017). Balancing personnel workload is one of the most commonly used approaches to guarantee equity (Cappanera and Scutellà 2011; Cappanera et al. 2014). Particularly demanding shifts such as night or weekend shifts should be fairly distributed among physicians and the same should hold for their desiderata. Unbalanced schedules are thus usually penalized via soft constraints as in Erhard et al. (2018) or upper bounds on the value of workload indicators are set, as is the case in Fügener et al. (2015) for the number of night shifts. Sometimes (Adams et al. 2019) workload balance is considered jointly with patient-related constraints such as those ensuring continuity of care to patients, i.e., the same set of physicians assigned to the same patient. Often, cyclic schedules are used to guarantee an even distribution of workload in the long run. A cyclic schedule consists of a fixed sequence of shifts which is rotated among a group of workers over a planning horizon. Thus, each worker rotates equally, through desirable and undesirable shifts and cyclic schedule is perceived as unbiased (Millar and Kiragu 1998). As an example, cyclic schedules are used in Ferrand et al. (2011) for ED physicians. However, cyclic schedules might prevent the satisfaction of individual preferences and constraints (Knust and Xie 2019) consequently, they are not suitable for ED physicians. Alternatively, the balancing of some performance indicator, in the long run, is pursued by subdividing the planning horizon into shorter time periods and dynamically adjusting the indicator in one period on the basis of the value the indicator assumed in the previous period. This is the case in Gross et al. (2019) for individual preference satisfaction, and in Gross (2018) for both workload balance and individual preference satisfaction in an anesthesiology department.

The literature thus suggests that successfully addressing ED physician rostering problems requires taking into account the heterogeneity of their work contracts, their preferences, as well as the fact that it is impossible to schedule their work shifts too in advance without incurring in schedule disruptions. Moreover, it suggests that the roster should ensure, first and foremost, shift equity.

To address the intrinsic complexity of the resulting rostering problems, many contributions from the recent literature address simplified versions of them, as it happens in two-phase approaches and relaxation-based approaches. In two-phase approaches, usually, shifts are organized into two groups and assigned separately in each phase. As an example, day-offs are assigned in the first phase, and then the resulting solution is used as the input of the second phase in which working shifts are assigned (e.g. Valouxis et al. 2012; Zhong et al. 2017). It is worth to point out, however, that in these studies the two-phase approach is considered as a mean to reduce the computational complexity of problems that would not otherwise be possible to solve in one shot, rather than, as in our study, as a tool to optimally address problems at a different level of decision-making and to manage balancing criteria with different scopes.

In relaxation-based approaches, instead, complicating constraints are relaxed and then introduced dynamically according to branch and cut algorithms. This is the case in Bard and Purnomo (2005) for nurse rostering, in Damcı-Kurt et al. (2019) for a physician rostering problem arising in an acute care hospital, in Brunner and Edenharter (2011) in an anesthesia department, and in Bruni and Detti (2014) for the medical guard services. Another stream of research proposes heuristic-based approaches: a heuristic based on assigning physicians to sets of tasks instead of to shifts (Gunawan and Lau 2013), a genetic algorithm for physician rostering in ED (Puente et al. 2009), metaheuristics (Wong et al. 2014), and matheuristics (Doi et al. 2018; Wickert et al. 2020) are not exhaustive examples of this type of contributions. Interestingly, (Van Huele and Vanhoucke 2014) addresses a combined emergency and surgery scheduling problem.

To the best of our knowledge, however, there is no contribution proposing optimization models able to solve both long and medium-term, real-life ED physician rostering problems while at the same time capturing all the features characterizing the EDPRP. This paper aims at filling in this gap. This paper contributes to the EDPRP literature in three ways:

First, it proposes a model able to solve both long and medium-term, real-life EDPRPs while at the same time capturing all the features characterizing them. The literature does propose physician rostering models rich in features but due to the resulting computational complexity these models are applied to short- to medium term, typically 1 month. On the contrary, studies considering longer planning horizons, propose simpler models not considering some of the features included in our models. Second, this paper proposes a novel set of KPIs to measure shift equity in EDPRP. These KPIs allow assessing the multifaceted nature of shift equity in ED, both in the long and medium term. Third, this study reports the result of the application of the proposed models to benchmark instances taken from the literature. To the best of our knowledge, there are no studies proposing models developed to address complex and real-life EDPPR, that have also been tested on publicly-available data relevant to settings different from the one that inspired the study.

3 The optimization models

Both the rostering problems characterizing the two-phase approach are modeled as multicommodity flow problems and share the same network structure, even though they have specific objectives and constraints.

In presenting the optimization models, we highlight their common structure. Specifically, they are both defined on a layered acyclic network which is a tool already used in the literature for rostering problems (Cappanera and Gallo 2004; Cappanera et al. 2020). In the layered network, layers correspond to days in the planning horizon, and nodes in each layer correspond to the set of shifts that must be covered on the corresponding day. The layered network is exemplified in Fig. 1 for a planning horizon of one week, from Monday to Sunday. In the example, there are 4 shifts on each day, three of them are work shifts (shift 1, shift 2 and shift 3) and the last one is a rest shift. Weekday shifts (light grey) are distinguished from weekend shifts (dark grey). Arcs in the network connect (1) the source node to all the shift nodes in the layer corresponding to the first day of the planning horizon (Monday in the example); (2) consecutive layers in the network, and (3) shift nodes in the layer corresponding to the last day of the planning horizon (Sunday in the example) to the destination node.

Fig. 1
figure 1

The layered graph for a one-week planning horizon

For each physician, the sequence of activities assigned to them in the planning horizon is defined by a path in the layered network from a source dummy node to a destination dummy node. This path, by graph construction, visits exactly one node (shift) in each layer (day) and thus allows to identify the unique shift—either a work shift or a rest shift, assigned to them on a specific day of the planning horizon. There are however specific situations in which a physician is allowed to work more than one shift per day: in that case, the graph is modified accordingly by adding arcs linking shift nodes within the same day.

In network terminology, each physician corresponds to a specific flow, namely a commodity. Thus, individual constraints characterizing a specific physician can be imposed on the corresponding flow. Flows of different physicians are separated the one from the other, but they all share the same network structure, thus making it possible to guarantee global constraints in addition to individual constraints.

Besides admitting a personalized network for each physician, the multicommodity structure allows us to manage implicitly compatibility constraints between physicians and shifts and between consecutive shifts. As an example, after a night shift a rest shift must follow, and consequently the unique arc leaving a node corresponding to a night shift enters a node corresponding to a rest shift in the next layer. Similarly, if two shifts cannot be assigned consecutively to a physician, in the corresponding network the arc between the two shifts is not inserted for any couple of consecutive days in the planning horizon. Finally, if a physician is not allowed to work a specific shift, in the corresponding graph there will not be any arc entering that node in any of the layers. Thus, the set of nodes and arcs in the graph depends on the physician.

The two models corresponding to the two phases share also two additional features: (1) the management of desiderata expressed by physicians, and (2) the management of undesirable sequences of shifts. Concerning desiderata, a physician may ask that a certain shift is assigned on a certain day (to work) or not assigned (not to work), and that their request is strict (managed via a hard constraint) or granted only if it does not deteriorate the optimal solution (managed via a soft constraint). As usual, a penalty is associated with each soft constraint and the objective function minimizes the number (or weight) of not granted preferences. Also, the models allow us to consider undesirable sequences of shifts. There are, in fact, pairs of shifts that are undesirable when worked one after the other. Even though their assignment to the same physician is feasible, it should be discouraged. As an example, the assignment of a rest shift after a night-call duty shift is not welcome by physicians. Then, the set of undesirable pairs of shifts is defined and the assignment of any couple of shifts in the set to the same physician is discouraged in the objective function through a penalty.

As an overview, Fig. 2 sketches the two-phase approach in terms of objective and constraints characterizing each phase.

The remaining part of the section is organized as follows. First, notation as well as common variables and constraints of the two optimization models are discussed (Sect. 3.1). Then, the two models relevant respectively to the first and the second phase of the approach are introduced separately (Sects. 3.2 and 3.3) highlighting their peculiarities in terms of objectives and constraints. These two models are referred to as base models. Finally, in Sect. 3.4, we describe a set of valid inequalities which can be used to strengthen the linear programming relaxations of the base models. The resulting models are referred to as enhanced models.

Fig. 2
figure 2

An overview of the two-phase approach

3.1 Notation and common constraints

Let us denote with:

$$\begin{aligned} \begin{array}{ll} H &{} \text {set of physicians} \\ \overline{H} \subseteq H&{} \text {subset of part-time physicians} \\ S &{} \text {set of shifts} \\ S_m \subseteq S &{} \text {subset of morning shifts} \\ S_a \subseteq S &{} \text {subset of afternoon shifts} \\ S_n \subseteq S &{} \text {subset of night shifts} \\ S_r \subseteq S &{} \text {subset of rest shifts}\\ S_s \subseteq S &{} \text {subset of night-call duty shifts} \\ U = \{(i,j) \text { s.t. } i,j \in S\} &{} \text {set of undesirable couples of consecutive shifts} \\ D= \{1, \ldots , |D|\} &{} \text {set of days to be considered - extended planning horizon consisting of }|D|\hbox { days} \\ L=D \cup \{0\}&{} \text {set of levels - except the last} \\ \overline{D} \subseteq D &{} \text {subset of days corresponding to the planning horizon (possibly different from }{D}\\ &{} \text {due to extension on the left and on the right)} \\ M \subseteq D &{} \text {set of days corresponding to Mondays}\\ W &{} \text {set of weekends in the planning horizon - each }w \in W\hbox { is a subset of }D \\ G^h=(N^h,A^h) &{} \text {graph relative to physician }h\hbox { with node set }N^h\hbox { and arc set }A^h \\ o^h \in N^h&{} \text {origin node for physician }h;\hbox { by default }o^h\hbox { belongs to level 0}\\ d^h \in N^h&{} \text {destination node for physician }h;\hbox { by default }d^h\hbox { belongs to level }|D|+1\\ \Delta _{v,t}^h=\{(i,l) \text { s.t. } i \in S, l \in D\} &{}\text {with }v \in \{0,1\}, t \in \{P,F\}\hbox { set of desiderata for physician }h, \hbox {expressed as couples} \\ &{}(i,l)\text { of shift-day which have to be avoided }(v=0)\hbox { or done }(v=1)\\ &{}\text {in a soft }(t=P)\hbox { or hard way }(t=F) \\ \overline{n}^h &{} \text {maximum monthly number of night shifts physician }h\hbox { can work} \\ \overline{w}^h &{} \text {maximum weekly workload for physician }h\hbox { - expressed in hours} \\ \overline{m}^h &{} \text {maximum monthly workload for physician }h\hbox { - expressed in hours} \\ \overline{s}^h &{} \text {monthly number of night-call duties for physician }h \\ \overline{b}^h &{}\text {monthly number of working hours for part-time physicians} \\ \underline{d} &{} \text {minimum number of days that must elapse between two night shifts} \\ c_{il} &{}\text {number of physicians required in day }l\hbox { on duty }i \\ w_i &{} \text {workload of shift } i \in S\\ \overline{M} &{} \text {big-M value}\\ \alpha _c \quad \quad \quad &{}\text {weight used in the objective functions to discriminate criterion }c. \end{array} \end{aligned}$$

Then, let us define the following families of variables in order to model the decisions:

$$\begin{aligned} x^h_{iljl+1} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ shift } i \text{ in } \text{ level } l \text{ and } \text{ shift } j \text{ in } \text{ level } l+1 \\ &{} \text{ are } \text{ assigned } \text{ consecutively } \text{ to } \text{ physician } h \\ 0 &{} \text{ otherwise } \end{array} \right. \end{aligned}$$

for \(h \in H\), \(l \in L\), \((i_l,j_{l+1}) \in A^h\), and

$$\begin{aligned} \epsilon _{il}^h = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ the } \text{ preference } \text{ expressed } \text{ by } \text{ physician } h \text{ for } \text{ shift } i \\ &{} \text{ on } \text{ day } l \text{ is } \text{ not } \text{ satisfied } \\ 0 &{} \text{ otherwise } \end{array} \right. \end{aligned}$$

for \(h \in H\), \(l \in \overline{D}\), \(i \in S\).

In the following, we first describe, in terms of variables and constraints the common features of the two rostering problems and then their distinguishing features. Using the variables and notation above, the constraints common to both the models are the following:

$$\begin{aligned}&\sum _{j \in S}x_{o^h0j1}^{h} = 1&\forall h \in H \end{aligned}$$
(1)
$$\begin{aligned}&\sum _{j \in S}x_{j|D|d^h |D|+1}^{h} = 1&\forall h \in H \end{aligned}$$
(2)
$$\begin{aligned}&\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} - \sum _{j \in S \cup d^h}x_{il jl+1}^{h}= 0&\forall h \in H, \forall l \in D, \forall i \in S \end{aligned}$$
(3)
$$\begin{aligned}&\sum _{h \in H}\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} = c_{il}&\forall l \in \overline{D}, \forall i \in S \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} = 0&\forall h \in H, \forall (i,l) \in \Delta _{0F}^{h}\end{aligned}$$
(5)
$$\begin{aligned}&\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} = 0 + \epsilon _{il}^h&\forall h \in H, \forall (i,l) \in \Delta _{0P}^{h}\end{aligned}$$
(6)
$$\begin{aligned}&\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} = 1&\forall h \in H, \forall (i,l) \in \Delta _{1F}^{h}\end{aligned}$$
(7)
$$\begin{aligned}&\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} = 1 - \epsilon _{il}^h&\forall h \in H, \forall (i,l) \in \Delta _{1P}^{h}\end{aligned}$$
(8)
$$\begin{aligned}&x_{iljl+1}^{h} \in \{ 0,1\}&\forall h \in H, \forall l \in L, \forall (i_l,j_{l+1}) \in A^h\end{aligned}$$
(9)
$$\begin{aligned}&\epsilon _{il}^h\in \{ 0,1\}&\forall h \in H, \forall v \in \{0,1\} \forall (i,l) \in \Delta ^h_{vP} \end{aligned}$$
(10)

The \(x^h_{iljl+1}\) variables define the schedule for each physician h: for each physician, flow conservation constraints (1)–(3) impose that a path is determined from the origin \(o^h\) to the destination \(d^h\) visiting exactly one node (one shift) in each layer (each day). Constraints (4) guarantee demand coverage, while constraints (5)–(8) manage desiderata as hard constraints—(5) and (7), or as soft constraints—(6) and (8). Finally, the rest of the constraints define variable domain (0-1 variables).

3.2 First-phase base model

The model in the first phase relies on a layered graph which has a layer for each day in the planning horizon but decisions are taken only for the days corresponding to the weekend days; without loss of generality, we assume that the rest shift is assigned to all of the physicians on each weekday; in fact, the decisions relative to the activities assigned on weekdays are postponed to the second phase. Coming back to the example, the graph used in the first phase is shown in Fig. 3, where a solution for a given physician is also shown with a thick black line. Specifically, according to the solution of the model, that particular physician is assigned shift 2 on Friday, a rest shift on Saturday, and shift 1 on Sunday.

Fig. 3
figure 3

First phase

In addition to variables x and \(\epsilon\) which are common to both rostering problems, the weekend shift management problem (first phase) makes use of the following peculiar variables:

  • Y the maximum number of weekends on duty assigned to a single physician

  • V the maximum number of night shifts assigned to a single physician

  • \(y^h_{w} = \left\{ \begin{array}{ll} 1 &{} \text{ if } \text{ physician } h \text{ is } \text{ on } \text{ duty } \text{ in } \text{ weekend } w \\ 0 &{} \text{ otherwise } \end{array} \right.\)    \(h \in H\), \(w \in W\).

The rostering problem addressed in the first phase is the following:

$$\begin{aligned} \min \quad&\alpha _y Y +\alpha _v V + \alpha _x\sum _{l=1}^{|D|-1} \sum _{(i,j) \in U} x_{iljl+1}^h +\alpha _\epsilon \sum _{h \in H}\sum _{v \in \{0,1\}}\sum _{(i,l) \in \Delta ^h_{vP}}\epsilon _{il}^h \end{aligned}$$
(11)
$$\begin{aligned}&(1)--(10) \nonumber \\&\sum _{l\in \overline{D}}\sum _{i\in S_n}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le V&\forall h \in H \end{aligned}$$
(12)
$$\begin{aligned}&\sum _{l\in w}\sum _{i\in S \setminus S_r}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le \overline{M} y^h_{w}&\forall h \in H, \forall w \in W\end{aligned}$$
(13)
$$\begin{aligned}&\sum _{w \in W} y^h_{w} \le Y&\forall h \in H \end{aligned}$$
(14)
$$\begin{aligned}&y^h_{w_i} + y^h_{w_{i+1}} \le 1&\forall h \in H, \forall i= 1, \ldots , |W|-1\end{aligned}$$
(15)
$$\begin{aligned}&\sum _{l\in w}\sum _{i\in S_n}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le 1&\forall w \in W, \forall h \in H \end{aligned}$$
(16)
$$\begin{aligned}&\sum _{i\in S}\sum _{j \in S}x_{iljl}^h \le 2&\forall w \in W, \forall l \in w, \forall h \in H\end{aligned}$$
(17)
$$\begin{aligned}&y^h_{w} \in \{0,1\}&\forall h \in H, \forall w \in W \end{aligned}$$
(18)

The balancing criteria used in this phase hierarchically minimize the maximum number Y of weekends on duty and the maximum number V of night shifts among the physicians. The other two terms in the objective function discourage respectively undesirable sequences of activities and unsatisfied desiderata. Variable \(y^h_{w}\) takes value one (see 13), if h is assigned a work shift in any day of weekend w, i.e., if h is on duty in weekend w. The model is characterized by the set of constraints (1)–(10) which are common to both the phases and by peculiar constraints explained in the following. Specifically, constraints (12) and (14) guarantee the correctness of the value assumed by variables Y and V, whereas constraints (15) assure that a physician cannot work two consecutive weekends. Constraints (16) limit the maximum number of night shifts to one in each weekend of the planning horizon, for each physician h. Constraints (17) assure that, for each physician h, on each day of each weekend in the planning horizon, at maximum two consecutive working shifts are allowed. In fact, during the weekend more than one shift can be assigned on the same day, whereas during the weekdays, the maximum number of working shifts that can be assigned to a physician is one. In the second phase this kind of constraint is guaranteed implicitly by graph construction. Constraints (18) define the \(y^h_{w}\)’s domain.

3.3 Second-phase base model

The model in the second phase relies on a layered graph which has a layer for each day in the planning horizon but decisions relative to the weekend days are fixed according to the solution provided by the model in the first phase. Again, coming back to the example, the graph used in the second phase is shown in Fig. 4 in which the solution of the first phase (shift 2 on Friday, a rest shift on Saturday and shift 1 on Sunday) is fixed. The model in the second phase takes decisions on Monday to Thursday and it provides the solution shown with a thick black line. Specifically, that particular physician is assigned a rest shift on Monday, shift 3 on Tuesday and Wednesday, and shift 1 on Thursday.

Fig. 4
figure 4

Second phase

In addition to variables x and \(\epsilon\) which are common to both rostering problems, the weekday shift management problem (second phase) makes use of the following peculiar variable:

  • Z the maximum difference (in absolute value) of the number of morning and afternoon shifts among the physicians.

The rostering problem addressed in the second phase is the following:

$$\begin{aligned} \min \quad&\alpha _z Z + \alpha _x\sum _{l=1}^{|D|-1} \sum _{(i,j) \in U} x_{iljl+1}^h +\alpha _\epsilon \sum _{h \in H}\sum _{v \in \{0,1\}}\sum _{(i,l) \in \Delta ^h_{vP}}\epsilon _{il}^h \end{aligned}$$
(19)

(1) –(10)

$$\begin{aligned}&(1)--(10) \nonumber \\&\sum _{l\in \overline{D}}\sum _{i\in S}\sum _{j\in S\cup d^h}w_i x_{iljl+1}^h \le \overline{m}^h&\forall h \in H \end{aligned}$$
(20)
$$\begin{aligned}&\sum _{l=d}^{\text {min}(d+6, |D|)}\sum _{i\in S}\sum _{j\in S\cup d^h}w_i x_{iljl+1}^h \le \overline{w}^h&\forall h \in H, \forall d \in M \end{aligned}$$
(21)
$$\begin{aligned}&\sum _{l\in \overline{D}}\sum _{i\in S_s}\sum _{j\in S\cup d^h}x_{iljl+1 }^{h} \le \overline{s}^h&\forall h \in H\end{aligned}$$
(22)
$$\begin{aligned}&\sum _{l\in \overline{D}}\sum _{i\in S}\sum _{j\in S\cup d^h}w_i x_{iljl+1}^h = \overline{b}^h&\forall h \in \overline{H}\end{aligned}$$
(23)
$$\begin{aligned}&\sum _{l\in \overline{D}}\sum _{i\in S_n}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le \overline{n}^h&\forall h \in H \end{aligned}$$
(24)
$$\begin{aligned}&\sum _{l=d}^{l=d+\underline{d}}\sum _{i\in S_n}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le 1&\forall h \in H, \forall d \in D\end{aligned}$$
(25)
$$\begin{aligned}&\sum _{l\in \overline{D}}\sum _{i\in S_a}\sum _{j\in S\cup d^h}x_{iljl+1}^h - \sum _{l\in \overline{D}}\sum _{i\in S_m}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le Z&\forall h \in H\end{aligned}$$
(26)
$$\begin{aligned}&- \sum _{l\in \overline{D}}\sum _{i\in S_a}\sum _{j\in S\cup d^h}x_{iljl+1}^h + \sum _{l\in \overline{D}}\sum _{i\in S_m}\sum _{j\in S\cup d^h}x_{iljl+1}^h \le Z&\forall h \in H \end{aligned}$$
(27)

The balancing criterion used in the second phase guides the search towards solutions in which each physician is assigned almost the same number of morning and afternoon shifts. Specifically, for each physician the (absolute) difference between morning and afternoon shifts assigned in the planning horizon is computed (see 26 and 27) and the objective minimizes the maximum of these differences among the physicians. This is the leading criterion in the second phase. Then, as in the first phase, the other two terms manage respectively undesirable sequences of activities and unsatisfied desiderata. In addition to the set of constraints common to the two phases (constraints 110), the model is characterized by constraints peculiar to the second phase. Specifically, the rest of the constraints guarantee, for each physician, respectively a correct value for the monthly and weekly workloads (see 20 and 21), for the number of night-call duty shifts (22), for the working time of part-time physicians (23), for the number of night shifts (24), and the correct alternation between night shifts and other shifts (25). Specifically, constraints (25) impose that for each time period of \(\underline{d}+1\) consecutive days in the planning horizon, at most one night shift can be assigned to a physician. The time period slides over the entire planning period.

3.4 The enhanced model

In this section, we describe the valid inequalities added to the base model in the first phase to tight its linear programming relaxation.

3.4.1 Consecutive weekends on duty

In the base model, constraints (15) guarantee that the same physician h cannot work two consecutive weekends. Here, we introduce an additional set of constraints which further link the variables x relative to working activities in the weekends to the variable y. Specifically, the following sets of constraints are added:

$$\begin{aligned}&x^h_{iljl+1} \le y^h_w&\forall i\in S\setminus S_r, \forall j\in S, \forall l \in w, \forall w \in W, \forall h \in H \end{aligned}$$
(28)
$$\begin{aligned}&y^h_w \le x^h_{iljl+1}&\forall i\in S_r, \forall j\in S_r, \forall l \in w+1 \text { s.t. } l+1 \in w+1, \forall w = 1, \ldots , |W|-1, \forall h \in H \end{aligned}$$
(29)

Constraints (28) ensure that if physician h is not on duty in weekend w, i.e., \(y^h_w=0\), then no activity i different from a rest shift can be assigned to them in any day of the weekend w. Then, constraints (29) impose that if physician h is on duty on weekend w, i.e., \(y^h_w=1\), then that physician is assigned to a rest shift on every day of the next weekend, i.e., \(w+1\). In these constraints, the constraint quantifier \(\forall l \in w+1 \text { s.t. } l+1 \in w+1,\) expresses the fact that both layer l and \(l+1\) must belong to weekend \(w+1\).

3.4.2 Linking constraints—weekends and nights on duty

In the first phase, as a consequence of constraints ruling consecutive weekends on duty (constraints 15) and the maximum number of night shifts on weekends (constraints 16), it is possible to add linking constraints between variables Y and V. For the reader’s convenience, we recall that they represent respectively the maximum number of weekends on duty assigned to a single physician (Y), and the maximum number of night shifts assigned to a single physician (V). Specifically, the new constraints are the following:

$$\begin{aligned}&Y \le \frac{|W|}{2} +1&\end{aligned}$$
(30)
$$\begin{aligned}&V \le Y&\end{aligned}$$
(31)

Trivially, constraint (30) states that the number of weekends on duty assigned to any physician cannot exceed the rounded-up half of the weekends in the planning period, while constraint (31) assures that for every physician the number of night shifts assigned in the weekends cannot exceed the maximum number of weekends worked in the planning period by physicians.

4 Numerical results

The numerical analysis presented hereafter aims at investigating: (1) the computational performance of the model(s) and (2) the quality of the model solutions. The investigated setting is the Emergency Department (ED) of a leading Italian Children’s hospital, that for confidentiality reasons will be referred to as Alpha. Alpha’s ED is organized in 10 shifts per day (see Table 1). Some of these shifts (MO, AF, ENI, NI, NC) refer to the main ED while the others (MO_O, MO_T, AF_T, AF_O, NC_T) refer to external units (observational and trauma units) closely linked with the ED. These external units are assigned dedicated physicians and nurses during the day, while at night they still have dedicated nurses but share physicians with the main ED.

Table 1 Shift duration and coverage—length is expressed in hours (h)

At Alpha, not all physicians can cover all shifts. The physicians who can cover all shifts are referred to as flexible (flex) and represent approximately \(20\%\) of the staff (5 out of 27).

Not-flexible physicians may have one or more restrictions concerning the shifts they can or cannot cover, the day when they can work (weekdays and/or weekends), and the total amount of hours they can be on duty. While the majority of the not-flexible physicians are expected to work the same amount of time as the flexible one, there are (two) part-time physicians working daytime shifts only and 6 hours per week.

Each shift is characterized by a length and coverage. The coverage (i.e., parameter \(c_{il}\) defined in Sect. 3.1) represents the number of physicians that must be on duty for each shift. In general, for any given shift, the coverage can change between weekends and weekdays. The weekend shifts include the Friday afternoon shift.

As pointed out in the literature review, such heterogeneity is shared by many EDs, as such, we decided to stick to a realistic instance in our analysis instead of creating stylized and easier to manage ones.

The numerical analysis has been performed on a PC equipped with an Intel i7 3.43 GHz processor and 32 Gb of RAM. The models have been coded using Python-Pyomo and solved using the IBM ILOG Cplex 12.6 solver.

4.1 Computational performance evaluation

In this section, we compare the base and enhanced version of the models in terms of computational performances. Specifically, we consider the time required to solve the model (CompTime) and the optimality gap (Gap) associated with the models’ solution, both in the first and the second optimization phases.

Also, we assess how these performances are influenced by (1) the length of the planning horizon, (2) the number and type of shifts that are preassigned to physicians.

4.1.1 Impact of the length of the planning horizon on the computational performance

Table 2 shows the computational time (Time) and optimality gap (Gap) associated with the solution found in correspondence of different planning horizons (1, 3, 6 months). As can be noticed, the enhanced model outperforms the base one as the time horizon increases. With a realistic planning horizon (6 months) the enhanced model is still able to find a decent solution for the first phase (Gap = 8%) within the fixed time limit of 6 hours, while the base one does not find any solution. The computational complexity of the model in the second phase is very limited. The computational time (which is the sum of the times required to solve all the monthly schedules—1, 3, 6 depending on the planning horizon) even in the worst case is negligible if compared with the first phase. It is worth observing that base and enhanced models are identical in the second phase, nonetheless, the computational time in this phase can differ across models as the solutions returned in the first phase are different.

Table 2 Computational performance evaluation: computational time and optimality gap (in the second phase, time is the sum of the times required in each month)—planning horizon length is expressed in months (m), time in seconds (s), gap in percentage (%)

4.1.2 Impact of preassigned shifts on the computational performance

As pointed out in Sect. 3, the optimization models allow considering the (realistic) situations where physicians ask, in advance, to work or not to work certain shifts. These requests may be formulated in prescriptive terms or as desiderata. The formers are referred to as hard requests and modeled using hard constraints, the latter’s are named soft requests and modeled using soft constraints. We randomlyFootnote 1 generated 13 different scenarios each corresponding to a different set of requests. Each scenario is characterized by the following quantities:

  • H1 hard requests to work on a given shift per physician, per month

  • H0 hard requests not to work on a given shift per physician per month

  • S1 soft requests to work on a given shift per physician per month

  • S0 soft requests not to work on a given shift per physician per month.

Table 3 reports for each scenario generated, the model used, the solver’s terminating condition, the optimality gap (Gap %) and the percentage of fulfilled requests (Fulfill%). The computational times and gaps refer to the first phase only (the second phase either does not exist when no feasible solutions are found in the first one, or its computational complexity is negligible compared to the one of the first phase). For all instances, the time limit was set to six hours and the planning horizon to 6 months for the first phase and 1 month for the second one. Scenario 0 represents the situation where no shift is preassigned (same as in Table 2). The other 12 scenarios are paired and within each pair (1–2, 3–4, 5–6, 7–8, 9–10, 11–12) the number of preassigned shifts (H1, H0, S1, S0) is the same but the shifts being preassigned change. H1, H0 range from 0 to 1 while S1, S0 from 0 to 2. For each scenario, we run both the base and the enhanced model.

Table 3 Computational performance evaluation: sensitivity to preassigned shifts

Compared with Scenario 0 from Table 3, emerges that the presence of soft constraints only (Scenarios from 1 to 4) makes the problem more complex to solve for the enhanced model and impossible to solve within the time limit for the base one. For the enhanced model the optimality gaps of scenarios 1–4 (\(\ge 16.94\%\)) are larger than the one of the scenario 0 (approx. \(8\%\)), and the solutions fulfill, approximately, half of the physicians’ soft requests.

The presence of hard requests only (scenarios 5 to 10), instead, makes the model easier to solve, especially when \(H1>0\) (scenario 5–6, 9–10). In fact, this type of requests assign a specific shift to a specific physician, and the corresponding layered graph consequently shrinks. When \(H1>0\), the gaps reduce significantly for the enhanced model and the base model finds feasible solutions for certain scenarios (5, 6 and 10). When both soft and hard constraints are present (scenario 11 and 12) the negative effect on the computational time of the soft constraints may (scenario 11) or may not (scenario 12) be compensated by the positive effect of the hard (H1) ones.

Obviously, the percentage of hard requests accommodated is \(100\%\) for all the feasible solutions. In sum, the analysis of the computational performance of the models demonstrates that the enhanced one outperforms the base one and that this latter model, for realistic instances, characterized by long planning horizon and soft constraints, it is not able to find feasible solutions within reasonable time limits. For these reasons the analysis presented in the next section refers to the enhanced model only.

4.2 Analysis of solution quality

In this section, we assess the quality of the solution returned by the (enhanced) model. To do so we use six Key Performance Indicators (KPIs) which depend on the physician (h): Weekend Ratio (WER\(^h\)), Weekend Nights Ratio (WNR\(^h\)), Nights Ratio (NR\(^h\)), Average Monthly Morning-Afternoon Imbalance Ratio (AMMAIR\(^h\)), Morning-Afternoon Imbalance Ratio (MAIR\(^h\)), and Workload Ratio (WLR\(^h\)). These KPIs are introduced here in an informal way and formally described in “Appendix A”.

  • WER\(^h\) = weekends on duty/max number of weekends workable

  • WNR\(^h\) = nights on duty on the weekends/max number of nights workable on the weekends

  • AMMAIR\(^h\) = \(\frac{1}{6} \sum _{m=1}^{6} \frac{\left| \text {MS in month m -- AS in month m}\right| }{\text {(MS in month m + AS in month m)}}\), where MS and AS indicate respectively morning shifts on duty and afternoon shifts on duty

  • NR\(^h\) = nights on duty in the planning period/max number of nights workable

  • MAIR\(^h\) = \(\frac{\left| \text {total MS in the planning horizon -- total AS in the planning horizon} \right| }{ \text {(total MS in the planning horizon + total AS in the planning horizon)}}\)

  • WLR\(^h\) = hours on duty in the planning horizon/ max number of hours workable.

The maximum number of weekends, nights and hours workable (included in the WER\(^h\), WNR\(^h\), NR\(^h\) and WLR\(^h\) formulas), are calculated taking into account: (1) the individual employment contract of each physician (for example some physicians, by contract, work a limited number of weekdays, hours or shifts) and (2) the general rules that apply to all of them (e.g. the obligation to rest at least three nights between two consecutive night shifts, etc., see constraints in Sect. 3).

All the KPIs are normalized to adjust values measured on different scales to a common scale ranging from 0 to 1. The normalization is needed to compare performance across physicians with different employment contracts and consequently heterogeneous availabilities. In general, the closer the KPI to 0 the more desirable the solution for the physician. However, as the main target of our approach is to produce equitable solutions, the quality of the solution returned by the models is assessed looking at the variability of the KPIs across physicians. The lower such a variability the better the solution.

The first three KPIs (WER\(^h\), WNR\(^h\), AMMAIR\(^h\)) are directly linked with the model objective function(s) (11, 19). In the first phase, the model objective function aims at hierarchically balancing the number of worked weekends and the number of nights worked on the weekends, across physicians, over a 6-month planning horizon. The extent of this balancing can be assessed using WER\(^h\) and WNR\(^h\). In the second phase, instead, the model’s objective function aims at balancing the morning and afternoon shifts worked by each operator every month. Such balancing can be assessed by looking at the AMMAIR\(^h\). This latter KPI represents the average, calculated over 6 months, of the monthly values of MAIR\(^h\).

The last three KPIs (NR\(^h\), WLR\(^h\), and MAIR\(^h\)) are not directly linked with the model objective function. Yet, these KPIs are very useful to assess the overall desirability of the solutions returned by the model. NR\(^h\) measures night shift balancing across physicians over the entire planning period. Contrary to WNR\(^h\), NR\(^h\) does not consider only the weekend nights but all the nights in the planning horizon. WLR\(^h\), instead allows measuring the overall workload balancing across physicians over the entire planning period. Finally, MAIR\(^h\) allows measuring the overall morning-afternoon shift imbalance. Contrary to AMMAIR\(^h\), MAIR\(^h\) does not measure how well morning and afternoon shifts are balanced on a monthly basis, but the overall balance over the entire planning period. It is worth observing that, for a given physician, if AMMAIR\(^h\) is low then MAIR\(^h\) is low too, but not the other way around. For example, a physician working 10 mornings and 0 afternoons for each the first 3 months of the planning horizon and 0 mornings and 10 afternoons for each of the following 3 months, will have an AMMAIR\(^h\) equal to 1 and a MAIR\(^h\) equal to 0.

In the remainder of this section, the variability of the KPIs is assessed looking at the boxplot of the KPIs. The boxplots represent, for each scenario (x-axis), the distribution of the KPIs across physicians. The tables with all the descriptive statistics for each KPI and scenario are reported in “Appendix B”.

4.2.1 Weekend balancing

Figure 5 shows the boxplots relevant to WER\(^h\). Each boxplot refers to one of the scenarios reported in Table 3. As can be noticed, for most of the scenarios the interquartile range (IQR) is zero—the boxplots collapse into a horizontal segment representing the median value (ME) as well as the first (Q1) and third (Q3) quartile—and for all of them it is smaller than 0.05. This means that \(50\%\) of the physicians work approximately the same number of weekends. From the boxplots, we can notice that outliers (the dots in the boxplots) are a limited number and are spread over a limited region. In the worst-case scenario, i.e., scenario 7, the overall range of variation (max(WER\(^h\))-min(WER\(^h\))) is equal to \(29\%\) meaning that the operator working the highest number of weekends, in relative terms, works \(29\%\) weekends more than the one working the smallest number. In absolute terms, this difference translates in approximately 5 weekends in 6 months, which is definitely acceptable.

The number of physicians included in the calculation of WER\(^h\) is 26 instead of 27. This is because one physician, by contract, does not work on weekends.

Fig. 5
figure 5

Boxplots of WER

4.2.2 Night balancing

To assess whether the model succeeds in balancing the (relative number of) nights worked by each physician it is necessary to look at the WNR\(^h\) and NR\(^h\) KPIs. Looking at Fig. 6 we can notice that, when no shift is preassigned (Scenario 0) the number of nights worked in the weekends (WNR\(^h\), black boxplots) is very well balanced across physicians, the IQR is zero and the outliers are very close (and smaller than) the median value. In case of hard or soft requests, such a balancing may (scenarios 1, 2, 3, 4, 7) or may not (scenario 10) significantly worsen. This is because the scenarios where the variability of WNR\(^h\) is higher (e.g. scenario 1, 2, 3, 4, 7) are those characterized by the largest optimality gap (see Table 3). Since the balancing of the worked weekend nights is an objective that is hierarchically subordinated to the balancing of the worked weekends, it comes at no surprise that solutions that are not optimal are characterized by a large value of WNR\(^h\) (even when the variability of WER\(^h\) is low).

Fig. 6
figure 6

Boxplot of WNR and NR

However, if we consider the nights worked on the whole planning horizon (NR\(^h\), grey boxplots), we can notice that the overall balancing is indeed good. In fact, the IQR, as well as the overall range of variation (Range) of NR\(^h\), is limited for all the scenarios (\(IQR\le 0.10\) and Range \(\le 0.26\) across scenarios). This is because of the constraint that bounds the maximum number of nights that physicians can work (24). Such a constraint prevents physicians that worked a high number of nights in the weekends (first phase) to be assigned a high number of night shift on weekdays.

4.2.3 Morning-afternoon balancing

To assess whether physicians are assigned with a balanced number of morning and afternoon shifts, it is necessary to look at the AMMAIR\(^h\) and MAIR\(^h\). The lower the variability of these KPIs the smaller the (relative and absolute) difference between morning and afternoon shifts worked across physicians.

Looking at AMMAIR\(^h\) (black boxplots in Fig. 7 and Table 10 in “Appendix B”) we can notice that for the scenario with no preassigned shifts (scenario 0) the variability of AMMAIR\(^h\) is very limited (IQR \(<0.06\), SD \(<0.5\)) and the median and mean values are very low (\(M=0.13\), \(Q2=0.09\)). It implies that all the physicians, every month, are assigned with a very well balanced number of morning and afternoon shifts. For all other scenarios, this imbalance is quite limited and equally distributed, except for some (one or two) physicians whose AMMAIR\(^h\), in certain scenarios (scenarios 1, 2, 3, 4, 11, 12) is equal to 1. These outliers are representative of a total imbalance, i.e., they refer to physicians, that, each month, work either morning or afternoon shifts. A closer look at the data, however, reveals that the physicians corresponding to the boxplots’ outliers work less than 2 shifts per month (across scenarios). This implies that a very limited absolute imbalance between the number of afternoon and morning shifts scheduled (e.g. 1 or 2 shifts per month) translates into a large value of AMMAIR\(^h\) (e.g. from 0.5 up to 1), even if this imbalance does not significantly affect the value of the objective function. If we look at MAIR\(^h\) (grey boxplots in Figure 7 and Table 11 in “Appendix B”), in fact, we can notice that, when considering the total number of afternoon and morning shifts worked in the planning horizon, regardless to the considered scenario, the median value of the imbalance significantly decreases, and the variability across scenarios decreases as well. This means that the physicians working one or two shifts per month, alternate the type of shift worked from 1 month to another (e.g. they may work two morning shifts 1 month, and two afternoon shifts the following one).

Fig. 7
figure 7

Boxplots of AMMAIR and MAIR

Hence, if we consider the whole planning horizon (i.e. MAIR\(^h\)) the solution returned by the model is both desirable (M \(<0.15\), ME \(<0.04\) across scenarios) and equitable across physicians (IQR \(<0.10\), SD \(<0.9\) across scenarios).

4.2.4 Overall workload balancing

To assess whether the overall workload is well balanced across physicians, we can look at the boxplots of WLR in Fig. 8 (and Table 12 in “Appendix B”).

Fig. 8
figure 8

Boxplots of WLR

From the boxplots emerges that the model obtains decently balanced solutions (IQR ranges between 0.08 and 0.13 across scenarios). Half of the physicians work between 80 and 94% of the total time they could work by contract. Yet, the overall range of variation for this KPI is quite large (Range \(> 0.37\) across scenarios) with values that are well below the median value. This is due to two facts. First and foremost, the workload balancing is not included in the model’s objective function, therefore it is a (desirable) side-effect of the balancing of weekends, nights, and morning-afternoon shifts. Secondly, the physicians considered in the calculation of this KPI (the whole population composed by 27 physicians) are heterogeneous in terms of number and types of shifts and hours they can work.

In fact, if we split the population between flexible (black boxplots in Fig. 9 and Table 13 in “Appendix B”) and not-flexible physicians (grey boxplots in Fig. 9 and Table 14 13 in “Appendix B”) we can notice that the workload balancing is excellent across scenarios for the flexible physicians. All of them work between 93% and 98% of their available time (across scenarios).

Fig. 9
figure 9

Boxplots of WLR for flexible (WLR_f) and not-flexible (WLR_nf) pysicians

While flexible physicians can cover all shifts, the not-flexible ones can cover only a limited subset. Leveling the utilization of these latter physicians is therefore harder. Compared with the flexible ones, not-flexible operators are characterized by a smaller median value of WLR\(^h\) and by a significantly higher variability of this KPI. A closer look at the data reveals that the outliers in all the (grey) boxplots correspond, across scenarios, to the same physicians. These physicians do not work on weekends or work at nights or on Sundays only.

5 Application of the models to benchmark instances from the literature

To generalize our findings and to demonstrate the applicability of our model in settings other than the Alpha hospital, in this section we report the computational results obtained on a set of benchmark instances from the literature (Wickert et al. 2020). Specifically, in Sect. 5.1 we describe the structure of the instances highlighting similarities and differences with the Alpha ones, while in Sect. 5.2 we report and discuss the computational results in terms of computational efficiency and solution quality.

5.1 Setting description

Instances in (Wickert et al. 2020) have been generated according to the general and realistic structure defined in the International Nurse Rostering competition (Ceschia et al. 2019). We briefly report their main features: (1) 4-week planning horizon; (2) 5 weekdays (from Monday to Friday) and 2 weekend days (Saturday and Sunday); (3) 3 skills each corresponding to a different unit (Inpatient Units); (4) 3 types of shifts (Early, Late and Night shift); (5) 2 types of contracts (Full time or Half time each characterized by a minimum and maximum number of workable shifts). Shifts in our models correspond to all the combinations of skills and shift types above mentioned for a total of 9 work shifts. According to the notation used in Wickert et al. (2020), we consider the hard (H) and soft (S) constraints listed below. For each one, we report in parentheses the way we model them.

  • H1 A physician can be assigned to at most one shift per day during weekdays (graph construction and constraints (1, 2, 3)

  • H2 Minimum number of physicians per day/shift/location (new constraints 32)

  • H3 Maximum number of physicians per day/shift/location (new constraints 32)

  • H4 A physician must be assigned to both Early and Late shifts, or a Night shift, or have a day off on weekend days (graph construction)

  • H5 Invalid shift succession (graph construction)

  • H6 A physician can be unavailable for some shifts or days (replaced by soft constraints)

  • H7 On weekend days a physician can work both Early and Late shifts in the same day. When this happens, physicians must work the two shifts at the same unit (graph construction)

  • H8 A physician must be qualified to work at specific locations (graph construction)

  • S3 Undesired working day or shift (soft constraints)

  • S6 Minimum and maximum number of assignments over the planning horizon according to the working contract (soft constraints).

Constraints (4) in the models are replaced by constraints (32) to reflect H2 and H3 with \(\underline{c}_{il}\) and \(\overline{c}_{il}\) defining respectively the minimum and the maximum number of physicians per day/shift.

$$\begin{aligned}&\underline{c}_{il} \le \sum _{h \in H}\sum _{j \in S \cup o^h}x_{jl-1 il}^{h} \le \overline{c}_{il}&\forall l \in \overline{D}, \forall i \in S. \end{aligned}$$
(32)

The purpose of this section is not to compare our approach with the one proposed in Wickert et al. (2020). Rather, the purpose is to generalize the results discussed in Sects. 4.1 and 4.2 by extending the experimental tests. For this reason, on this new set of instances, we have retained both the two-phase approach and the respective objective functions and have not considered additional soft constraints from Wickert et al. (2020) with the exception of the soft constraint relative to workable number of shifts over the planning horizon.

Of the instances reported in Wickert et al. (2020) we have considered only those with 50 physicians (10 in total). This is because the authors show that for the larger instances (characterized, respectively, by 100 and 150 doctors) in their dataset Cplex was not able to find a solution in a reasonable amount of time.

The benchmark instances differ from the Alpha ones in terms of: (1) number and type of physicians; (2) the set of feasibility constraints; (3) number of (hard and soft) requests.

With respect to the number and type of physicians, the benchmark instances are characterized by 50 fully flexible physicians (i.e., all of them can work all shifts), while in the Alpha’s ones there are 27 physicians that can be either flexible or not-flexible. In both instance sets, there are full-time and part-time operators and the number of shifts to cover daily, is more or less the same (9 vs. 10). In the benchmark instances, operators are characterized by a minimum and maximum number of shifts to work monthly depending on their typology (Full vs Half time). However, guaranteeing these constraints might make it impossible to ensure adequate coverage for each shift. As a consequence, the constraints binding the number of shifts worked by each physician have been modeled as soft constraints. Specifically, for each physician, we define two variables, \(\theta ^h\) and \(\eta ^h\), representing respectively the number of shifts exceeding the maximum allowed number of monthly assignments and the slack from the minimum. Then, the sum over all the physicians of these surplus and slack variables is minimized in the objective function of the second phase with a weight smaller than the ones used for the other criteria.

With respect to the feasibility constraints, benchmarks instances refer to a problem less constrained than the one addressed at Alpha: constraints (15, 16, 17, 24, 25) are not present here. In addition, the constraints limiting the maximum weekly and monthly workload considered in our model (constraints 20 and 21) are replaced by the above mentioned soft constraints binding the number of shifts worked monthly. Other types of constraints controlling the quality of the solution, such as those related to the consecutiveness of the same shit type, are managed via soft constraints in the benchmark instances.

With respect to the number of requests, the benchmark instances are characterized by a way larger number of hard and soft requests. In addition, these requests are all of type 0 (not to work). As for the hard ones, in the original input les about five physicians, every day ask not to do any of the nine shifts (equivalently they ask for rest), while the number of soft requests varies between 262 and 344. A description of the instances in terms of physician typology and requests is given in Table 4.

Table 4 Instances from the literature, characteristics. E, L, and N stand respectively for early, late and night shifts

Specifically, Table 4, in the Requests columns, reports the number of hard and soft requests expressed by physicians in the original instances and the number of constraints they generate in our instances. Specifically, every week, overall, physicians express 35 hard requests (\(140=35 \times 4\)) not to work any of the 9 shifts (column Any H0). These requests give rise to 140 \(\times\) 9 preassigned shifts in our models. In addition, columns Early S0, Late S0, and Night S0 report the number of soft requests in the original instances not to work a early, late or night shift respectively. Each of these requests, gives rise to 3 (one for each Inpatient Unit) soft requests in our model. The total number of requests to address in our models is given in column Tot. A preliminary analysis showed that some of the requests appeared in the inputs as both hard and soft; the total number of filtered requests to insert after deletion of duplicates is reported in column FTot. In addition, Table 4 reports, for each instance, also the mix of physicians in terms of Full and Half time staff. Preliminary computational results showed that the high number of hard requests, along with the stringent coverage constraints, compromised problem feasibility. For this reason, we transformed all hard-type requests to soft and a-posteriori observed the fulfillment rate (Fulfill%). Regarding the coverage, every working day the number of shifts to cover is between 5 and 6, while for the weekend days this number is between 4 and 5 (coefficients \(\underline{c}_{il}\) and \(\overline{c}_{il}\) in Constraints 32).

5.2 Results and discussion

As for the Alpha instances, the numerical analysis presented hereafter investigates: (1) the computational performance of the model(s) (Table 5) and (2) the quality of the model solutions (6). For the first phase, the model used is the basic one and for both phases the time limit is set to 2 h (7200 s).

Table 5 Instances from the literature, computational efficiency

Table 5 reports separately for each phase and for each instance, the value of the objective function components, the computational time expressed in seconds (Time) and the optimality gap (Gap%). For the objective function, in the first phase, the maximum number of weekends worked (max Y), the maximum number of nights worked on weekends (max V), the penalty due to non-satisfaction of requests, i.e., the number of unsatisfied requests (\(\sum \epsilon\)) are reported. In the second phase instead, the components of the objective function concern the maximum imbalance between early and late shifts (max Z), the number of unsatisfied requests (\(\sum \epsilon\)), the penalty resulting from not satisfying the maximum number of shifts allowed per month per physician, i.e., the total number of extra shifts (\(\sum \theta\)), and finally the penalty resulting from not satisfying the minimum number of shifts allowed per month per physician (\(\sum \eta\)). Looking at Table 5 it is possible to observe that: (1) the first phase is solved to optimality very efficiently (within 0.21 seconds in the worst case) and all the components of the objective function have, as desired, very small values; (2) the second phase is much more time consuming than the first one and in fact, the time limit is always reached. In addition, the satisfaction of cover constraints imposes that a consistent number of extra shifts is assigned (see column \(\sum \theta\)). The optimality gaps are not satisfactory in the second phase. However, observing the value of the main component of the objective function (max Z), we observe that early and late shifts are very well balanced among physicians. These observations seem to suggest that it is the bound, rather than the solution, to be of poor quality.

Table 6 Instances from the literature, solution quality

Table 6 reports for each of the 10 instances the percentage of fulfilled requests (Fulfill%) and 5 KPIs (average value M and standard deviation SD): WER and WNR which refer to the solution quality in the weekends, and WLR, NR, and MAIR describing the quality of the overall solution (first and second phase). For all the physicians the maximum number of weekends workable is set to 4. the maximum number of nights workable on the weekends is set to 8 (2 in each weekend). and the maximum number of nights workable is set to the planning horizon length (30). From Table 6 emerges that: (1) the number of satisfied requests is very high (in the worst case the not granted requests are only 3.16% of the total); (2) WER and WNR are quite low, well balanced across physicians (very low standard deviations) and stable across the 10 instances; (3) the high number of shifts to be covered implies a high workload (WLR greater than 1), but still well balanced across physicians; (4) covering constraints impose also a high number of night shifts per physician (about 10), however, differently from WLR, night shifts are not well balanced among physicians as the standard deviation clearly shows; (5) the balance between early and late shifts is of excellent quality. The only unsatisfactory KPI is therefore NR: in future research, it might be interesting to include in the second phase the constraints of night balancing currently used in the first phase or to provide other mechanisms for equitable distribution of these shifts.

6 Conclusions

In this study, we have proposed a two-phase approach to the EDPRP and developed two network-based optimization models that allow implementing the proposed approach in a real setting. The models have been tested both on real data from a leading European Hospital and on benchmark instances from the literature. Their effectiveness has been shown using six KPIs purposely defined. The results presented in Sect. 4, demonstrate that the proposed models allow for: (1) taking into consideration the plurality of features characterizing the physician rostering problem (heterogeneity of work contract, presence of hard and soft requests); (2) obtaining equitable schedules, by evenly balancing undesirable shifts across operators both in the long and in the medium term; (3) letting physicians know, well in advance, the weekends when they will be expected to be on duty, and (4) accommodating the physicians’ hard and (most of the) soft requests. This latter type of requests, however, makes the problem significantly more complex to solve. In fact, to find feasible solutions in a reasonable amount of time, we needed to introduce a set of valid inequalities thus strengthening the linear programming relaxations of the base models. Yet, in certain scenarios coming from the Alpha setting, the presence of soft constraints prevented us from finding optimal solutions within the fixed time limit. This means that an excessive number of this type of requests may hamper the quality of the solution even if they are not fulfilled.