1 Introduction

Data envelopment analysis (DEA), introduced by Charnes et al. (1978), is an approach for identifying best practices of peer decision making units (DMUs), in the presence of multiple inputs and outputs. This technique makes no assumptions on the production function and imposes no subjective weights on multiple inputs and outputs. DEA has gained considerable attention in various areas and obtained rich theoretical developments. Although DEA can evaluate the relative efficiency of a set of DMUs, it cannot identify the sources of inefficiency in the DMUs because conventional DEA models view DMUs as “black boxes” that consume a set of inputs to produce a set of outputs.

To identify the sources of inefficiency in the DMUs, Fare and Grosskopf (2000) firstly proposed the idea of network. Many papers on network DEA were published, mainly dealing with two-stage systems (e.g. Kao and Hwang 2008; Liang et al. 2008), parallel systems (e.g. Kao 2012; Gong et al. 2018), general multistage processes (e.g. Cook and Hababou 2001; Kao 2014), dynamic network processes (e.g. Tone and Tsutsui 2014; Zarei et al. 2018) and so on.

A parallel system is composed of a specific number of processes where each process performs a specific function or produces a specific product (Kao 2009). For example, a university has two major missions-teaching and research, which can be deemed as two parallel sub-systems—teaching department and research department. These two departments use certain full-time teachers and operating expenses to complete their respective functions, and they also use some shared inputs, such as teachers, library staff, laboratory staff and school welfare officer (Beasley 1995); In bank’s operations, sales department and service department could deemed as two parallel sub-systems which have their own inputs and also shared inputs, such as full time equivalent support staff and full time equivalent “other” staff (Cook and Hababou 2001).

Many studies are proposed to measure the efficiencies of parallel systems. The literatures on the parallel efficiency can be clarified into three categories: network approach (e.g. Fare and Grosskopf 2000; Castelli et al. 2004), multi-components approach (e.g. Cook et al. 2000; Jahanshahloo et al. 2004), and relational DEA approach (e.g. Kao 2009). The network approach assumes each sub-system works independently and each sub-system has its own set of inputs’ weights/outputs’ weights. The multi-components approach assumes the system efficiency is the weighted sum of process efficiencies and each process’s inputs’ weights/outputs’ weights are different. Kao (2009) found that the system efficiency is a weighted average of the process efficiencies by developing a relational DEA model. And the weight attached to each process is the proportion of the aggregate input consumed by each process in total aggregate input consumed by all processes.

In addition, the conventional DEA models require the accurate value of the inputs and outputs. However, observations are usually difficult to measure precisely and are often imprecise, for example, research quality can’t be expressed precisely (Beasley 1995; Kao and Lin 2011). Many methods have been proposed to deal with the imprecise data (e.g. Emrouznejad et al. 2014; Emrouznejad and Mustafa 2010). On the other hand, papers related to applications also emphasize these problems, e.g. Wanke et al. (2016).

To calculate the efficiency of parallel systems when the data is imprecise, Kao and Lin (2012) constructed a pair of two-level programming model to calculate the lower and upper bounds of the \( \alpha \)-cuts of the fuzzy system and process efficiencies based on Bellman and Zadeh (1970)’s extension principle. Lozano (2014) extends Kao and Lin’s (2012) method and considers the variability of the process efficiencies compatible with a given value of the system efficiency by proposing a new method of computing the process efficiencies. These two papers on fuzzy efficiency of parallel systems deem the processes are of equally important.

In reality, some processes in parallel systems are not of equally important, for example, there are usually two functional departments in colleges or universities–teaching and research, which uses general expenditure and equipment expenditure and teachers as inputs. In reality, this kind of parallel systems usually has the feature that one process dominates the other. For example, in research-oriented universities, teaching is always better than research and is in the leading position; and in teaching-oriented universities, research is always better than teaching and is in the leading position. Thus, in the efficiency evaluation of such systems, the leader and follower relationship should be considered (Liang et al. 2008). In the existing literature, some scholars have proposed to model this kind of leader and follower relationship by introducing Stackelberg game theory in the efficiency evaluation. For example, Liang et al. (2008) deemed the manufacturer as the leader and the retailer as a follower in a two-stage supply chain, and first introduced modeling two-stage network DMUs from the perspective of noncooperative perspective to model the leader–follower relationship in the efficiency evaluation. Zha and Liang (2010) extended the two-stage noncooperative approach to the case with shared inputs which can be freely allocated among different stages. Tavana and Khalili-Damghani (2014) proposed a new two-stage Stackelberg fuzzy data envelopment analysis model by considering the fuzzy data in the two-stage network production process. The objective of the paper is to calculate the system and process efficiency for parallel systems in the fuzzy environment based on Stackelberg game theory. Especially, the leader–follower relationship between the process efficiencies will be investigated. A case of 52 chemistry department in UK universities in Beasley (1995) is used to illustrate the approach.

This paper is organized as follows. In Sect. 2, calculation of the system and process efficiencies using the conventional parallel DEA model based on Stackelberg game theory is reviewed. Then, a new fuzzy parallel DEA method based on Stackelberg game theory is proposed. In Sect. 4, we present a case study of measuring the teaching and research efficiencies of 52 chemistry departments in UK universities to explain how the proposed model is applied to measure the fuzzy efficiencies of parallel systems with two components based on Stackelberg game theory. Finally, some conclusions are drawn.

2 Conventional parallel DEA model based on Stackelberg game theory

In this section, we briefly review the conventional parallel DEA model based on Stackelberg game theory. In reality, one sub-system in parallel system often dominates the other. In Fig. 1, suppose one sub-system is in the dominant position, and the other is in a subordinate position, then non-cooperative Game theory can be employed in the efficiency evaluation of the system.

Fig. 1
figure 1

The parallel system with two components

In order to model the network processes via concepts adopted from non-cooperative games, Liang et al. (2008) extends two-stage models using game theory concepts. Similarly, the parallel models can also be extended by using game theory concepts. Figure 1 shows the general parallel system. There are two parallel sub-systems. Sub-system 1 applies inputs \( x_{ij}^{1} \) to produce outputs \( y_{rj}^{1} \), and sub-system 2 applies inputs \( x_{ij}^{2} \) to produce outputs \( y_{rj}^{2} \). Denote \( x_{ij}^{\text s} = x_{ij}^{1} + x_{ij}^{2} \) and \( y_{rj}^{\text s} = y_{rj}^{1} + y_{rj}^{2} \) as the system inputs and outputs, respectively.

Based on the concept of non-cooperative game theory, suppose one sub-system–sub-system 1 in Fig. 1 is dominant, and the other sub-system 2 is subordinate. In such case, the dominant sub-system 1 should be considered firstly and its efficiency should be calculated at first, and the efficiency of sub-system 1 should serve as constraints when evaluating the efficiency of the whole system. By using the CRS input-oriented DEA model proposed by Charnes et al. (1978), the following model for calculating the efficiency of sub-system 1 is presented:

$$ \begin{aligned} & E_{k}^{1} = Max\begin{array}{*{20}c} {\frac{{\mathop \sum \nolimits_{r = 1}^{s} u_{r} y_{rk}^{1} }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} x_{ik}^{1} }}} \\ \end{array} \\ & s.t.\quad \frac{{\mathop \sum \nolimits_{r = 1}^{s} u_{r} y_{rj}^{1} }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} x_{ij}^{1} }} \le 1,\quad j = 1, \ldots ,n \\ & u_{r} \ge \varepsilon , \ldots r = 1, \ldots ,s\quad {\text{and}}\quad v_{i} \ge \varepsilon ,i = 1, \ldots ,m \\ \end{aligned} $$
(1)

Denote the optimal objective function value of Model (1) as \( E_{k}^{1} \), then the possible maximum efficiency score of sub-system 1 is \( E_{k}^{1} \). Model (1) is a fractional programming, but it can be converted into a linear model via the Charnes–Cooper (C–C) transformation.

Once we obtain the efficiency for the first stage, the whole system will maximize its efficiency that maintain \( E_{k}^{1} = E_{k}^{1*} \). in other words, the whole system maximize its efficiency subject to the restriction that the efficiency score of the first stage remains at \( E_{k}^{1*} \). The model for computing \( E_{k}^{O} \), the whole system’s efficiency (Liang et al. 2008), can be expressed as

$$ \begin{aligned} & E_{k}^{O} = Max\frac{{ {\mathop \sum \nolimits_{r = 1}^{s} u_{r} Y_{rk}^{s} } }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} X_{ik}^{s} }} \\ & s.t.\quad \frac{{\mathop \sum \nolimits_{r = 1}^{s} u_{r} y_{rj}^{1} }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} x_{ij}^{1} }} \le 1, j = 1, \ldots ,n \\ & \quad\quad \quad \frac{{\mathop \sum \nolimits_{r = 1}^{s} u_{r} y_{rj}^{2} }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} x_{ij}^{2} }} \le 1, j = 1, \ldots ,n \\ & \quad \quad \quad \frac{{\mathop \sum \nolimits_{r = 1}^{s} u_{r} y_{rk}^{1} }}{{\mathop \sum \nolimits_{i = 1}^{m} v_{i} x_{ik}^{1} }} = E_{k}^{1*} \\ & u_{r} \ge \varepsilon ,r = 1, \ldots ,s \& v_{i} \ge \varepsilon , i = 1, \ldots ,m \\ \end{aligned} $$
(2)

Note that in Model (2), the efficiency of sub-system 1 is set equal to \( E_{k}^{1*} \).Then Model (2) can be transformed to the following linear programming model.

$$ \begin{aligned} & E_{k}^{O} = Max {\mathop \sum \limits_{r = 1}^{s} u_{r} Y_{rk}^{s} } \\ & s.t. \ \, \quad{\mathop \sum \limits_{i = 1}^{m} v_{i} X_{ik}^{s} = 1} \\ & \quad \quad\quad \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0, j = 1, \ldots ,n \\ & \quad \quad\quad \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{2} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{2} \le 0, j = 1, \ldots ,n \\ & \quad \quad\quad \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} = E_{k}^{1*} \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ik}^{1} \\ & u_{r} \ge \varepsilon , r = 1 \ldots ,s \& v_{i} \ge \varepsilon , i = 1, \ldots ,m \\ \end{aligned} $$
(3)

The efficiency of the whole system can be obtained by solving Model (3). And, the weights attached to inputs and outputs can also be obtained. Based on these weights, the efficiency of sub-system 2 can be obtained as \( E_{k}^{2*} = \frac{{\sum\nolimits_{r = 1}^{s} {u_{r}^{*} y_{rk}^{2} } }}{{\sum\nolimits_{i = 1}^{m} {v_{i}^{*} x_{ik}^{2} } }} \).

In a similar manner, if we assume the second stage to be the leader, we first calculate the regular DEA efficiency (\( E_{k}^{2} \)) for that stage, using the appropriate CCR model. Then, one maximize the efficiency of the whole system, with the restriction that the second stage score, having already been determined, cannot be decreased from that value.

Although previous literature (Emrouznejad et al. 2014) has proposed many methods to address the problems related to fuzzy DEA (e.g. Emrouznejad and Mustafa 2010; Wanke et al. 2016), few papers address the problem about fuzzy efficiency for parallel systems. Kao and Lin (2012) and Lozano (2014) calculated the efficiency of parallel systems when the data is imprecise and they deem the parallel processes are of equally important. However, parallel processes may not be of equally important, for example, the parallel processes may have leader–follower relationship. To approach this kind of relationship in fuzzy parallel DEA models, we propose a fuzzy parallel DEA method based on Stackelberg game theory in next section.

3 Proposed fuzzy parallel DEA method based on Stackelberg game theory

The parallel process now is composed of fuzzy inputs and fuzzy outputs in each sub-system. In this paper, trapezoidal fuzzy numbers (TrFNs) are used. Let be the universe of discourse, \( X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{n} } \right\} \). A fuzzy set \( \tilde{A} \) of \( X \) is a set of order pairs \( \left\{ {x_{1} ,\mu_{{\tilde{A}}} \left( {x_{1} } \right),x_{2} ,\mu_{{\tilde{A}}} \left( {x_{2} } \right), \ldots ,x_{n} ,\mu_{{\tilde{A}}} \left( {x_{n} } \right)} \right\} \). Here, \( \mu_{{\tilde{A}}} :X \to \left[ {0,1} \right] \) is the membership function of \( \tilde{A} \), and \( \mu_{{\tilde{A}}} \left( {x_{\text{i}} } \right) \) represents the membership degree of \( x_{i} \) in \( \tilde{A} \). The following two definitions specify the fuzzy numbers:

Definition 3.1

(Zimmermann 1996) The \( \alpha \)-cut \( \tilde{A}_{\alpha } \) and strong \( \alpha \)-cut \( \tilde{A}_{\alpha + } \) of fuzzy set \( \tilde{A} \) in the universe of discourse \( X \) is defined by \( \tilde{A}_{\alpha } = \left\{ {x_{i} :\mu_{{\tilde{A}}} \left( {x_{i} } \right) \ge \alpha ,x_{i} \in X} \right\} \), where \( \alpha \in \left[ {0,1} \right] \) and \( \tilde{A}_{\alpha + } = \left\{ {x_{i} :\mu_{{\tilde{A}}} \left( {x_{i} } \right) > \alpha ,x_{i} \in X} \right\} \), where \( \alpha \in \left[ {0,1} \right] \), respectively.

Definition 3.2

(Zimmermann 1996) A trapezoidal fuzzy number (TrFN) can be defined as \( \tilde{x} = \left( {x^{1} ,x^{2} ,x^{3} ,x^{4} } \right) \), where the membership function \( \mu_{{\tilde{m}}} \) of \( \tilde{m} \) is given as follows:

$$ \mu \left( x \right) = \left\{ \begin{array}{ll} \frac{{x - x^{1} }}{{x^{2} - x^{1} }}&\left( {x^{1} \le x \le x^{2} } \right) \hfill \\ 1& \left( {x^{2} \le x \le x^{3} } \right) \hfill \\ \frac{{x^{4} - x}}{{x^{4} - x^{3} }}&\left( {x^{3} \le x \le x^{4} } \right) \hfill \\ \end{array} \right. $$

where the mode interval of \( \tilde{x} \) is \( \left[ {x_{2} ,x_{3} } \right] \), and the lower and upper limits of \( {\tilde{\text{x}}} \) are \( x^{1} \) and \( x^{2} \), respectively. There are many types of fuzzy numbers, for example, triangular and trapezoidal fuzzy numbers. In this paper, TrFNs are considered. Because the trapezoidal fuzzy numbers has the advantage of simplicity in both concept and computation. And they are most often used for characterizing imprecise or ambiguous information in practical applications.

3.1 Fuzzy inputs and outputs of the parallel system

In this study, the single-stage DEA model with fuzzy data (Emrouznejad and Mustafa 2010), the parallel DEA mode (Kao 2009), the game approach of DEA models for two-stage processes (Liang et al. 2008), and the parallel fuzzy DEA Model (Kao and Lin 2012) are extended to develop a new parallel fuzzy data envelopment analysis model for parallel systems with two components based on Stackelberg game theory. Consider the TrFNs in the left and right spread format as inputs and outputs of n DMUs with parallel sub-systems. Each \( DMU_{j} \left( {j = 1, \ldots ,n} \right) \) applies fuzzy inputs \( \tilde{x}_{ij}^{p} = \left( {x_{ij}^{{p_{1} }} ,x_{ij}^{{p_{2} }} ,x_{ij}^{{p_{3} }} ,x_{ij}^{{p_{4} }} } \right) \) (p = 1, 2) to produce outputs \( \tilde{y}_{rj}^{p} = \left( {y_{rj}^{{p_{1} }} ,y_{rj}^{{p_{2} }} ,y_{rj}^{{p_{3} }} ,y_{rj}^{{p_{4} }} } \right) \) in sub-system p.

Using an arbitrary \( \alpha \)-cut for the inputs, outputs, and the lower and upper-bounds of the membership functions are calculated as follows:

$$ \left( {x_{ij}^{p} } \right)_{{\alpha_{i} }}^{L} = x_{ij}^{{p_{1} }} + \alpha_{i} \left( {x_{ij}^{{p_{2} }} - x_{ij}^{{p_{1} }} } \right), \alpha_{i} \in \left[ {0,1} \right], i = 1, \ldots ,m;j = 1, \ldots ,n\quad $$
(4)
$$ \left( {x_{ij}^{p} } \right)_{{\alpha_{i} }}^{U} = x_{ij}^{{p_{4} }} - \alpha_{i} \left( {x_{ij}^{{p_{4} }} - x_{ij}^{{p_{3} }} } \right), \alpha_{i} \in \left[ {0,1} \right], i = 1, \ldots ,m; j = 1, \ldots ,n \ \ $$
(5)
$$ \left( {y_{rj}^{p} } \right)_{{\alpha_{r} }}^{L} = y_{rj}^{{p_{1} }} + \alpha_{r} \left( {y_{rj}^{{p_{2} }} - y_{rj}^{{p_{1} }} } \right), \alpha_{i} \in \left[ {0,1} \right], r = 1, \ldots ,s; j = 1, \ldots ,n \quad$$
(6)
$$ \left( {y_{rj}^{p} } \right)_{{\alpha_{r} }}^{U} = y_{rj}^{{p_{4} }} - \alpha_{r} \left( {y_{rj}^{{p_{4} }} - y_{rj}^{{p_{3} }} } \right), \alpha_{i} \in \left[ {0,1} \right], r = 1, \ldots ,s; j = 1, \ldots ,n. $$
(7)

3.2 Upper-bound of the efficiency values for sub-system 1

When stage 1 is a leader, the efficiency of stage 1 is optimized in priority. Model (9) is proposed to calculate the upper bound of the efficiency values for sub-system 1 for each DMU as follows:

$$ \left( {E_{k}^{1} } \right)_{\alpha }^{U} = \mathop {\hbox{max} }\limits_{{\begin{array}{*{20}c} {\left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{r}^{U} } \\ \end{array} }} \left[ {\begin{array}{*{20}c} {Max \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} } \\ {s.t. \mathop \sum \limits_{i = 1}^{m} \nu_{i} x_{ik}^{1} = 1} \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} \nu_{i} x_{ij}^{1} \le 0,j = 1, \ldots ,n} \\ {u_{r} \ge \varepsilon ,r = 1, \ldots ,s} \\ {\nu_{i} \ge \varepsilon ,i = 1, \ldots ,m} \\ \end{array} } \right] $$
(8)

For each set of \( x_{ij}^{p} \) and \( y_{rj}^{p} \) values defined by the respective \( \alpha \)-cut in the outer program (first level), the efficiency is calculated in the inner level (second level). These set of \( x_{ij}^{p} \) and \( y_{rj}^{p} \), which produce the largest efficiencies, are determined at the first level by Models (8). Because the feasible region for variables \( x_{ij}^{p} \) and \( y_{rj}^{p} \) is hyper-rectangle, which is a convex and compact set. Thus, \( \left( {E_{k}^{1} } \right)_{\alpha }^{U} \) is continuous with respect to \( x_{ij}^{p} \) and \( y_{rj}^{p} \). \( x_{ij}^{1} \) and \( y_{rj}^{1} \) are the decision variables for the outer-level optimization problem and considered as the constant multipliers for the inner-level optimization problem. Thus, the two-level optimization program (8) can’t be solved in its current form, and should be reduced to a one-level optimization program.

The inner and outer programs of Model (8) have the same direction of optimization, i.e., maximization; they can be reduced to the following one-level optimization program:

$$ \begin{aligned} &\!\left( {E_{k}^{1} } \right)_{\alpha }^{U}= Max {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} } \\ &s.t.\quad {\mathop \sum \limits_{i = 1}^{m} v_{i} x_{ik}^{1} = 1} \\ & \, \quad \quad \ \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0, j = 1, \ldots ,n \\ & \, \quad \ \quad \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} \\ & \,\kern1pt \quad \ \quad \!\left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{U} \\ &u_{r} \ge \varepsilon ,r = 1, \ldots ,s \& v_{i} \ge \varepsilon , i = 1, \ldots ,m \\ \end{aligned} $$
(9)

Model (9) is a non-linear mathematical programming model. Its global optimum is not easy to find. Moreover, Model (9) is dependent on the \( \alpha \)-cut and should be solved for different \( \alpha \)-cut levels. We replace the third and fourth constraint of Model (9) by formula (4)–(7). Thus, the transformed Model (10) is as follows:

$$ \begin{aligned} &\kern-4pt\left( {E_{k}^{1} } \right)_{\alpha }^{U} = Max {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} } \\ & s.t. \quad \ {\mathop \sum \limits_{i = 1}^{m} v_{i} x_{ik}^{1} = 1} \\ & \ \ \qquad \ \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0,\quad j = 1, \ldots ,n \\ &\qquad \ \ \ x_{ij}^{11} + \alpha_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le x_{ij}^{1} \le x_{ij}^{14} - \alpha_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right) \\ & \quad \qquad y_{rj}^{11} + \alpha_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le y_{rj}^{1} \le y_{rj}^{14} - \alpha_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right) \\ & u_{r} \ge \varepsilon ,\quad r = 1, \ldots ,s\quad {\text{and}}\quad v_{i} \ge \varepsilon ,\quad i = 1, \ldots ,m \\ \end{aligned} $$
(10)

Letting \( t_{rj}^{1} = u_{r} y_{rj}^{1} \) and \( z_{ij}^{1} = v_{i} x_{ij}^{1} \), then \( \alpha_{i} v_{i} = \delta_{i} \), \( 0 \le \delta_{i} \le v_{i} \), \( \varvec{\alpha}_{\varvec{r}} \varvec{u}_{\varvec{r}} = \tau_{r} \), \( 0 \le \tau_{r} \le u_{r} \). Then, Model (10) is converted to the following Model (11):

$$ \begin{aligned} & \kern-4pt\left( {E_{k}^{1} } \right)_{\alpha }^{U} = Max {\mathop \sum \limits_{r = 1}^{s} t_{rk}^{1} } \\ & s.t.\quad{\mathop \sum \limits_{i = 1}^{m} z_{ik}^{1} = 1} \\ & \quad\quad\, \ \, \mathop \sum \limits_{r = 1}^{s} t_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} z_{ij}^{1} \le 0,\quad j = 1, \ldots ,n \\ &\, \ \,\quad\quad v_{i} x_{ij}^{11} + \delta_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le z_{ij}^{1} \le v_{i} x_{ij}^{14} - \delta_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)\forall i,\forall j \\ &\, \ \,\quad \quad u_{r} y_{rj}^{11} + \varepsilon_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le t_{rj}^{1} \le u_{r} y_{rj}^{14} - \tau_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)\forall r,\forall j \\ & \, \ \,\quad\quad v_{i} \ge \varepsilon ,\quad i = 1, \ldots ,m\quad {\text{and}}\quad u_{r} \ge \varepsilon ,\quad r = 1, \ldots ,s \\ &\quad\quad \, \ \, v_{i} \ge \delta_{i} \ge \varepsilon ,\quad i = 1, \ldots ,m\quad {\text{and}}\quad u_{r} \ge \tau_{r} \ge \varepsilon ,\quad \quad r = 1, \ldots ,s \\ & \quad\quad \ \,\, z_{ij}^{1} \ge 0,\quad \forall i,\forall j \quad {\text{and}}\quad t_{rj}^{1} \ge 0,\quad \forall r,\forall j \\ \end{aligned} $$
(11)

After an optimal solution \( \left( {v_{i}^{*} ,u_{r}^{*} ,\delta_{i}^{*} ,\tau_{r}^{*} ,z_{ij}^{1*} ,t_{rj}^{1*} } \right) \) is solved, the optimal inputs and outputs in sub-system 1 are calculated as \( x_{ij}^{1*} = \frac{{z_{ij}^{1*} }}{{v_{i}^{*} }} \) and \( y_{rj}^{1*} = \frac{{t_{rj}^{1*} }}{{u_{r}^{*} }} \).

3.3 Lower-bound of the efficiency values for sub-system 1

When stage 1 is a leader, Model (13) is proposed to calculate the lower bound \( \left( {E_{k}^{1} } \right)_{\alpha }^{L} \) of the efficiency scores for sub-system 1 for an arbitrary \( \alpha \)-cut level.

$$ \left( {E_{k}^{1} } \right)_{\alpha }^{L} = \mathop {\hbox{min} }\limits_{{\begin{array}{*{20}c} {\left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{r}^{U} } \\ \end{array} }} \left[ {\begin{array}{*{20}c} {Max \mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} } \\ {s.t. \mathop \sum \limits_{i = 1}^{m} \nu_{i} x_{ik}^{1} = 1} \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} \nu_{i} x_{ij}^{1} \le 0,j = 1, \ldots ,n} \\ {u_{r} \ge \varepsilon ,r = 1, \ldots ,s} \\ {\nu_{i} \ge \varepsilon ,i = 1, \ldots ,m} \\ \end{array} } \right] $$
(12)

The two-level optimization Model (12) can’t be solved in its current form. It should be reduced to a single-level optimization model. However, it is difficult to convert it to a single-level optimization Model as the inner and outer programs of Model (8) have the opposing direction of optimization. Thus, some transformations are required to before reducing Model (12) to a single-level optimization Model.

The dual of the inner program in Model (12) is as follows:

$$ \begin{aligned} & Min{*{20}c} \theta \\ & s.t.\quad {\mathop \sum \limits_{j = 1}^{n} \lambda_{j} y_{rj}^{1} \ge y_{rk}^{1} } ,\quad r = 1, \ldots ,s \\ &\quad \quad \ \, \mathop \sum \limits_{j = 1}^{n} \lambda_{j} x_{ij}^{1} \le \theta x_{ik}^{1} ,\quad i = 1, \ldots ,m \\ & \lambda_{j} \ge 0,\quad j = 1, \ldots ,n\quad {\text{and}}\quad \theta {free} \\ \end{aligned} $$
(13)

Because the objectives of inner and the outer optimization are in the form of minimization, the two level optimization of Model (13) can be reduced to a single-level optimization programming Model (14) as follows:

$$ \begin{aligned} & \left( {E_{k}^{1} } \right)_{\alpha }^{L} = Min \theta \\ & s.t.\quad{\mathop \sum \limits_{j = 1}^{n} \lambda_{j} y_{rj}^{1} \ge y_{rk}^{1} } ,\quad r = 1, \ldots ,s \\ & \ \, \quad \quad{\mathop \sum \limits_{j = 1}^{n} \lambda_{j} x_{ij}^{1} \le \theta x_{ik}^{1} } ,\quad i = 1, \ldots ,m \\ &\quad \quad \ \, \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} \\ & \quad \quad \ \, \left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{U} \\ & \quad\quad \ \, \lambda_{j} \ge 0,\quad j = 1, \ldots ,n\quad {\text{and}}\quad \theta \,free \\ \end{aligned} $$
(14)

We construct Model (15) by replacing the third and fourth constraint of Model (14). Thus, the transformed Model (15) is as follows:

$$ \begin{aligned} & \left( {E_{k}^{1} } \right)_{\alpha }^{L} = Min \theta \\ & s.t.\quad {\mathop \sum \limits_{j = 1}^{n} \lambda_{j} y_{rj}^{1} \ge y_{rk}^{1} } ,\quad r = 1, \ldots ,s \\ & \quad \quad\, \ \, {\mathop \sum \limits_{j = 1}^{n} \lambda_{j} x_{ij}^{1} \le \theta x_{ik}^{1} } ,\quad i = 1, \ldots ,m \\ &\, \quad\ \, \quad x_{ij}^{11} + \alpha_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le x_{ij}^{1} \le x_{ij}^{14} - \alpha_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right) \\ & \quad\, \ \,\quad y_{rj}^{11} + \alpha_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le y_{rj}^{1} \le y_{rj}^{14} - \alpha_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right) \\ &\quad \quad \ \,\, \lambda_{j} \ge 0,\quad j = 1, \ldots ,n\quad {\text{and}}\quad \theta {free} \\ \end{aligned} $$
(15)

Model (15) is nonlinear and has \( m + s \) nonlinear constraints. As the size of the non-linear programming is small, most commercial nonlinear programming solvers can be used to solve the program.

Letting \( \lambda_{j} y_{rj}^{1} = y_{rj}^{1\prime} \), \( \lambda_{j} x_{ij}^{1} = x_{ij}^{1\prime} \), \( \nu_{i} x_{ij}^{1} = \bar{x}_{ij}^{1} \), \( \mu_{r} y_{rj}^{1} = \bar{y}_{rj}^{1} \), \( \alpha_{i} \nu_{i} = \delta_{i} \), \( 0 \le \delta_{i} \le \nu_{i} \), \( \varvec{\alpha}_{\varvec{r}} \varvec{u}_{\varvec{r}} = \tau_{r} \), \( 0 \le \tau_{r} \le \mu_{r} \), Model (15) can be converted to the following Model (16):

$$ \begin{aligned} & \left( {E_{k}^{1} } \right)_{\alpha }^{L} = Min \theta \\ & s.t.\quad {\mathop \sum \limits_{j = 1}^{n} y_{rj}^{1\prime} \ge y_{rk}^{1} } ,\quad r = 1, \ldots ,s \\ & \quad \ \,\, \quad {\mathop \sum \limits_{j = 1}^{n} x_{ij}^{1\prime} \le \theta x_{ik}^{1} } ,\quad i = 1, \ldots ,m \\ & \quad \quad\ \,\, x_{ij}^{11} + \delta_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le \bar{x}_{ij}^{1} \le \nu_{i} x_{ij}^{14} - \delta_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)\forall i\forall j \\ & \quad \quad \ \,\, u_{r} y_{rj}^{11} + \tau_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le \bar{y}_{rj}^{1} \le u_{r} y_{rj}^{14} - \tau_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)\forall r\forall j \\ & \ \,\,\quad \quad \nu_{i} \ge 0,\quad i = 1, \ldots ,m \quad {\text{and}}\quad u_{r} \ge 0,\quad r = 1, \ldots ,s \\ & \quad \quad\ \,\, \nu_{i} \ge \delta_{i} \ge 0,\quad i = 1, \ldots ,m\quad {\text{and}}\quad u_{r} \ge \tau_{r} \ge 0,\quad r = 1, \ldots ,s \\ &\quad \quad\ \,\, y_{rj}^{1\prime} \ge 0,\quad r = 1, \ldots ,s;\quad j = 1, \ldots ,n\quad {\text{and}}\quad x_{ij}^{1\prime} \ge 0,\quad i = 1, \ldots ,m; \\ & \qquad\ \,\, j = 1, \ldots ,n\quad \theta {free} \\ \end{aligned} $$
(16)

The variables of Model (16) are \( \theta \), \( x_{ij}^{1\prime} \), \( y_{rj}^{1\prime} \), \( v_{i} \), \( \delta_{i} \), \( u_{r} \) and \( \tau_{r} \). Model (16) is a linear program and easy to solve.

3.4 Maximum achievable value of the upper bound of efficiency for the whole system

As stage 1 is leader, the upper bound of efficiency for stage 1 is optimized in priority by Model (11). Model (17) is proposed to calculate the maximum achievable value of the upper-bound of efficiency for the whole system as Model (18).

$$ \left( {E_{k}^{O} } \right)_{\alpha }^{U} = \mathop { \hbox{max} }\limits_{{\begin{array}{ll} {\left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{U} } \\ {\left( {x_{ij}^{2} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{2} \le \left( {x_{ij}^{2} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{2} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{2} \le \left( {y_{rj}^{2} } \right)_{{\alpha_{r} }}^{U} } \\ \end{array} }} \left[ {\begin{array}{*{20}c} {\hbox{max} \mathop \sum \limits_{r = 1}^{s} u_{r} Y_{rk}^{s} } \\ {s.t. \mathop \sum \limits_{i = 1}^{m} v_{i} X_{ik}^{s} = 1 } \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0,j = 1, \ldots ,n} \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{2} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{2} \le 0,j = 1, \ldots ,n} \\ {e_{k}^{1} = \left( {E_{k}^{1*} } \right)_{\alpha }^{U} } \\ {u_{r} \ge \varepsilon ,r = 1, \ldots ,s} \\ {v_{i} \ge \varepsilon ,i = 1, \ldots ,m} \\ \end{array} } \right] $$
(17)

Model (17) is a two-level optimization program. The inner and outer programs of Model (17) have the same direction of optimization, i.e., maximization; they can be reduced to the following one-level optimization program (18):

$$ \begin{aligned} & Max {\mathop \sum \limits_{r = 1}^{s} u_{r} Y_{rk}^{s} } \\ & s.t.\quad {\mathop \sum \limits_{i = 1}^{m} v_{i} X_{ik}^{s} = 1} \\ & \quad \ \,\,\quad {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0,\quad j = 1, \ldots ,n} \\ & \quad\ \,\, \quad {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{2} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{2} \le 0,\quad j = 1, \ldots ,n} \\ & \ \,\,\quad \quad {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rk}^{1} = \left( {E_{k}^{1*} } \right)_{\alpha }^{U} \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ik}^{1} } \\ & \quad \quad\ \,\, {x_{ij}^{11} + \alpha_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le x_{ij}^{1} \le x_{ij}^{14} - \alpha_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)} \\ & \quad \quad\ \,\, { {y_{rj}^{11} + \alpha_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le y_{rj}^{1} \le y_{rj}^{14} - \alpha_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)} } \\ & \quad \quad\ \,\, x_{ij}^{21} + \alpha_{i} \left( {x_{ij}^{22} - x_{ij}^{21} } \right) \le x_{ij}^{2} \le x_{ij}^{24} - \alpha_{i} \left( {x_{ij}^{24} - x_{ij}^{23} } \right) \\ & \quad \quad\ \,\, {y_{rj}^{21} + \alpha_{r} \left( {y_{rj}^{22} - y_{rj}^{21} } \right) \le y_{rj}^{2} \le y_{rj}^{24} - \alpha_{r} \left( {y_{rj}^{24} - y_{rj}^{23} } \right)} \\ & \quad \quad\ \,\, {u_{r} \ge \varepsilon ,\quad r = 1, \ldots ,s} \quad {\text{and}}\quad v_{i} \ge \varepsilon ,\quad i = 1, \ldots ,m \\ \end{aligned} $$
(18)

Letting \( v_{i} x_{ij}^{1} = \overline{{x_{ij}^{1} }} \), \( u_{r} y_{rj}^{1} = \overline{{y_{rj}^{1} }} \), \( v_{i} x_{ij}^{2} = \overline{{x_{ij}^{2} }} \), \( u_{r} y_{rj}^{2} = \overline{{y_{rj}^{2} }} \), \( \alpha_{i} v_{i} = \delta_{i} \), \( 0 \le \delta_{i} \le v_{i} \), \( \alpha_{r} u_{r} = \tau_{r} \), \( 0 \le \tau_{r} \le u_{r} \), Model (18) can be converted to the following linear program:

$$ \begin{aligned} & Max {\mathop \sum \limits_{r = 1}^{s} \left( {\overline{{y_{rj}^{1} }} + \overline{{y_{rj}^{2} }} } \right)} \\ & s.t.\quad {\mathop \sum \limits_{i = 1}^{m} \left( {\overline{{x_{ij}^{1} }} + \overline{{x_{ij}^{2} }} } \right) = 1} \\ & \quad \quad\ \,\, {\mathop \sum \limits_{r = 1}^{s} \overline{{y_{rj}^{1} }} - \mathop \sum \limits_{i = 1}^{m} \overline{{x_{ij}^{1} }} \le 0,\quad j = 1, \ldots ,n} \\ & \quad \quad \ \,\, {\mathop \sum \limits_{r = 1}^{s} \overline{{y_{rj}^{2} }} - \mathop \sum \limits_{i = 1}^{m} \overline{{x_{ij}^{2} }} \le 0,\quad j = 1, \ldots ,n} \\ & \quad \quad \ \,\, {\mathop \sum \limits_{r = 1}^{s} \overline{{y_{rk}^{1} }} = \left( {E_{k}^{1*} } \right)_{\alpha }^{U} \mathop \sum \limits_{i = 1}^{m} \overline{{x_{ik}^{1} }} } \\ & \quad \quad \ \,\,{v_{i} x_{ij}^{11} + \delta_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le \overline{{x_{ij}^{1} }} \le v_{i} x_{ij}^{14} - \delta_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)}\\ & \quad \quad \ \,\,{ {u_{r} y_{rj}^{11} + \tau_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le \overline{{y_{rj}^{1} }} \le u_{r} y_{rj}^{14} - \tau_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)} } \\ & \quad \quad \ \,\,{ {v_{i} x_{ij}^{21} + \delta_{i} \left( {x_{ij}^{22} - x_{ij}^{21} } \right) \le \overline{{x_{ij}^{2} }} \le v_{i} x_{ij}^{24} - \delta_{i} \left( {x_{ij}^{24} - x_{ij}^{23} } \right)} } \\ & \quad \quad \ \,\,{u_{r} y_{rj}^{21} + \tau_{r} \left( {y_{rj}^{22} - y_{rj}^{21} } \right) \le \overline{{y_{rj}^{2} }} \le u_{r} y_{rj}^{24} - \tau_{r} \left( {y_{rj}^{24} - y_{rj}^{23} } \right)} \\ & \quad \quad \ \,\,{u_{r} \ge \varepsilon ,\quad r = 1, \ldots ,s} \quad {\text{and}}\quad v_{i} \ge \varepsilon ,\quad i = 1, \ldots ,m \\ & \quad \quad \ \,\,\varepsilon \le \delta_{i} \le v_{i} ,\quad i = 1, \ldots ,m\quad {\text{and}}\quad \varepsilon \le \tau_{r} \le u_{r} ,\quad r = 1, \ldots ,s \\ \end{aligned} $$
(19)

The resulting Model (19) is linear. Thus, it can achieve the global optimum for the lower bound of the DMU’s efficiency scores.

3.5 Maximum achievable value of the lower-bound of efficiency for the whole system

Model (20) is proposed to calculate the maximum achievable value of the lower-bound of efficiency for the whole system is as follows:

$$ \left( {E_{k}^{O} } \right)_{\alpha }^{L} = \mathop { \hbox{min} }\limits_{{\begin{array}{*{20}l} {\left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{1} \le \left( {x_{ij}^{1} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{1} \le \left( {y_{rj}^{1} } \right)_{{\alpha_{r} }}^{U} } \\ {\left( {x_{ij}^{2} } \right)_{{\alpha_{i} }}^{L} \le x_{ij}^{2} \le \left( {x_{ij}^{2} } \right)_{{\alpha_{i} }}^{U} } \\ {\left( {y_{rj}^{2} } \right)_{{\alpha_{r} }}^{L} \le y_{rj}^{2} \le \left( {y_{rj}^{2} } \right)_{{\alpha_{r} }}^{U} } \\ \end{array} }} \left[ {\begin{array}{*{20}c} {\hbox{max} \mathop \sum \limits_{r = 1}^{s} u_{r} Y_{rk}^{s} } \\ {s.t. \mathop \sum \limits_{i = 1}^{m} v_{i} X_{ik}^{s} = 1 } \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{1} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{1} \le 0,j = 1, \ldots ,n} \\ {\mathop \sum \limits_{r = 1}^{s} u_{r} y_{rj}^{2} - \mathop \sum \limits_{i = 1}^{m} v_{i} x_{ij}^{2} \le 0,j = 1, \ldots ,n} \\ {e_{k}^{1} = \left( {E_{k}^{1*} } \right)_{\alpha }^{U} } \\ {u_{r} \ge \varepsilon ,r = 1, \ldots ,s} \\ {v_{i} \ge \varepsilon ,i = 1, \ldots ,m} \\ \end{array} } \right] $$
(20)

The constraint of Model (20) satisfies the lower-bound of the efficiency score for sub-system 1, achieved by Model (15). It is considered as \( e_{k}^{1} = \left( {E_{k}^{1*} } \right)_{\alpha }^{L} \) in Model (20).

Model (20) cannot be solved in its current form because the two objective functions in this model are in opposing directions. By defining the dual of the inner optimization problem of Model (20), we obtain the following model:

$$ \begin{aligned} & Min \theta \\ & s.t.\quad \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{1} y_{rj}^{1} + \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{2} y_{rj}^{2} + \gamma y_{rk}^{1} \ge Y_{rk}^{s} ,\quad \forall r \\ & \quad\quad\ \,\, {\mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{1} x_{ij}^{1} + \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{2} x_{ij}^{2} + \left( {E_{k}^{s*} } \right)_{\alpha }^{L} \gamma x_{ik}^{1} } \le \theta X_{ik}^{s} ,\quad \forall i \\ & {\lambda_{j}^{1} } \ge 0, {\lambda_{j}^{2} } \ge 0,\forall j\quad {\text{and}}\quad \theta ,\quad \gamma {free} \\ \end{aligned} $$
(21)

Model (21) can be transformed to be the following form by replacing \( x_{ij}^{1} \), \( x_{ij}^{2} \), \( y_{rj}^{1} \) and \( y_{rj}^{2} \) by formula (4)–(7):

$$ \begin{aligned} & Min\begin{array}{*{20}c} \theta \\ \end{array} \\ & s.t.\quad \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{1} y_{rj}^{1} + \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{2} y_{rj}^{2} + \gamma y_{rk}^{1} \ge Y_{rk}^{s} ,\quad \forall r \\ & \ \ \quad \quad \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{1} x_{ij}^{1} + \mathop \sum \limits_{j = 1}^{n} \lambda_{j}^{2} x_{ij}^{2} + \left( {E_{k}^{s*} } \right)_{\alpha }^{L} \gamma x_{ik}^{1} \le \theta X_{ik}^{s} ,\quad \forall i \\ & \, \quad \ \quad \, \begin{array}{*{20}c} {x_{ij}^{11} + \alpha_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le x_{ij}^{1} \le x_{ij}^{14} - \alpha_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)} \\ \end{array} \\ & \, \, \quad \ \quad \begin{array}{*{20}c} {y_{rj}^{11} + \alpha_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le y_{rj}^{1} \le y_{rj}^{14} - \alpha_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)} \\ \end{array} \\ & \, \quad \, \quad \ \begin{array}{*{20}c} {x_{ij}^{21} + \alpha_{i} \left( {x_{ij}^{22} - x_{ij}^{21} } \right) \le x_{ij}^{2} \le x_{ij}^{24} - \alpha_{i} \left( {x_{ij}^{24} - x_{ij}^{23} } \right)} \\ \end{array} \\ & \, \, \quad \quad \ \begin{array}{*{20}c} {y_{rj}^{21} + \alpha_{r} \left( {y_{rj}^{22} - y_{rj}^{21} } \right) \le y_{rj}^{2} \le y_{rj}^{24} - \alpha_{r} \left( {y_{rj}^{24} - y_{rj}^{23} } \right)} \\ \end{array} \\ & \, \, \quad \ \quad \begin{array}{*{20}c} {\lambda_{j}^{1} } \\ \end{array} \ge 0,\begin{array}{*{20}c} {\lambda_{j}^{2} } \\ \end{array} \ge 0,\forall j\quad {\text{and}}\quad \begin{array}{*{20}c} {\theta ,\quad \gamma \begin{array}{*{20}c} {free} \\ \end{array} } \\ \end{array} \\ \end{aligned} $$
(22)

Letting \( \lambda_{j} y_{rj}^{1} = y_{rj}^{1\prime} \), \( \lambda_{j} x_{ij}^{1} = x_{ij}^{1\prime} \), \( \lambda_{j} y_{rj}^{2} = y_{rj}^{2\prime} \), \( \lambda_{j} x_{ij}^{2} = x_{ij}^{2\prime} \), \( \nu_{i} x_{ij}^{1} = \overline{{x_{ij}^{1} }} \), \( \mu_{r} y_{rj}^{1} = \overline{{y_{rj}^{1} }} \), \( \nu_{i} x_{ij}^{2} = \overline{{x_{ij}^{2} }} \), \( \mu_{r} y_{rj}^{2} = \overline{{y_{rj}^{2} }} \), \( \alpha_{i} \nu_{i} = \delta_{i} \), \( 0 \le \delta_{i} \le \nu_{i} \), \( \alpha_{r} \mu_{r} = \tau_{r} \), \( 0 \le \tau_{r} \le \mu_{r} \), Model (22) can be transformed to the following form (23):

$$ \begin{aligned} & Min \theta \\ & s.t.\quad \mathop \sum \limits_{j = 1}^{n} y_{rj}^{1\prime} + \mathop \sum \limits_{j = 1}^{n} y_{rj}^{2\prime} + \gamma y_{rk}^{1} \ge Y_{rk}^{s} ,\quad \forall r \\ & \ \,\,\quad \quad {\mathop \sum \limits_{j = 1}^{n} x_{ij}^{1\prime} + \mathop \sum \limits_{j = 1}^{n} x_{ij}^{2\prime} + \left( {E_{k}^{s*} } \right)_{\alpha }^{L} \gamma x_{ik}^{1} } \ \le \theta X_{ik}^{s} ,\quad \quad \forall i \\ & \ \,\,\quad \quad {\nu_{i} x_{ij}^{11} + \delta_{i} \left( {x_{ij}^{12} - x_{ij}^{11} } \right) \le \overline{{x_{ij}^{1} }} \le \nu_{i} x_{ij}^{14} - \delta_{i} \left( {x_{ij}^{14} - x_{ij}^{13} } \right)\forall i\forall j} \\ & \quad \ \,\,\quad {\mu_{r} y_{rj}^{11} + \tau_{r} \left( {y_{rj}^{12} - y_{rj}^{11} } \right) \le \overline{{y_{rj}^{1} }} \le \mu_{r} y_{rj}^{14} - \tau_{r} \left( {y_{rj}^{14} - y_{rj}^{13} } \right)\forall r\forall j} \\ & \quad \ \,\,\quad {\nu_{i} x_{ij}^{21} + \delta_{i} \left( {x_{ij}^{22} - x_{ij}^{21} } \right) \le \overline{{x_{ij}^{2} }} \le \nu_{i} x_{ij}^{24} - \delta_{i} \left( {x_{ij}^{24} - x_{ij}^{23} } \right)\forall i\forall j} \\ & \quad \ \,\,\quad {\mu_{r} y_{rj}^{21} + \tau_{r} \left( {y_{rj}^{22} - y_{rj}^{21} } \right) \le \overline{{y_{rj}^{2} }} \le \mu_{r} y_{rj}^{24} - \tau_{r} \left( {y_{rj}^{24} - y_{rj}^{23} } \right)\forall r\forall j} \\ & \quad \ \,\,\quad \nu_{i} \ge 0,\quad i = 1, \ldots ,m \quad {\text{and}}\quad u_{r} \ge 0,\quad r = 1, \ldots ,s \\ & \quad \ \,\,\quad \nu_{i} \ge \delta_{i} \ge 0,\quad i = 1, \ldots ,m\quad {\text{and}}\quad u_{r} \ge \tau_{r} \ge 0,\quad r = 1, \ldots ,s \\ & \quad \ \,\,\quad y_{rj}^{1\prime} \ge 0,y_{rj}^{{2^{\prime}}} \ge 0,\quad r = 1, \ldots ,s;\quad j = 1, \ldots ,n\quad {\text{and}}\quad x_{ij}^{{1^{\prime}}} \ge 0,\quad x_{ij}^{{2^{\prime}}} \ge 0, \\ & \qquad \ \,\,\quad i = 1, \ldots ,m;\quad j = 1, \ldots ,n \\ & \quad \ \,\,\quad {\theta ,\gamma{*{20}c} {free} } \\ \end{aligned} $$
(23)

Model (23) is a non-linear program. But the program has \( m + s \) nonlinear constraints (usually not more than 10). Thus, the number of nonlinear constraints is of small size in the standard of nonlinear programming, most commercial non-linear programming solvers can be applied to solve it.

In a similar manner, if the second stage is assumed to be a leader, then the upper and lower bound of efficiency for stage 2 should be calculated at first. Then, one maximizes (or minimizes) the efficiency of the whole system, with the restriction that the upper bound (or the lower bound) of the second stage’s efficiency score, having already been determined.

4 An application in education

In this section, the proposed approach will be illustrated by a case of parallel process system with two processes. As universities usually have two functions-research and teaching. For each university, various amounts of resources are allocated to these functions depending on the emphasis of the university. In what follows, we use examples of Beasley (1995) and Kao and Lin (2012) to present our method.

In Beasley (1995) and Kao and Lin (2012), the teaching and research efficiencies of 52 chemistry and physics departments of UK universities are studied. The inputs of these 52 chemistry and physics departments are: general expenditure (X1), equipment expenditure (X2), and research income (X3), where the first two are shared by teaching and research, and the third is specific to research. Beasley (1995) allowed each university to assign various proportions, \( q_{1} \) and \( q_{2} \), of the two inputs to teaching and research respectively. These two proportions are restricted to the range of 0.3 and 0.9. Moreover, each input is required to have a minimum portion of 0.05 used for teaching and research. There are five outputs: number of undergraduates (UGs) (Y1), number of taught postgraduates (PGs-T) (Y2), number of research postgraduates (PGs-R) (Y3), research income—as a proxy for research output in terms of quantity (Y4), a rating for research quality, which is classified as star (Y5), A + (Y6), A (Y7), and A − (Y8). The first two are the outputs of teaching function, and the remaining are the outputs of research function. The parallel production system is depicted in Fig. 2.

Fig. 2
figure 2

Parallel system of the university department with two functions of teaching and research

The research is a qualitative factor whose ratings (Y5 − Y8) are imprecise. Thus, research quality can be expressed by fuzzy numbers. In order to find appropriate fuzzy numbers to represent these ratings, we follow Kao and Lin (2011) to determine the fuzzy numbers of the ratings. The trapezoidal fuzzy numbers are obtained for four ratings: \( \tilde{Y}_{5} = \left( {2.133,2.133,2.228,2.936} \right) \), \( \tilde{Y}_{6} = \left( {0.593,1.067,1.111,1.212} \right) \), \( \tilde{Y}_{7} = \left( {0.243,0.533,0.556,0.556} \right) \) and \( \tilde{Y}_{8} = \left( {0.111,0.111,0.267,0.267} \right) \). The proposed new parallel fuzzy data envelopment analysis models for parallel systems with two components based on Stackelberg game theory were coded using Matlab software. The results are summarized in Tables 2 and 3. When stage 1 is a leader, Table 2 shows the lower and upper bound of the overall score (i.e., \( \left( {E^{S} } \right)_{\alpha }^{L} \) and \( \left( {E^{S} } \right)_{\alpha }^{U} \)), the lower and upper bound of the partial efficiencies in the first stage (i.e., \( \left( {E^{1 + } } \right)_{\alpha }^{L} \) and \( \left( {E^{1 + } } \right)_{\alpha }^{U} \)), and the lower and upper bound of the partial efficiencies in the second stage (i.e., \( \left( {E^{2} } \right)_{\alpha }^{L} \) and \( \left( {E^{2} } \right)_{\alpha }^{U} \)), for each chemistry department. When stage 2 is a leader, Table 3 shows the lower and upper bound of the overall score (i.e., \( \left( {E^{S} } \right)_{\alpha }^{L} \) and \( \left( {E^{S} } \right)_{\alpha }^{U} \)), the lower and upper bound of the partial efficiencies in the first stage (i.e., \( \left( {E^{1} } \right)_{\alpha }^{L} \) and \( \left( {E^{1} } \right)_{\alpha }^{U} \)), and the lower and upper bound of the partial efficiencies in the second stage (i.e., \( \left( {E^{2 + } } \right)_{\alpha }^{L} \) and \( \left( {E^{2 + } } \right)_{\alpha }^{U} \)), for each chemistry department. The fuzzy efficiencies for parallel systems based on Kao and Lin’s (2012) approach are summarized in Table 1 for comparison.

Table 1 Fuzzy efficiencies for the chemistry department example based on Kao and Lin’s (2012) approach

From Table 2, we find that the lower and upper bounds of the overall efficiency scores of the main DMUs are less than unity except for \( DMU_{42} \). Further investigation into the efficiency scores of the first and second stages revealed the cause of these inefficiencies. For example, the efficiency score of the main \( DMU_{6} \) is a value which belongs to the interval \( \left[ {0.7536,0.9967} \right] \). This means that \( DMU_{6} \) is inefficient as long as we are making a decision under uncertainty. We can further investigate the source of this inefficiency in the two parallel sub-systems. The efficiency score of sub-system 1 of \( DMU_{6} \) is a value which belongs to the interval \( \left[ {1,1} \right] \). This means that \( DMU_{6} \)’s sub-system 1 is always efficient regardless of varying its inputs and outputs in uncertain situation. Thus, it can be concluded that the source of variation in the efficiency score of the main \( DMU_{6} \) is not its sub-system 1. The efficiency of the sub-system 2 of \( DMU_{6} \) is a value which belongs to the interval \( \left[ {0.4329,0.5326} \right] \). This means that the source of inefficiencies of main \( DMU_{6} \) is mainly from its sub-system 2.

Table 2 The fuzzy efficiencies of DMUs when stage 1 is a leader

From Tables 2 and 3, it can be found that when stage 1 is a leader, the upper bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{U} \) and the lower bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{L} \) are larger than the corresponding upper and lower bounds of the efficiency scores when stage 2 is a leader, respectively. This is because the efficiency of stage 1 is optimized in priority when stage 1 is a leader. When stage 2 is a leader and stage 1 is a follower, we can draw the same conclusion.

Table 3 The fuzzy efficiencies of DMUs when stage 2 is a leader

By comparing Tables 2 and 3 with Table 1, some conclusions can be drawn. When stage 1 is a leader, the lower bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{L} \) and the upper bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{U} \) computed by our method are no less than the corresponding efficiency scores computed by Kao and Lin’s (2012) approach, and the lower bounds of the efficiency scores \( \left( {E_{k}^{2} } \right)_{\alpha }^{L} \) and the upper bounds of the efficiency scores \( \left( {E_{k}^{2} } \right)_{\alpha }^{U} \) computed by our method are no larger than the corresponding efficiency scores computed by Kao and Lin’s (2012) approach. This may be due to that when stage 1 is a leader, its lower bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{L} \) and the upper bounds of the efficiency scores \( \left( {E_{k}^{1 + } } \right)_{\alpha }^{U} \) are maximized in priority. When stage 2 is a leader, similar conclusions can be drawn.

5 Conclusion and direction for future studies

The conventional DEA models view DMUs as “Black Boxes” and don’t consider the internal structure of DMUs. In this paper, we introduced a new parallel data envelopment analysis model for parallel systems with two components based on Stackelberg game theory with uncertain inputs and outputs. This is the first paper which introduced Stackelberg (leader–follower) game theory to the efficiency evaluation of parallel systems. The fuzzy input and output data is modeled by trapezoidal fuzzy numbers (TrFNs). The Stackelberg (leader–follower) game theory is introduced to prioritize and sequentially decompose the whole parallel system’s efficiency score of each DMU into a set of efficiency scores for its sub-systems. The lower and upper bounds of the efficiencies scores are calculated using the proposed new parallel fuzzy DEA model. A case study based on Beasley (1995) and Kao and Lin (2012) is presented to show our method. Finally, some other fuzzy DEA approaches (Emrouznejad et al. 2014; Emrouznejad and Mustafa 2010) can also be extended to the parallel process context.