Abstract
Data Envelopment Analysis (DEA) is a deterministic method for aggregating multidimensional measures and subsequently analyzing efficiency. Because of this inherent determinism, however, it is sensitive to outliers in datasets. Existing methods for identifying such outliers have two main disadvantages. First, from a conceptual point of view, a uniform definition of an outlier is missing. Second, each method has technical disadvantages. For instance, the user has to set arbitrary threshold values, such as the efficiency value beyond which a decision making unit is regarded as an outlier. This paper first presents a definition of outliers that explicitly takes the specifics of DEA into account. Based on this definition, it introduces an approach for identifying outliers in DEA whose algorithm explicitly addresses these technical disadvantages. The plausibility of the approach is validated using empirical examples from performance measurement at the university level.
Notes
It should be mentioned that there are also approaches that extend DEA with stochastic models (e.g. Jradi and Ruggiero 2019). However, such combined models are not covered in this paper.
For a detailed overview of DEA applications see, e.g., Liu et al. (2013).
Liu et al. (2019) recently proposed a new cross-efficiency evaluation based on prospect theory and provided an empirical example involving selected Chinese universities.
In general, the types of outliers detailed in the following may also occur in the case of a linear data envelope; however, this cannot be represented in a two-dimensional example, which is why a convex envelope has been selected here.
A DMU is called weakly efficient if it cannot be improved in at least one of the considered inputs or outputs, but there is still room for improvement in at least one of the other inputs or outputs.
In exceptional cases, there may be no super-efficiency value for both input orientation and output orientation. For instance, in the case of output orientation, the DMU realizes an extremely high output with extremely little input, and is then directly classified as an outlier.
Recently, Agasisti et al. (2019) combined DEA with methods of multi-criteria evaluation to measure and evaluate the efficiency of European education systems.
The data surveys attract great interest within the scientific world (e.g. Usher and Savino 2006; Frey 2007; Marginson and van der Wende 2007; Jarwal et al. 2009; Stolz et al. 2010; Kieser 2012). For a comprehensive, critical discussion of the CHE approach, especially from the perspective of German business administration, see Clermont and Dirksen (2016).
References
Agasisti, T., G. Munda, and R. Hippe. 2019. Measuring the efficiency of European education systems by combining data envelopment analysis and multiple-criteria evaluation. Journal of Productivity Analysis 51:105–124.
Agasisti, T., and C. Pérez-Esparrells. 2010. Comparing efficiency in a cross-country perspective: the case of Italian and Spanish state universities. Higher Education 59:85–103.
Aigner, D., C.A.K. Lovell, and P. Schmidt. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics 6:21–37.
Albers, S. 2015. What drives publication productivity in German business faculties? Schmalenbach Business Review 67:6–33.
Andersen, P., and N.C. Petersen. 1993. A procedure for ranking efficient units in Data Envelopment Analysis. Management Science 39:1261–1264.
Aragon, Y., A. Daouia, and C. Thomas-Agnan. 2005. Nonparametric frontier estimation: a conditional quantile-based approach. Econometric Theory 21:358–389.
Avkiran, N.K. 2001. Investigating technical and scale efficiencies of Australian universities through data envelopment analysis. Socio-Economic Planning Sciences 35:57–80.
Bahari, A.R., and A. Emrouznejad. 2014. Influential DMUs and outlier detection in Data Envelopment Analysis with an application to health care. Annals of Operations Research 223:95–108.
Ball, R., B. Mittermaier, and D. Tunger. 2009. Creation of journal-based publication profiles of scientific institutions: a methodology for the interdisciplinary comparison of scientific research based on the J‑factor. Scientometrics 81:381–392.
Banker, R.D., and H. Chang. 2006. The super-efficiency procedure for outlier identification, not for ranking efficient units. European Journal of Operational Research 175:1311–1320.
Banker, R.D., A. Charnes, and W.W. Cooper. 1984. Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science 30:1078–1092.
Banker, R.D., S. Das, and S.M. Datar. 1989. Analysis of cost variances for management control in hospitals. Research in Governmental and Nonprofit Accounting 5:269–291.
Barnett, V., and T. Lewis. 1984. Outliers in statistical data, 2nd edn. Chichester: John Wiley & Sons.
Ben-Gal, I. 2005. Outlier detection. In Data mining and knowledge discovery handbook, 2nd edn., ed. O. Maimon, L. Rokach, 117–130. New York: Springer.
Bogetoft, P., and L. Otto. 2011. Benchmarking with DEA, SFA, and R. New York: Springer.
Bolli, T., and M. Farsi. 2015. The dynamics of productivity in Swiss universities. Journal of Productivity Analysis 44:21–38.
Bonesrønning, H., and J. Rattsø. 1994. Efficiency variation among the Norwegian high schools: consequences of equalization policy. Economics of Education Review 13:289–304.
Bourne, M., A. Neely, J. Mills, and K. Platts. 2003. Implementing performance measurement systems: a literature overview. International Journal of Business Performance Management 5:1–24.
Calinski, T., and J. Harabasz. 1974. A dendrite method for cluster analysis. Communications in Statistics: Theory and Methods 3:1–27.
Cazals, C., J.-P. Florens, and L. Simar. 2002. Nonparametric frontier estimation: a robust approach. Journal of Econometrics 106:1–25.
Charnes, A., W.W. Cooper, B. Golany, L. Seiford, and J. Stutz. 1985. Foundations of data envelopment analysis for Pareto-Koopmans efficient empirical production functions. Journal of Econometrics 30:91–107.
Charnes, A., W.W. Cooper, and E. Rhodes. 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2:429–444.
Clermont, M. 2016. Effectiveness and efficiency of research in Germany over time: an analysis of German business schools between 2001 and 2009. Scientometrics 108:1347–1381.
Clermont, M., and A. Dirksen. 2016. The measurement, evaluation, and publication of performance in higher education: An analysis of the CHE research ranking of business schools in Germany from an accounting perspective. Public Administration Quarterly 40:341–386.
Clermont, M., A. Dirksen, and H. Dyckhoff. 2015. Returns to scale of business administration research in Germany. Scientometrics 103:583–614.
Daghbashyan, Z., E. Deiaco, and M. McKelvey. 2014. How and why does cost efficiency of universities differ across European countries? An explorative attempt using new microdata. In Knowledge, diversity and performance in European higher education: A changing landscape, ed. A. Bonaccorsi, 267–291. Cheltenham & Northampton: Edward Elgar.
Dellnitz, A. 2016. RTS-mavericks in Data Envelopment Analysis. Operations Research Letters 44:622–624.
Dilger, A., and H. Müller. 2012. Ein Forschungsleistungsranking auf der Grundlage von Google Scholar. Zeitschrift für Betriebswirtschaft 82:1089–1105.
Doyle, J., and R. Green. 1994. Efficiency and cross-efficiency in DEA: derivations, meanings and use. Journal of the Operational Research Society 45:567–578.
Dyckhoff, H., and H. Ahn. 2010. Verallgemeinerte DEA-Modelle zur Performanceanalyse. Zeitschrift für Betriebswirtschaft 80:1249–1276.
Dyckhoff, H., and K. Allen. 1999. Theoretische Begründung einer Effizienzanalyse mittels Data Envelopment Analysis (DEA). Zeitschrift für betriebswirtschaftliche Forschung 51:411–436.
Dyckhoff, H., H. Ahn, S. Rassenhövel, and K. Sandfort. 2008. Skalenerträge der Forschung wirtschaftswissenschaftlicher Fachbereiche: Empirische Ergebnisse und ihre Interpretation. Hochschulmanagement 3:62–66.
Dyckhoff, H., S. Rassenhövel, and K. Sandfort. 2009. Empirische Produktionsfunktion betriebswirtschaftlicher Forschung: Eine Analyse der Daten des Centrums für Hochschulentwicklung. Zeitschrift für betriebswirtschaftliche Forschung 61:22–56.
Emrouznejad, A., and G.-L. Yang. 2018. A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016. Socio-Economic Planning Sciences 61:4–8.
Fandel, G. 2007. On the performance of universities in North-Rhine-Westphalia, Germany: Government’s redistribution of funds judged using DEA efficiency measures. European Journal of Operational Research 176:521–533.
Färe, R., S. Grosskopf, B. Lindgren, and P. Roos. 1994. Productivity development in Swedish hospitals: a Malmquist output index approach. In Data envelopment analysis: theory, methodology, and applications, ed. A. Charnes, W.W. Cooper, A.Y. Lewin, and L.M. Seiford, 253–272. New York: Springer.
Färe, R., S. Grosskopf, and W. Weber. 1989. Measuring school district performance. Public Finance Review 17:409–428.
Flegg, A.T., and D.O. Allen. 2007. Does expansion cause congestion? The case of the older British universities, 1994–2004. Education Economics 15:75–102.
Fouchet, R., and M. Guenoun. 2007. Performance management in intermunicipal authorities. International Journal of Public Sector Performance Management 1:62–82.
Franco-Santos, M., L. Lucianetti, and M. Bourne. 2012. Contemporary performance measurement systems: a review of their consequences and a framework for research. Management Accounting Research 23:79–119.
Frey, B.S. 2007. Evaluierungen, Evaluierungen… Evaluitis. Perspektiven der Wirtschaftspolitik 8:207–220.
García-Aracil, A. 2013. Understanding productivity changes in public universities: evidence from Spain. Research Evaluation 22:351–368.
Golany, B., and Y. Roll. 1989. An application procedure for DEA. Omega 17:237–250.
Halkos, G.E., and N.G. Tzeremes. 2011. Measuring economic journals’ citation efficiency: a data envelopment analysis approach. Scientometrics 88:979–1001.
Hammerschmidt, M., R. Wilken, and M. Staat. 2009. Methoden zur Lösung grundlegender Probleme der Datenqualität in DEA-basierten Effizienzanalysen. Die Betriebswirtschaft 69:289–309.
Hawkins, D. 1980. Identification of outliers. London: Chapman and Hall.
Horne, J., and B. Hu. 2008. Estimation of cost efficiency of Australian universities. Mathematics and Computers in Simulation 78:266–275.
Hosseinzadeh Lotfi, F., G.R. Jahanshahloo, M. Khodabakhshi, M. Rostamy-Malkhlifeh, Z. Moghaddas, and M. Vaez-Ghasemi. 2013. A review of ranking models in data envelopment analysis. Journal of Applied Mathematics. https://doi.org/10.1155/2013/492421.
Jarwal, S.D., A.M. Brion, and M.L. King. 2009. Measuring research quality using the journal impact factor, citations and ‘ranked journals’: blunt instruments or inspired metrics? Journal of Higher Education Policy & Management 31:289–300.
Johnes, J. 2006. Data envelopment analysis and its application to the measurement of efficiency in higher education. Economics of Education Review 25:273–288.
Johnson, A.L., and L.F. McGinnis. 2009. The hyperbolic oriented efficiency measure as a remedy to infeasibility of super efficiency models. Journal of the Operational Research Society 60:1511–1517.
Jradi, S., and J. Ruggiero. 2019. Stochastic data envelopment analysis: a quantile regression approach to estimate the production function. European Journal of Operational Research 278:385–393.
Keeney, R.L., K.E. See, and D. von Winterfeldt. 2006. Evaluating academic programs: with applications to U.S. graduate decision science programs. Operations Research 54:813–828.
Kerpen, P. 2016. Praxisorientierte Data Envelopment Analysis. Wiesbaden: Springer.
Kieser, A. 2012. JOURQUAL: Der Gebrauch, nicht der Missbrauch, ist das Problem. Oder: Warum Wirtschaftsinformatik die beste deutschsprachige betriebswirtschaftliche Zeitschrift ist. Die Betriebswirtschaft 72:93–110.
Lampe, H.W., and D. Hilgers. 2015. Trajectories of efficiency measurement: a bibliometric analysis of DEA and SFA. European Journal of Operational Research 240:1–21.
Lisi, I.E. 2015. Translating environmental motivations into performance: the role of environmental performance measurement systems. Management Accounting Research 29:27–44.
Liu, H.-H., Y.-Y. Song, and G.-L. Yang. 2019. Cross-efficiency evaluation in data envelopment analysis based on prospect theory. European Journal of Operational Research 273:364–375.
Liu, J.S., L.Y. Lu, W.M. Lu, and B.J. Lin. 2013. A survey of DEA applications. Omega 40:893–902.
Marginson, S., and M. van der Wende. 2007. To rank or to be ranked: the impact of global rankings in higher education. Journal of Studies in International Education 11:206–329.
Maugeri, S., and J.L. Metzger. 2013. Public action: a question of performance? International Journal of Public Sector Performance Management 2:105–122.
Maulik, U., and S. Bandyopadhyay. 2002. Performance evaluation of some clustering algorithms and validity indices. IEEE Transactions on Pattern Analysis and Machine Intelligence 24:1650–1654.
Meeusen, W., and J. van den Broeck. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review 18:435–444.
Milligan, G.W., and M.C. Cooper. 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika 50:159–179.
Olivares, M., and A. Schenker-Wicki. 2012. The dynamics of productivity in the Swiss and German university sector: a non-parametric analysis that accounts for heterogeneous production. Zürich: University of Zurich.
Ondrich, J., and J. Ruggiero. 2002. Outlier detection in data envelopment analysis: an analysis of jackknifing. Journal of the Operational Research Society 53:342–346.
Peffers, K., T. Tuunanen, M.A. Rothenberger, and S. Chatterjee. 2008. A design science research methodology for information systems research. Journal of Management Information Systems 24:45–77.
Rassenhövel, S., and H. Dyckhoff. 2006. Die Relevanz von Drittmittelindikatoren bei der Beurteilung der Forschungsleistung im Hochschulbereich. In Fortschritt in den Wirtschaftswissenschaften: Wissenschaftstheoretische Grundlagen und exemplarische Anwendungen, ed. S. Zelewski, N. Akca, 85–112. Wiesbaden: Gabler.
Schaefer, J., and M. Clermont. 2018. Stochastic non-smooth envelopment of data for multi-dimensional output. Journal of Productivity Analysis 50:139–154.
Schrader, U., and T. Hennig-Thurau. 2009. VHB-JOURQUAL2: Method, results, and implication of the German academic association for business research’s journal ranking. Business Research 2:180–204.
Simar, L. 2003. Detecting outliers in frontier models: a simple approach. Journal of Productivity Analysis 20:391–424.
Simar, L., and P.W. Wilson. 1998. Sensitivity analysis of efficiency scores: how to bootstrap in nonparametric frontier models. Management Science 44:49–61.
Smirlis, Y.G., and D.K. Despotis. 2012. Relaxing the impact of extreme units in data envelopment analysis. International Journal of Information Technology & Decision Making 11:893–907.
Sousa, M.D.C.S.D., and B. Stošić. 2005. Technical efficiency of the Brazilian municipalities: correcting nonparametric frontier measurements for outliers. Journal of Productivity Analysis 24:157–181.
Speklé, R.F., and F.H.M. Verbeeten. 2014. The use of performance measurement systems in the public sector: effects on performance. Management Accounting Research 25:131–146.
Stolz, I., D.D. Hendel, and A.S. Horn. 2010. Ranking of rankings: Benchmarking twenty-five higher education ranking systems in Europe. Higher Education 60:507–528.
Thanassoulis, E. 1999. Setting achievement targets for school children. Education Economics 7:101–119.
Thanassoulis, E., M. Kortelainen, G. Johnes, and J. Johnes. 2011. Costs and efficiency of higher education institutions in England: a DEA analysis. Journal of the Operational Research Society 62:1282–1297.
Tone, K. 2001. A slacks-based measure of efficiency in data envelopment analysis. European Journal of Operational Research 130:498–509.
Tran, N.A., G. Shively, and P. Preckel. 2008. A new method for detecting outliers in data envelopment analysis. Applied Economics Letters 17:313–316.
Tunger, D., M. Clermont, and A. Meier. 2018. Altmetrics: state of the art and a look into the future. In Scientometrics, ed. M. Jibu, Y. Osabe, 123–134. London: IntechOpen.
Usher, A., and M. Savino. 2006. A world of difference: a global survey of university league tables. Toronto: Educational Policy Institute.
Wilson, P.W. 1995. Detecting influential observations in data envelopment analysis. Journal of Productivity Analysis 6:27–45.
de Witte, K., and R.C. Marques. 2010. Influential observations in frontier models, a robust non-oriented approach to the water sector. Annals of Operations Research 181:377–392.
Wojcik, V., H. Dyckhoff, and M. Clermont. 2018. Is Data Envelopment Analysis a suitable tool for performance measurement and benchmarking in non-production contexts? Business Research. https://doi.org/10.1007/s40685-018-0077-z.
Worthington, A.C., and B.L. Lee. 2008. Efficiency, technology and productivity change in Australian universities: 1998–2003. Economics of Education Review 27:285–298.
Yang, Z., X. Wang, and D. Sun. 2010. Using the bootstrap method to detect influential DMUs in Data Envelopment Analysis. Annals of Operations Research 173:89–103.
Zhu, J. 2014. Quantitative models for performance evaluation and benchmarking: Data Envelopment Analysis with spreadsheets, 3rd edn. Berlin: Springer.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical standards
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Not applicable, since there are no individual participants included in the study.
About this article
Cite this article
Clermont, M., Schaefer, J. Identification of Outliers in Data Envelopment Analysis. Schmalenbach Bus Rev 71, 475–496 (2019). https://doi.org/10.1007/s41464-019-00078-7
DOI: https://doi.org/10.1007/s41464-019-00078-7
Keywords
- Data Envelopment Analysis
- Outlier detection
- Efficiency analysis
- Performance measurement
- Cluster analysis