Identification of Outliers in Data Envelopment Analysis

An Approach Using Structure-detecting Statistical Procedures

Original Article, Schmalenbach Business Review

Abstract

Data Envelopment Analysis (DEA) is a deterministic method for aggregating multidimensional measures and subsequently analyzing efficiency. Because of this inherent determinism, however, it is sensitive to outliers in datasets. Existing methods for identifying such outliers have two main disadvantages. First, from a conceptual point of view, a uniform definition of an outlier is missing. Second, each method has technical disadvantages. For instance, the user has to set arbitrary thresholds, such as the efficiency value above which a decision making unit is regarded as an outlier. This paper first presents a definition of outliers that explicitly takes the specifics of DEA into account. Based on this definition, it introduces an approach for identifying outliers in DEA whose algorithm explicitly addresses these technical disadvantages. The plausibility of the approach is validated using empirical examples from performance measurement at the university level.
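The sensitivity to outliers and the arbitrary-threshold problem can be made concrete with a minimal sketch. It is not the algorithm developed in the paper. It assumes the simplest possible setting (one input, one output, constant returns to scale), in which the DEA efficiency score reduces to a ratio comparison and no linear program is needed. The data, the function names, and the cut-off `THRESHOLD` are hypothetical and chosen only for illustration.

```python
def ccr_efficiency(inputs, outputs):
    """Efficiency under constant returns to scale for one input and one output.

    In this special case each DMU's score is its output/input ratio divided
    by the best ratio in the dataset.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def super_efficiency(inputs, outputs):
    """Super-efficiency score: each DMU is compared only with the other DMUs.

    Scores above 1 indicate how far a DMU lies beyond the frontier that the
    remaining DMUs span.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    scores = []
    for i, r in enumerate(ratios):
        best_other = max(ratios[:i] + ratios[i + 1:])
        scores.append(r / best_other)
    return scores


# Hypothetical data: four ordinary DMUs plus one suspicious unit (the last one).
inputs = [10.0, 12.0, 8.0, 15.0, 5.0]
outputs = [20.0, 21.0, 16.0, 27.0, 40.0]

with_outlier = ccr_efficiency(inputs, outputs)
without_outlier = ccr_efficiency(inputs[:-1], outputs[:-1])
sup = super_efficiency(inputs, outputs)

# Arbitrary, user-chosen cut-off: the kind of setting the abstract criticizes.
THRESHOLD = 2.0
flagged = [i for i, s in enumerate(sup) if s > THRESHOLD]

print("scores with outlier:   ", with_outlier)
print("scores without outlier:", without_outlier)
print("super-efficiency:      ", sup)
print("flagged DMUs:          ", flagged)
```

In this toy dataset a single extreme unit moves the first DMU from fully efficient to a score of 0.25, and whether the extreme unit is flagged depends entirely on the chosen value of `THRESHOLD`.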

Notes

  1. It should be mentioned that there are also models that extend DEA with stochastic components (e.g. Jradi and Ruggiero 2019). However, such combined models are not covered in this paper.

  2. For a detailed overview of DEA applications see, e.g., Liu et al. (2013).

  3. Liu et al. (2019) recently proposed a new cross-efficiency evaluation based on prospect theory and provide an empirical example covering selected Chinese universities.

  4. In general, the types of outliers detailed in the following may also occur in the case of a linear data envelope; however, this cannot be represented in a two-dimensional example, which is why a convex envelope has been selected here.

  5. A DMU is called weakly efficient if it cannot be improved in at least one of the considered inputs or outputs, but there is still room for improvement in at least one of the other inputs or outputs.

  6. In exceptional cases, no super-efficiency value exists under either input orientation or output orientation. For instance, in the case of output orientation, a DMU may realize an extremely high output with extremely little input; such a DMU is then directly classified as an outlier.

  7. Recently, Agasisti et al. (2019) combined DEA with methods of multi-criteria evaluation to measure and evaluate the efficiency of European education systems.

  8. The data surveys attract great interest within the scientific world (e.g. Usher and Savino 2006; Frey 2007; Marginson and van der Welde 2007; Jarwal et al. 2009; Stolz et al. 2010; Kieser 2012). For a comprehensive, critical discussion of the CHE approach, especially from the perspective of German business administration, see Clermont and Dirksen (2016).
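The exceptional case described in note 6 can be sketched for the simplest setting. The sketch assumes one input, one output, and a convex envelope (variable returns to scale), where the feasibility of the super-efficiency model can be checked without solving a linear program: the input-oriented model needs some other DMU with at least as much output, and the output-oriented model needs some other DMU with at most as much input. The function name and the data are hypothetical.

```python
def super_efficiency_feasible(inputs, outputs, o):
    """Return (input_oriented_ok, output_oriented_ok) for DMU `o`.

    Assumes one input, one output, and variable returns to scale, so the
    reference set consists of convex combinations of all DMUs except `o`.
    """
    other_x = [x for j, x in enumerate(inputs) if j != o]
    other_y = [y for j, y in enumerate(outputs) if j != o]
    # Input orientation: a convex combination of the others must reach y_o.
    input_ok = max(other_y) >= outputs[o]
    # Output orientation: a convex combination of the others must use at most x_o.
    output_ok = min(other_x) <= inputs[o]
    return input_ok, output_ok


# Hypothetical data: the last DMU has the smallest input and the largest output.
inputs = [10.0, 12.0, 8.0, 15.0, 5.0]
outputs = [20.0, 21.0, 16.0, 27.0, 40.0]

feasibility = [super_efficiency_feasible(inputs, outputs, o)
               for o in range(len(inputs))]

# DMUs without a super-efficiency value in either orientation are treated
# as outliers directly, following note 6.
no_value = [o for o, f in enumerate(feasibility) if f == (False, False)]

print(feasibility)
print("no super-efficiency value in either orientation:", no_value)
```

With several inputs or outputs the same question requires checking the feasibility of the corresponding linear program for each DMU.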

References

  • Agasisti, T., G. Munda, and R. Hippe. 2019. Measuring the efficiency of European education systems by combining data envelopment analysis and multiple-criteria evaluation. Journal of Productivity Analysis 51:105–124.

  • Agasisti, T., and C. Pérez-Esparrells. 2010. Comparing efficiency in a cross-country perspective: the case of Italian and Spanish state universities. Higher Education 59:85–103.

  • Aigner, D., C.A.K. Lovell, and P. Schmidt. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics 6:21–37.

  • Albers, S. 2015. What drives publication productivity in German business faculties? Schmalenbach Business Review 67:6–33.

  • Andersen, P., and N.C. Petersen. 1993. A procedure for ranking efficient units in Data Envelopment Analysis. Management Science 39:1261–1264.

  • Aragon, Y., A. Daouia, and C. Thomas-Agnan. 2005. Nonparametric frontier estimation: a conditional quantile-based approach. Econometric Theory 21:358–389.

  • Avkiran, N.K. 2001. Investigating technical and scale efficiencies of Australian universities through data envelopment analysis. Socio-Economic Planning Sciences 35:57–80.

  • Bahari, A.R., and A. Emrouznejad. 2014. Influential DMUs and outlier detection in Data Envelopment Analysis with an application to health care. Annals of Operations Research 223:95–108.

  • Ball, R., B. Mittermaier, and D. Tunger. 2009. Creation of journal-based publication profiles of scientific institutions: a methodology for the interdisciplinary comparison of scientific researcher based on the J‑factor. Scientometrics 81:381–392.

  • Banker, R.D., and H. Chang. 2006. The super-efficiency procedure for outlier identification, not for ranking efficient units. European Journal of Operational Research 175:1311–1320.

  • Banker, R.D., A. Charnes, and W.W. Cooper. 1984. Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science 30:1078–1092.

  • Banker, R.D., S. Das, and S.M. Datar. 1989. Analysis of cost variances for management control in hospitals. Research in Governmental and Nonprofit Accounting 5:269–291.

  • Barnett, V., and T. Lewis. 1984. Outliers in statistical data, 2nd edn., Chichester: John Wiley & Sons.

  • Ben-Gal, I. 2005. Outlier detection. In Data mining and knowledge discovery handbook, 2nd edn., ed. O. Maimon, L. Rokach, 117–130. New York: Springer.

  • Bogetoft, P., and L. Otto. 2011. Benchmarking with DEA, SFA, and R. New York: Springer.

  • Bolli, T., and M. Farsi. 2015. The dynamics of productivity in Swiss universities. Journal of Productivity Analysis 44:21–38.

  • Bonesrønning, H., and J. Rattsø. 1994. Efficiency variation among the Norwegian high schools: consequences of equalization policy. Economics of Education Review 13:289–304.

  • Bourne, M., A. Neely, J. Mills, and K. Platts. 2003. Implementing performance measurement systems: a literature overview. International Journal of Business Performance Management 5:1–24.

  • Calinski, T., and J. Harabasz. 1974. A dendrite method for cluster analysis. Communications in Statistics: Theory and Methods 3:1–27.

  • Cazals, C., J.-P. Florens, and L. Simar. 2002. Nonparametric frontier estimation: a robust approach. Journal of Econometrics 106:1–25.

  • Charnes, A., W.W. Cooper, B. Golany, L. Seiford, and J. Stutz. 1985. Foundations of data envelopment analysis for Pareto-Koopmans efficient empirical production functions. Journal of Econometrics 30:91–107.

  • Charnes, A., W.W. Cooper, and E. Rhodes. 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2:429–444.

  • Clermont, M. 2016. Effectiveness and efficiency of research in Germany over time: an analysis of German business schools between 2001 and 2009. Scientometrics 108:1347–1381.

  • Clermont, M., and A. Dirksen. 2016. The measurement, evaluation, and publication of performance in higher education: An analysis of the CHE research ranking of business schools in Germany from an accounting perspective. Public Administration Quarterly 40:341–386.

  • Clermont, M., A. Dirksen, and H. Dyckhoff. 2015. Returns to scale of business administration research in Germany. Scientometrics 103:583–614.

  • Daghbashyan, Z., E. Deiaco, and M. McKelvey. 2014. How and why does cost efficiency of universities differ across European countries? An explorative attempt using new microdata. In Knowledge, diversity and performance in European higher education: A changing landscape, ed. A. Bonaccorsi, 267–291. Cheltenham & Northampton: Edward Elgar.

  • Dellnitz, A. 2016. RTS-mavericks in Data Envelopment Analysis. Operations Research Letters 44:622–624.

  • Dilger, A., and H. Müller. 2012. Ein Forschungsleistungsranking auf der Grundlage von Google Scholar. Zeitschrift für Betriebswirtschaft 82:1089–1105.

  • Doyle, J., and R. Green. 1994. Efficiency and cross-efficiency in DEA: derivations, meanings and use. Journal of the Operational Research Society 45:567–578.

  • Dyckhoff, H., and H. Ahn. 2010. Verallgemeinerte DEA-Modelle zur Performanceanalyse. Zeitschrift für Betriebswirtschaft 80:1249–1276.

  • Dyckhoff, H., and K. Allen. 1999. Theoretische Begründung einer Effizienzanalyse mittels Data Envelopment Analysis (DEA). Zeitschrift für betriebswirtschaftliche Forschung 51:411–436.

  • Dyckhoff, H., H. Ahn, S. Rassenhövel, and K. Sandfort. 2008. Skalenerträge der Forschung wirtschaftswissenschaftlicher Fachbereiche: Empirische Ergebnisse und ihre Interpretation. Hochschulmanagement 3:62–66.

  • Dyckhoff, H., S. Rassenhövel, and K. Sandfort. 2009. Empirische Produktionsfunktion betriebswirtschaftlicher Forschung: Eine Analyse der Daten des Centrums für Hochschulentwicklung. Zeitschrift für betriebswirtschaftliche Forschung 61:22–56.

  • Emrouznejad, A., and G.-L. Yang. 2018. A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016. Socio-Economic Planning Sciences 61:4–8.

  • Fandel, G. 2007. On the performance of universities in North-Rhine-Westphalia, Germany: Government’s redistribution of funds judged using DEA efficiency measures. European Journal of Operational Research 176:521–533.

  • Färe, R., S. Grosskopf, B. Lindgren, and P. Roos. 1994. Productivity development in Swedish hospitals: a Malmquist output index approach. In Data envelopment analysis: theory, methodology, and applications, ed. A. Charnes, W.W. Cooper, A.Y. Lewin, and L.M. Seiford, 253–272. New York: Springer.

  • Färe, R., S. Grosskopf, and W. Weber. 1989. Measuring school district performance. Public Finance Review 17:409–428.

  • Flegg, A.T., and D.O. Allen. 2007. Does expansion cause congestion? The case of the older British universities, 1994–2004. Education Economics 15:75–102.

  • Fouchet, R., and M. Guenoun. 2007. Performance management in intermunicipal authorities. International Journal of Public Sector Performance Management 1:62–82.

  • Franco-Santos, M., L. Lucianetti, and M. Bourne. 2012. Contemporary performance measurement systems: a review of their consequences and a framework for research. Management Accounting Research 23:79–119.

  • Frey, B.S. 2007. Evaluierungen, Evaluierungen… Evaluitis. Perspektiven der Wirtschaftspolitik 8:207–220.

  • García-Aracil, A. 2013. Understanding productivity changes in public universities: evidence from Spain. Research Evaluation 22:351–368.

  • Golany, B., and Y. Roll. 1989. An application procedure for DEA. Omega 17:237–250.

  • Halkos, G.E., and N.G. Tzeremes. 2011. Measuring economic journals’ citation efficiency: a data envelopment analysis approach. Scientometrics 88:979–1001.

  • Hammerschmidt, M., R. Wilken, and M. Staat. 2009. Methoden zur Lösung grundlegender Probleme der Datenqualität in DEA-basierten Effizienzanalysen. Die Betriebswirtschaft 69:289–309.

  • Hawkins, D. 1980. Identification of outliers. London: Chapman and Hall.

  • Horne, J., and B. Hu. 2008. Estimation of cost efficiency of Australian universities. Mathematics and Computers in Simulation 78:266–275.

  • Hosseinzadeh Lotfi, F., G.R. Jahanshahloo, M. Khodabakhshi, M. Rostamy-Malkhlifeh, Z. Moghaddas, and M. Vaez-Ghasemi. 2013. A review of ranking models in data envelopment analysis. Journal of Applied Mathematics. https://doi.org/10.1155/2013/492421.

  • Jarwal, S.D., A.M. Brion, and M.L. King. 2009. Measuring research quality using the journal impact factor, citations and ‘ranked journals’: blunt instruments or inspired metrics? Journal of Higher Education Policy & Management 31:289–300.

  • Johnes, J. 2006. Data envelopment analysis and its application to the measurement of efficiency in higher education. Economics of Education Review 25:273–288.

  • Johnson, A.L., and L.F. McGinnis. 2009. The hyperbolic oriented efficiency measure as a remedy to infeasibility of super efficiency models. Journal of the Operational Research Society 60:1511–1517.

  • Jradi, S., and J. Ruggiero. 2019. Stochastic data envelopment analysis: a quantile regression approach to estimate the production function. European Journal of Operational Research 278:385–393.

  • Keeney, R.L., K.E. See, and D. von Winterfeldt. 2006. Evaluating academic programs: with applications to U.S. graduate decision science programs. Operations Research 54:813–828.

  • Kerpen, P. 2016. Praxisorientierte Data Envelopment Analysis. Wiesbaden: Springer.

  • Kieser, A. 2012. JOURQUAL: Der Gebrauch, nicht der Missbrauch, ist das Problem. Oder: Warum Wirtschaftsinformatik die beste deutschsprachige betriebswirtschaftliche Zeitschrift ist. Die Betriebswirtschaft 72:93–110.

  • Lampe, H.W., and D. Hilgers. 2015. Trajectories of efficiency measurement: a bibliometric analysis of DEA and SFA. European Journal of Operational Research 240:1–21.

  • Lisi, I.E. 2015. Translating environmental motivations into performance: the role of environmental performance measurement systems. Management Accounting Research 29:27–44.

  • Liu, H.-H., Y.-Y. Song, and G.-L. Yang. 2019. Cross-efficiency evaluation in data envelopment analysis based on prospect theory. European Journal of Operational Research 273:364–375.

  • Liu, J.S., L.Y. Lu, W.M. Lu, and B.J. Lin. 2013. A survey of DEA applications. Omega 40:893–902.

  • Marginson, S., and M. van der Welde. 2007. To rank or to be ranked: the impact of global rankings in higher education. Journal of Studies in International Education 11:206–329.

  • Maugeri, S., and J.L. Metzger. 2013. Public action: a question of performance? International Journal of Public Sector Performance Management 2:105–122.

  • Maulik, U., and S. Bandyopadhyay. 2002. Performance evaluation of some clustering algorithms and validity indices. IEEE Transactions on Pattern Analysis and Machine Intelligence 24:1650–1654.

  • Meeusen, W., and J. van den Broeck. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review 18:435–444.

  • Milligan, G.W., and M.C. Cooper. 1985. An examination of procedures for determining the number of clusters in a data set. Psychometrika 50:159–179.

  • Olivares, M., and A. Schenker-Wicki. 2012. The dynamics of productivity in the Swiss and German university sector: a non-parametric analysis that accounts for heterogeneous production. Zürich: University of Zurich.

  • Ondrich, J., and J. Ruggiero. 2002. Outlier detection in data envelopment analysis: an analysis of jackknifing. Journal of the Operational Research Society 53:342–346.

  • Peffers, K., T. Tuunanen, M.A. Rothenberger, and S. Chatterjee. 2008. A design science research methodology for information systems research. Journal of Management Information Systems 24:45–77.

  • Rassenhövel, S., and H. Dyckhoff. 2006. Die Relevanz von Drittmittelindikatoren bei der Beurteilung der Forschungsleistung im Hochschulbereich. In Fortschritt in den Wirtschaftswissenschaften: Wissenschaftstheoretische Grundlagen und exemplarische Anwendungen, ed. S. Zelewski, N. Akca, 85–112. Wiesbaden: Gabler.

  • Schaefer, J., and M. Clermont. 2018. Stochastic non-smooth envelopment of data for multi-dimensional output. Journal of Productivity Analysis 50:139–154.

  • Schrader, U., and T. Hennig-Thurau. 2009. VHB-JOURQUAL2: Method, results, and implication of the German academic association for business research’s journal ranking. Business Research 2:180–204.

  • Simar, L. 2003. Detecting outliers in frontier models: a simple approach. Journal of Productivity Analysis 20:391–424.

  • Simar, L., and P.W. Wilson. 1998. Sensitivity analysis of efficiency scores: how to bootstrap in nonparametric frontier models. Management Science 44:49–61.

  • Smirlis, Y.G., and D.K. Despotis. 2012. Relaxing the impact of extreme units in data envelopment analysis. International Journal of Information Technology & Decision Making 11:893–907.

  • Sousa, M.D.C.S.D., and B. Stošić. 2005. Technical efficiency of the Brazilian municipalities: correcting nonparametric frontier measurements for outliers. Journal of Productivity Analysis 24:157–181.

  • Speklé, R.F., and F.H.M. Verbeeten. 2014. The use of performance measurement systems in the public sector: effects on performance. Management Accounting Research 25:131–146.

  • Stolz, I., D.D. Hendel, and A.S. Horn. 2010. Ranking of rankings: Benchmarking twenty-five higher education ranking systems in Europe. Higher Education 60:507–528.

  • Thanassoulis, E. 1999. Setting achievement targets for school children. Education Economics 7:101–119.

  • Thanassoulis, E., M. Kortelainen, G. Johnes, and J. Johnes. 2011. Costs and efficiency of higher education institutions in England: a DEA analysis. Journal of the Operational Research Society 62:1282–1297.

  • Tone, K. 2001. A slacks-based measure of efficiency in data envelopment analysis. European Journal of Operational Research 130:498–509.

  • Tran, N.A., G. Shively, and P. Preckel. 2008. A new method for detecting outliers in data envelopment analysis. Applied Economics Letters 17:313–316.

  • Tunger, D., M. Clermont, and A. Meier. 2018. Altmetrics: state of the art and a look into the future. In Scientometrics, ed. M. Jibu, Y. Osabe, 123–134. London: IntechOpen.

  • Usher, A., and M. Savino. 2006. A world of difference: a global survey of university league tables. Toronto: Educational Policy Institute.

  • Wilson, P.W. 1995. Detecting influential observations in data envelopment analysis. Journal of Productivity Analysis 6:27–45.

  • de Witte, K., and R.C. Marques. 2010. Influential observations in frontier models, a robust non-oriented approach to the water sector. Annals of Operations Research 181:377–392.

  • Wojcik, V., H. Dyckhoff, and M. Clermont. 2018. Is Data Envelopment Analysis a suitable tool for performance measurement and benchmarking in non-production contexts? Business Research. https://doi.org/10.1007/s40685-018-0077-z.

  • Worthington, A.C., and B.L. Lee. 2008. Efficiency, technology and productivity change in Australian universities: 1998–2003. Economics of Education Review 27:285–298.

  • Yang, Z., X. Wang, and D. Sun. 2010. Using the bootstrap method to detect influential DMUs in Data Envelopment Analysis. Annals of Operations Research 173:89–103.

  • Zhu, J. 2014. Quantitative models for performance evaluation and benchmarking, Data Envelopment Analysis with spreadsheets, 3rd edn., Berlin: Springer.

Author information

Corresponding author

Correspondence to Marcel Clermont.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical standards

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Not applicable, since there are no individual participants included in the study.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Clermont, M., Schaefer, J. Identification of Outliers in Data Envelopment Analysis. Schmalenbach Bus Rev 71, 475–496 (2019). https://doi.org/10.1007/s41464-019-00078-7

  • DOI: https://doi.org/10.1007/s41464-019-00078-7
