Towards Confirmatory Process Discovery: Making Assertions About the Underlying System

  • Research Paper
Business & Information Systems Engineering

Abstract

The focus in the field of process mining, and process discovery in particular, has thus far been on exploring and describing event data by means of models. Since the obtained models are often based directly on a sample of event data, the question of whether they also apply to the real process typically remains unanswered. As the underlying process is unknown in real life, there is a need for unbiased estimators to assess the system-quality of a discovered model and subsequently make assertions about the process. This paper describes and discusses an experiment that analyzes whether existing fitness, precision and generalization metrics can be used as unbiased estimators of system fitness and system precision. The results show that important biases exist, which currently makes it nearly impossible to objectively measure the ability of a model to represent the system.

Notes

  1. In practice, these costs can be configured for each activity type individually, to reflect that certain deviations should be penalized more than others.

  2. Optimal alignments are the alignments for which the cost is minimized.

  3. The types of noise used have been defined based on existing literature (Maruster 2003). However, for future experiments, a more elaborate reasoning for what qualifies as realistic noise is necessary. For example, the swapping of random activities is not really a realistic event. A detailed discussion of what can be regarded as noise is out of the scope of this paper.

References

  • Adriansyah A, Munoz-Gama J, Carmona J, van Dongen BF, van der Aalst WM (2015) Measuring precision of modeled behavior. Inf Syst e-Bus Manag 13(1):37–67

  • Agrawal R, Gunopulos D, Leymann F (1998) Mining process models from workflow logs. In: Schek HJ, Saltor F, Ramos I, Alonso G (eds) Adv Database Technol - EDBT ’98, vol 1377. Springer, Berlin, pp 467–483

  • Buijs JCAM (2014) Flexible evolutionary algorithms for mining structured process models. Ph.D. thesis, Technische Universiteit Eindhoven, Eindhoven

  • Buijs JCAM, van Dongen BF, van der Aalst WMP (2012) On the role of fitness, precision, generalization and simplicity in process discovery. In: On the move to meaningful internet systems: OTM 2012, Springer, Berlin, pp 305–322

  • Cook JE, Wolf AL (1995) Automating process discovery through event-data analysis. In: 17th international conference on software engineering, 1995. ICSE 1995, IEEE, pp 73–73

  • Datta A (1998) Automating the discovery of as-is business process models: probabilistic and algorithmic approaches. Inf Syst Res 9(3):275–301

  • Erickson B, Nosanchuk T (1992) Understanding data. McGraw-Hill Education, New York

  • Gelman A (2004) Exploratory data analysis for complex models. J Comput Gr Stat 13(4):755–779

  • Goedertier S, Martens D, Vanthienen J, Baesens B (2009) Robust process discovery with artificial negative events. J Mach Learn Res 10:1305–1340

  • Greco G, Guzzo A, Pontieri L, Sacca D (2006) Discovering expressive process models by clustering log traces. IEEE Trans Knowl Data Eng 18(8):1010–1027

  • Janssenswillen G, Depaire B, Jouck T (2016) Calculating the number of unique paths in a block-structured process model. In: Proceedings of the international workshop on algorithms and theories for the analysis of event data 2016

  • Janssenswillen G, Donders N, Jouck T, Depaire B (2017) A comparative study of existing quality measures for process discovery. Inf Syst 71:1–15

  • Jouck T, Depaire B (2016) Generating artificial data for empirical analysis of process discovery algorithms: a process tree and log generator. Technical report, Universiteit Hasselt, Hasselt

  • Kunze M, Luebbe A, Weidlich M, Weske M (2011) Towards understanding process modeling-the case of the BPM academic initiative. In: International workshop on business process modeling notation, Springer, Berlin, pp 44–58

  • Leemans SJJ, Fahland D, van der Aalst WMP (2013) Discovering block-structured process models from event logs-a constructive approach. Appl Theory Petri Nets Concurr. Springer, Berlin, pp 311–329

  • Maruster L (2003) A machine learning approach to understand business processes. Technische Universiteit Eindhoven

  • de Medeiros AKA, Weijters AJ, van der Aalst WMP (2007) Genetic process mining: an experimental evaluation. Data Min Knowl Discov 14(2):245–304

  • de Medeiros AKA (2006) Genetic process mining. Ph.D. thesis, Technische Universiteit Eindhoven, Eindhoven

  • Muñoz-Gama J, Carmona J (2010) A fresh look at precision in process conformance. In: Business process management. vol 6336, Springer, Hoboken, pp 211–226

  • Rogge-Solti A, Senderovich A, Weidlich M, Mendling J, Gal A (2016) In log and model we trust? In: EMISA, pp 91–94

  • Rozinat A, De Medeiros AA, Günther CW, Weijters A, Van der Aalst WM (2007) Towards an evaluation framework for process mining algorithms, vol 123

  • Rozinat A, van der Aalst WMP (2008) Conformance checking of processes based on monitoring real behavior. Inf Syst 33(1):64–95

  • Tukey JW (1977) Exploratory data analysis, vol 2. Addison-Wesley, Reading, MA

  • Tukey JW, Wilk MB (1966) Data analysis and statistics: an expository overview. In: Proceedings of the November 7-10, 1966, fall joint computer conference, ACM, New York, pp 695–709

  • van der Aalst WMP, Adriansyah A, van Dongen B (2012) Replaying history on process models for conformance checking and performance analysis. Wiley Interdiscip Rev Data Min Knowl Discov 2(2):182–192

  • van der Aalst WMP (2013) Mediating between modeled and observed behavior: the quest for the “right” process. In: IEEE international conference on research challenges in information science (RCIS 2013), pp 31–43

  • van der Aalst WMP (2016) Process mining: data science in action. Springer, Berlin

  • van der Werf JME, van Dongen BF, Hurkens CA, Serebrenik A (2008) Process discovery using integer linear programming. In: International conference on applications and theory of petri nets. Springer, Berlin, pp 368–387

  • van Dongen BF, Carmona J, Chatain T (2016) A unified approach for measuring precision and generalization based on anti-alignments. In: International conference on business process management. Springer, Cham

  • vanden Broucke SKLM, De Weerdt J, Vanthienen J, Baesens B (2014) Determining process model precision and generalization with weighted artificial negative events. IEEE Trans Knowl Data Eng 26(8):1877–1889

  • Weidlich M, Polyvyanyy A, Desai N, Mendling J, Weske M (2011) Process compliance analysis based on behavioural profiles. Inf Syst 36(7):1009–1025

  • Weijters AJMM, van Der Aalst WMP, De Medeiros AKA (2006) Process mining with the heuristics miner-algorithm. Technische Universiteit Eindhoven, Tech. Rep. WP vol 166, pp 1–34

Acknowledgements

The computational resources and services used in this work for both process discovery and process conformance tasks were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government.

Author information

Corresponding author

Correspondence to Gert Janssenswillen.

Additional information

Accepted after three revisions by Jan Mendling.

About this article

Cite this article

Janssenswillen, G., Depaire, B. Towards Confirmatory Process Discovery: Making Assertions About the Underlying System. Bus Inf Syst Eng 61, 713–728 (2019). https://doi.org/10.1007/s12599-018-0567-8

  • DOI: https://doi.org/10.1007/s12599-018-0567-8
