Sampling business process event logs using graph‐based ranking model
Concurrency and Computation: Practice and Experience ( IF 2 ) Pub Date : 2020-10-02 , DOI: 10.1002/cpe.5974
Cong Liu 1 , Yulong Pei 2 , Long Cheng 3 , Qingtian Zeng 4 , Hua Duan 4

Modern information systems continuously collect and store large volumes of business process event logs. Analyzing these logs can provide valuable insights for business process re-engineering and enhancement. Process discovery, one of the most challenging event log analysis techniques, aims to discover a business process model from an event log. Many process discovery approaches have been proposed over the past two decades; however, most of them suffer from efficiency problems when dealing with large-scale event logs. Motivated by PageRank, we propose LogRank, a graph-based ranking model for event log sampling. LogRank can sample a large-scale event log down to a smaller size that existing discovery approaches can handle efficiently. To support real-life applications, we instantiate the LogRank model for two typical types of event logs: simple event logs and lifecycle event logs. To quantify the quality of a sample log with respect to the original one, we introduce a general evaluation framework that can be instantiated for different quality metrics. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Through experiments with both synthetic and real-life event logs, we demonstrate that the proposed LogRank-based sampling approach provides an effective means of improving process discovery efficiency while guaranteeing high quality of the discovered models.
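
The abstract does not spell out how LogRank builds or ranks its graph, so the following is only a rough sketch of the general idea of PageRank-style trace sampling, not the authors' algorithm. It assumes a simple event log given as lists of activity names, collapses identical traces into variants, links variants by the Jaccard similarity of their directly-follows pairs, and keeps the top-ranked variants. The function names (`directly_follows`, `rank_variants`, `sample_log`), the similarity measure, and the parameter defaults are all hypothetical choices made for illustration.

```python
def directly_follows(trace):
    """Return the set of directly-follows pairs (a, b) in one trace."""
    return set(zip(trace, trace[1:]))


def rank_variants(variants, damping=0.85, iterations=50):
    """Score trace variants with a PageRank-style power iteration.

    Assumption (not from the paper): the edge weight between two variants
    is the Jaccard similarity of their directly-follows sets.
    """
    n = len(variants)
    relations = [directly_follows(v) for v in variants]

    # Symmetric weighted similarity graph between variants.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = relations[i] | relations[j]
            if i != j and union:
                weights[i][j] = len(relations[i] & relations[j]) / len(union)

    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Variants with no outgoing weight pass nothing on; their
            # mass is simply dropped, which keeps the sketch short.
            incoming = sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def sample_log(log, k):
    """Return the k highest-ranked trace variants of a simple event log."""
    # Collapse duplicate traces into variants, preserving first-seen order.
    variants = list(dict.fromkeys(tuple(trace) for trace in log))
    scores = rank_variants(variants)
    ranked = sorted(zip(variants, scores), key=lambda pair: -pair[1])
    return [list(variant) for variant, _ in ranked[:k]]


if __name__ == "__main__":
    log = [
        ["a", "b", "c", "d"],
        ["a", "b", "d"],
        ["a", "c", "d"],
        ["x", "y"],
    ]
    print(sample_log(log, 2))
```

In this toy log the first trace shares behavior with two others, so it ranks highest, while the unrelated trace `["x", "y"]` ranks lowest. A fuller treatment would also need to account for trace frequencies and lifecycle information, which this sketch ignores.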

Updated: 2020-10-02