Integrated detection and localization of concept drifts in process mining with batch and stream trace clustering support
Data & Knowledge Engineering (IF 2.5), Pub Date: 2023-12-02, DOI: 10.1016/j.datak.2023.102253
Rafael Gaspar de Sousa, Antonio Carlos Meira Neto, Marcelo Fantinato, Sarajane Marques Peres, Hajo Alexander Reijers

Process mining can help organizations by extracting knowledge from event logs. However, process mining techniques often assume that business processes are stationary, whereas actual business processes are constantly subject to change because of the complexity of organizations and their external environment. Thus, addressing process changes over time, known as concept drifts, allows for a better understanding of process behavior and can provide a competitive edge for organizations, especially in an online data stream scenario. Current approaches to handling process concept drift focus primarily on detecting and locating concept drifts, often through an integrated, albeit offline, approach. However, some of these integrated approaches rely on complex data structures related to tree-based process models, usually discovered through algorithms whose results are influenced by specific heuristic rules. Moreover, most of the proposed approaches have not been tested on the public event logs labeled with true concept drifts that are commonly used as benchmarks, making comparative analysis difficult. In this article, we propose an online approach to detect and localize concept drifts in an integrated way using batch and stream trace clustering support. In our approach, cluster models provide input information for both the concept drift detection and the localization methods. Each cluster abstracts a behavior profile underlying the process and reveals descriptive information about the discovered concept drifts. Experiments with benchmark synthetic event logs with different control-flow changes, as well as with real-world event logs, showed that our approach, when relying on the same clustering model, is competitive with baseline concept drift detection methods. In addition, the experiments showed that our approach is able to correctly locate the detected concept drifts and allows such concept drifts to be analyzed through different process behavior profiles.
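
The abstract states that cluster models feed both drift detection and drift localization. The toy sketch below illustrates that general idea only and is not the authors' algorithm: it assumes each trace has already been assigned a cluster label by some trace clustering step, compares the cluster distribution of adjacent windows, and reports which clusters gained or lost share when the difference is large. The function names, the window size, the threshold, and the use of total variation distance are all assumptions made for illustration.

```python
from collections import Counter


def cluster_shares(labels):
    """Return the relative frequency of each cluster label in a window."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def detect_drifts(labels, window=50, threshold=0.5):
    """Compare adjacent non-overlapping windows of cluster labels.

    Returns a list of (trace_index, changes) pairs, where trace_index is
    the first trace of the window in which the change was observed and
    changes maps each cluster label to its change in share. The window
    size and threshold are illustrative assumptions, not tuned values.
    """
    drifts = []
    for start in range(window, len(labels) - window + 1, window):
        reference = cluster_shares(labels[start - window:start])
        current = cluster_shares(labels[start:start + window])
        if total_variation(reference, current) > threshold:
            keys = set(reference) | set(current)
            changes = {
                k: current.get(k, 0.0) - reference.get(k, 0.0) for k in keys
            }
            drifts.append((start, changes))
    return drifts


# Hypothetical stream of cluster labels: behavior profile "A" is
# replaced by profile "B" after the first 100 traces.
labels = ["A"] * 100 + ["B"] * 100
drifts = detect_drifts(labels)
print(drifts)
```

In this sketch the per-cluster changes play the descriptive role: a drift is reported together with the behavior profiles whose share rose or fell, which is one simple way cluster output could serve detection and localization at once.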




Updated: 2023-12-02