Discovery of effective infrequent sequences based on maximum probability path
Connection Science ( IF 5.3 ) Pub Date : 2021-07-19 , DOI: 10.1080/09540091.2021.1951667
Ke Lu 1 , Xianwen Fang 1 , Na Fang 1 , Esther Asare 1

Process discovery usually analyses frequent behaviour in event logs to gain an intuitive understanding of processes. However, some infrequent behaviours are effective: they help to improve real-life business processes. Most existing studies either ignore them or treat them as harmful. To distinguish effective infrequent sequences from noisy activities, this paper proposes an algorithm that analyses the distribution states of activities and the strong transfer relationships between behaviours based on maximum probability paths. The algorithm divides episodic traces into two categories: harmful episodes (noisy activities) and useful episodes (effective sequences). First, using conditional probability entropy, the infrequent logs are pre-processed to remove individual noisy activities that are distributed extremely irregularly across the traces. Effective sequences are then extracted from the logs based on the state transfer information of the activities. The algorithm is implemented with PM4Py and validated on synthetic and real logs. The results show that the algorithm preserves the key structure of the model, reduces noisy activities, and improves the quality of the model.
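The abstract does not give the algorithm's details, so the following is only a minimal sketch of the two ideas it names, not the authors' method. It assumes a log is a plain list of traces (lists of activity labels), estimates directly-follows conditional probabilities, computes the Shannon entropy of each activity's successor distribution as a stand-in for "conditional probability entropy", and follows the most probable successor greedily as a simple approximation of a maximum probability path. All function names and the toy log are hypothetical, and PM4Py is deliberately not used.

```python
from collections import Counter, defaultdict
from math import log2


def transition_probabilities(traces):
    """Estimate P(next = b | current = a) from directly-follows counts."""
    counts = defaultdict(Counter)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(successors.values()) for b, n in successors.items()}
        for a, successors in counts.items()
    }


def successor_entropy(probs):
    """Shannon entropy (in bits) of each activity's successor distribution."""
    return {
        a: -sum(p * log2(p) for p in dist.values())
        for a, dist in probs.items()
    }


def max_probability_path(probs, start, max_len=10):
    """Greedy walk: always take the most probable successor.

    Stops at a dead end, at a repeated activity, or after max_len steps.
    This is a greedy approximation, not a global optimum search.
    """
    path = [start]
    seen = {start}
    while len(path) < max_len and path[-1] in probs:
        successors = probs[path[-1]]
        nxt = max(successors, key=successors.get)
        if nxt in seen:
            break
        path.append(nxt)
        seen.add(nxt)
    return path


# Hypothetical toy log: "x" and "e" appear only in infrequent traces.
log = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "x", "c", "d"],
    ["a", "e", "d"],
]

probs = transition_probabilities(log)
entropy = successor_entropy(probs)
print(probs["a"])                        # {'b': 0.75, 'e': 0.25}
print(max_probability_path(probs, "a"))  # ['a', 'b', 'c', 'd']
```

In this sketch, an activity whose successor entropy is high has irregularly distributed followers, which is one plausible reading of the pre-processing criterion. How the paper thresholds entropy and separates effective sequences from noise is not stated in the abstract.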




Updated: 2021-07-19