An Empirical Investigation of Different Classifiers, Encoding, and Ensemble Schemes for Next Event Prediction Using Business Process Event Logs
ACM Transactions on Intelligent Systems and Technology (IF 7.2), Pub Date: 2020-09-12, DOI: 10.1145/3406541
Bayu Adhi Tama, Marco Comuzzi, Jonghyeon Ko

There is a growing need for empirical benchmarks that support researchers and practitioners in selecting the best machine learning technique for given prediction tasks. In this article, we consider the next event prediction task in business process predictive monitoring, and we extend our previously published benchmark by studying the impact on the performance of different encoding windows and of using ensemble schemes. The choice of whether to use ensembles and which scheme to use often depends on the type of data and classification task. While there is a general understanding that ensembles perform well in predictive monitoring of business processes, next event prediction is a task for which no other benchmarks involving ensembles are available. The proposed benchmark helps researchers to select a high-performing individual classifier or ensemble scheme given the variability at the case level of the event log under consideration. Experimental results show that choosing an optimal number of events for feature encoding is challenging, resulting in the need to consider each event log individually when selecting an optimal value. Ensemble schemes improve the performance of low-performing classifiers in this task, such as SVM, whereas high-performing classifiers, such as tree-based classifiers, are not better off when ensemble schemes are considered.
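
To make the notion of an encoding window concrete, the following is a minimal sketch of one way to turn a single case (trace) into training samples for next event prediction, keeping only the last `window` events as features. It is an illustration under stated assumptions, not the encoding used in the article: the function name `encode_prefixes`, the `PAD` symbol, and the left-padding of short prefixes are all hypothetical choices.

```python
from typing import List, Tuple

# Hypothetical padding symbol for prefixes shorter than the window (an assumption).
PAD = "<pad>"


def encode_prefixes(trace: List[str], window: int) -> List[Tuple[List[str], str]]:
    """Build (features, label) pairs from one trace.

    features: the last `window` activities seen so far, left-padded with PAD.
    label:    the activity that actually occurred next.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    samples = []
    for i in range(1, len(trace)):
        # Keep only the most recent `window` events before position i.
        prefix = trace[max(0, i - window):i]
        padded = [PAD] * (window - len(prefix)) + prefix
        samples.append((padded, trace[i]))
    return samples


if __name__ == "__main__":
    # Toy trace with made-up activity names.
    for features, label in encode_prefixes(["register", "check", "approve", "pay"], 2):
        print(features, "->", label)
```

Varying `window` changes how much history each sample carries, which is the parameter the abstract reports as hard to choose without looking at each event log individually. The resulting pairs would still need a numeric encoding (for example, one-hot) before being passed to a classifier.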

Updated: 2020-09-12