Enabling efficient process mining on large data sets: realizing an in-database process mining operator
Distributed and Parallel Databases (IF 1.5), Pub Date: 2019-05-09, DOI: 10.1007/s10619-019-07270-1
Remco Dijkman, Juntao Gao, Alifah Syamsiyah, Boudewijn van Dongen, Paul Grefen, Arthur ter Hofstede

Process mining can be used to analyze business processes based on logs of their execution. These execution logs are often obtained by querying a database and storing the results in a file. The mining itself is then done on the file, so the data processing power of the database cannot be used after the log is extracted. Enabling process mining directly on a database therefore provides additional flexibility and efficiency. To facilitate this, this paper formally defines a database operator that extracts the ‘directly follows’ relation, one of the relations at the heart of process mining, from an operational database. It defines the operator using the well-known relational algebra and formally proves equivalence properties of the operator that are useful for query optimization. Subsequently, it presents time-complexity properties of the operator. Finally, it presents an implementation of the operator as part of the H2 DBMS and demonstrates that this implementation extracts the ‘directly follows’ relation from a database with an arbitrary database structure within a fraction of a second, several orders of magnitude faster than is currently possible.
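To make the ‘directly follows’ relation concrete, here is a minimal sketch of how it could be computed inside a database using plain SQL. It is not the operator defined in the paper or its H2 implementation. The table `event_log`, its columns `case_id`, `activity`, and `ts`, and the function `directly_follows` are all hypothetical names chosen for illustration, SQLite stands in for the DBMS, and timestamps are assumed to be unique within a case.

```python
import sqlite3


def directly_follows(conn):
    """Return {(a, b): count}: how often activity b directly follows a in a case.

    Assumes a hypothetical table event_log(case_id, activity, ts) in which
    timestamps are unique within each case.
    """
    query = """
        SELECT e1.activity, e2.activity, COUNT(*)
        FROM event_log AS e1
        JOIN event_log AS e2
          ON e1.case_id = e2.case_id AND e1.ts < e2.ts
        WHERE NOT EXISTS (
            -- no event of the same case lies strictly between e1 and e2
            SELECT 1 FROM event_log AS e3
            WHERE e3.case_id = e1.case_id
              AND e3.ts > e1.ts AND e3.ts < e2.ts
        )
        GROUP BY e1.activity, e2.activity
    """
    return {(a, b): n for a, b, n in conn.execute(query)}


# Illustrative data: two cases of a made-up ordering process.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (case_id INTEGER, activity TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO event_log VALUES (?, ?, ?)",
    [
        (1, "register", 1), (1, "check", 2), (1, "pay", 3),
        (2, "register", 1), (2, "pay", 2),
    ],
)
df = directly_follows(conn)
```

In this sketch the relation is derived by a self-join with an anti-join condition, so the log never leaves the database. A naive query of this shape can be slow on large logs, which is the kind of cost a dedicated operator is meant to avoid.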

Updated: 2019-05-09