Complex Event Forecasting with Prediction Suffix Trees: Extended Technical Report
arXiv - CS - Databases, Pub Date: 2021-09-01, DOI: arxiv-2109.00287
Elias Alevizos, Alexander Artikis, Georgios Paliouras

Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events. However, there is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine. We present a formal framework that attempts to address the issue of Complex Event Forecasting (CEF). Our framework combines two formalisms: a) symbolic automata which are used to encode complex event patterns; and b) prediction suffix trees which can provide a succinct probabilistic description of an automaton's behavior. We compare our proposed approach against state-of-the-art methods and show its advantage in terms of accuracy and efficiency. In particular, prediction suffix trees, being variable-order Markov models, have the ability to capture long-term dependencies in a stream by remembering only those past sequences that are informative enough. Our experimental results demonstrate the benefits, in terms of accuracy, of being able to capture such long-term dependencies. This is achieved by increasing the order of our model beyond what is possible with full-order Markov models that need to perform an exhaustive enumeration of all possible past sequences of a given order. We also discuss extensively how CEF solutions should be best evaluated on the quality of their forecasts.
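
To illustrate the idea of remembering only those past sequences that are informative enough, here is a minimal Python sketch of a variable-order Markov model. It is not the authors' implementation and it does not build an actual suffix tree or involve symbolic automata: retained contexts are stored in a flat dictionary. The function names and the `min_count` and `min_gain` thresholds are assumptions made for this illustration only.

```python
from collections import Counter, defaultdict


def next_symbol_counts(stream, max_order):
    """Count which symbol follows every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(stream):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[tuple(stream[i - k:i])][symbol] += 1
    return counts


def normalise(counter):
    """Turn raw counts into a next-symbol probability distribution."""
    total = sum(counter.values())
    return {s: c / total for s, c in counter.items()}


def learn_contexts(stream, max_order, min_count=5, min_gain=0.2):
    """Keep a context only if it is frequent enough and its next-symbol
    distribution differs enough from that of its one-symbol-shorter suffix.
    Both thresholds are hypothetical values chosen for this sketch."""
    counts = next_symbol_counts(stream, max_order)
    model = {(): normalise(counts[()])}  # the empty context is always kept
    for context in list(counts):
        if not context or sum(counts[context].values()) < min_count:
            continue
        dist = normalise(counts[context])
        parent = normalise(counts[context[1:]])  # drop the oldest symbol
        symbols = set(dist) | set(parent)
        gain = max(abs(dist.get(s, 0.0) - parent.get(s, 0.0)) for s in symbols)
        if gain >= min_gain:
            model[context] = dist
    return model


def predict(model, history, max_order):
    """Return the distribution of the longest retained suffix of history."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in model:
            return model[context]


stream = list("aab" * 50)
model = learn_contexts(stream, max_order=3)
print(sorted(model, key=len))
print(predict(model, list("baa"), max_order=3))
```

On this toy stream, raising `max_order` from 2 to 3 adds no new contexts, because every length-3 context predicts exactly what its length-2 suffix already predicts. A full-order model of order 3 would instead store a distribution for every length-3 context it enumerates.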

Updated: 2021-09-02