Causality-Guided Adaptive Interventional Debugging
arXiv - CS - Databases, Pub Date: 2020-03-21, DOI: arxiv-2003.09539
Anna Fariha, Suman Nath, Alexandra Meliou

Runtime nondeterminism is a fact of life in modern database applications. Previous research has shown that nondeterminism can cause applications to intermittently crash, become unresponsive, or experience data corruption. We propose Adaptive Interventional Debugging (AID) for debugging such intermittent failures. AID combines existing statistical debugging, causal analysis, fault injection, and group testing techniques in a novel way to (1) pinpoint the root cause of an application's intermittent failure and (2) generate an explanation of how the root cause triggers the failure. AID works by first identifying a set of runtime behaviors (called predicates) that are strongly correlated to the failure. It then utilizes temporal properties of the predicates to (over)-approximate their causal relationships. Finally, it uses fault injection to execute a sequence of interventions on the predicates and discover their true causal relationships. This enables AID to identify the true root cause and its causal relationship to the failure. We theoretically analyze how fast AID can converge to the identification. We evaluate AID with six real-world applications that intermittently fail under specific inputs. In each case, AID was able to identify the root cause and explain how the root cause triggered the failure, much faster than group testing and more precisely than statistical debugging. We also evaluate AID with many synthetically generated applications with known root causes and confirm that the benefits also hold for them.
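
The intervention step described above can be illustrated with a toy sketch. This is not the paper's algorithm; it is a simplified adaptive group test under strong assumptions: the candidate predicates are already sorted by the time they first occur, a rerun with a set of predicates suppressed is deterministic, and suppressing any predicate on the causal chain stops the failure. The names `find_root_cause`, `fails_under`, and `make_toy_app` are hypothetical and exist only for this example.

```python
from typing import Callable, List, Sequence, Set, Tuple


def find_root_cause(
    predicates: Sequence[str],
    fails_under: Callable[[Set[str]], bool],
) -> str:
    """Return the earliest predicate whose suppression stops the failure.

    `predicates` must be sorted by first-occurrence time, earliest first.
    `fails_under(blocked)` stands in for one fault-injection run: it reruns
    the application with `blocked` suppressed and reports whether the
    failure still occurs (an assumed, hypothetical interface).
    """
    if fails_under(set(predicates)):
        raise ValueError("no candidate predicate causes the failure")
    lo, hi = 0, len(predicates) - 1
    # Invariant: blocking predicates[0..hi] stops the failure,
    # while blocking predicates[0..lo-1] does not.
    while lo < hi:
        mid = (lo + hi) // 2
        if fails_under(set(predicates[: mid + 1])):
            lo = mid + 1  # no cause in this prefix; the root cause is later
        else:
            hi = mid  # this prefix contains a cause; look earlier
    return predicates[lo]


def make_toy_app(
    causal_chain: Set[str],
) -> Tuple[Callable[[Set[str]], bool], List[Set[str]]]:
    """Simulated application: it fails unless a causal predicate is blocked."""
    runs: List[Set[str]] = []

    def fails_under(blocked: Set[str]) -> bool:
        runs.append(set(blocked))  # record each intervention for inspection
        return not (causal_chain & blocked)

    return fails_under, runs


# Eight correlated predicates in temporal order; only three are causal.
candidates = [f"p{i}" for i in range(8)]
fails_under, runs = make_toy_app({"p2", "p5", "p6"})
print(find_root_cause(candidates, fails_under))  # p2
```

In this toy setting the search needs a number of reruns logarithmic in the number of candidates, rather than one rerun per predicate. Real applications are nondeterministic, so each intervention would likely need repeated runs before its outcome can be trusted.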

Updated: 2020-04-13