Stress Testing With Influencing Factors to Accelerate Data Race Software Failures
IEEE Transactions on Reliability (IF 5.0), Pub Date: 2020-03-01, DOI: 10.1109/tr.2019.2895052
Kun Qiu, Zheng Zheng, Kishor S. Trivedi, Beibei Yin

Software failures caused by data race bugs have long been a major concern in parallel and distributed systems, despite significant effort spent on software testing. Because these failures are nondeterministic and hard to reproduce, evaluating a system's operational reliability requires a long period of experimental execution to observe failures caused by data race conditions. To address this problem, this paper makes two contributions. First, it proposes stress testing with influencing factors, in which the system runs under certain workloads for a long time with controlled stress conditions to accelerate the occurrence of data race failures. Second, it explores and formulates mathematical models relating the influencing factors to the statistical characteristics of data races' time to failure (TTF) or mean TTF (MTTF). These models are used to extrapolate the TTF/MTTF to different operational conditions and are essential for reducing the time needed to evaluate a system's reliability. The proposed method is empirically evaluated on six applications that suffer from failures caused by real-world data race bugs. Analysis of the experimental results yields several findings. First, the reduction in the manifestation time of data race failures achieved by controlling the influencing factors is statistically significant. Second, the Power model is the best-fitting model of the relationship between the MTTF and the influencing factors. Third, the Power Weibull distribution is the best-fitting probability distribution relating the TTF to the influencing factors. Finally, the TTF/MTTF can be accurately estimated with the proposed approach.
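
As a rough illustration of the MTTF extrapolation idea, the sketch below fits a power-law relationship between one influencing factor and the MTTF, then extrapolates to a milder operating condition. It is a minimal sketch under stated assumptions, not the paper's implementation: the form MTTF = a * s^b for a single factor s, the function names fit_power_model and predict_mttf, and the synthetic data points are all hypothetical, since the abstract does not give the exact model form or any measurements.

```python
import math


def fit_power_model(stress_levels, mttf_values):
    """Least-squares fit of MTTF = a * stress**b, done in log-log space.

    Assumed single-factor form; the abstract does not state the exact model.
    """
    xs = [math.log(s) for s in stress_levels]
    ys = [math.log(m) for m in mttf_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of the log-log line
    a = math.exp(y_mean - b * x_mean)   # intercept, mapped back from log space
    return a, b


def predict_mttf(a, b, stress):
    """Extrapolate the fitted model to another stress level."""
    return a * stress ** b


# Synthetic, noise-free observations (hypothetical, not from the paper):
# MTTF measured at four elevated stress levels, e.g. thread counts.
stress = [8, 16, 32, 64]
mttf = [1000.0 * s ** -1.5 for s in stress]

a, b = fit_power_model(stress, mttf)
# Estimated MTTF at a lower, operational stress level.
mttf_at_operational_level = predict_mttf(a, b, 2)
```

Fitting in log-log space turns the power law into a straight line, so ordinary least squares suffices. With real, noisy TTF samples one would also want goodness-of-fit checks before trusting the extrapolation.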

Updated: 2020-03-01