How Good Is My Test Data? Introducing Safety Analysis for Computer Vision, International Journal of Computer Vision


International Journal of Computer Vision (IF 19.5), Pub Date: 2017-06-09, DOI: 10.1007/s11263-017-1020-z
Oliver Zendel, Markus Murschitz, Martin Humenberger, Wolfgang Herzner

Good test data is crucial for driving new developments in computer vision (CV), but two questions remain unanswered: which situations should be covered by the test data, and how much testing is enough to reach a conclusion? In this paper we propose a new answer to these questions using a standard procedure devised by the safety community to validate complex systems: the hazard and operability analysis (HAZOP). It is designed to systematically identify possible causes of system failure or performance loss. We introduce a generic CV model that creates the basis for the hazard analysis and—for the first time—apply an extensive HAZOP to the CV domain. The result is a publicly available checklist with more than 900 identified individual hazards. This checklist can be utilized to evaluate existing test datasets by quantifying the covered hazards. We evaluate our approach by first analyzing and annotating the popular stereo vision test datasets Middlebury and KITTI. Second, we demonstrate a clearly negative influence of the hazards in the checklist on the performance of six popular stereo matching algorithms. The presented approach is a useful tool to evaluate and improve test datasets and creates a common basis for future dataset designs.
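
The idea of evaluating a test dataset by quantifying the covered hazards can be illustrated with a minimal sketch. It assumes that each checklist entry has an identifier and that each test image has been annotated with the set of hazards it exhibits. The identifiers (`H001`, ...), the file names, and the functions `hazard_coverage` and `uncovered_hazards` are hypothetical and do not reflect the format of the published checklist or the authors' tooling.

```python
from typing import Dict, Iterable, List, Set


def hazard_coverage(checklist: Iterable[str],
                    annotations: Dict[str, Set[str]]) -> float:
    """Return the fraction of checklist hazards found in at least one image."""
    hazards = set(checklist)
    if not hazards:
        raise ValueError("checklist must not be empty")
    covered: Set[str] = set()
    for image_hazards in annotations.values():
        # Ignore annotation labels that are not part of the checklist.
        covered |= image_hazards & hazards
    return len(covered) / len(hazards)


def uncovered_hazards(checklist: Iterable[str],
                      annotations: Dict[str, Set[str]]) -> List[str]:
    """Return the checklist hazards that no annotated image covers, sorted."""
    hazards = set(checklist)
    covered: Set[str] = set()
    for image_hazards in annotations.values():
        covered |= image_hazards
    return sorted(hazards - covered)


# Hypothetical checklist IDs and per-image annotations (illustration only).
checklist = ["H001", "H002", "H003", "H004"]
annotations = {
    "img_0001.png": {"H001", "H003"},
    "img_0002.png": {"H003"},
    "img_0003.png": set(),
}

coverage = hazard_coverage(checklist, annotations)
missing = uncovered_hazards(checklist, annotations)
print(f"coverage: {coverage:.0%}, uncovered: {missing}")
```

A list of uncovered hazards like the one returned here is what would point a dataset designer toward situations that still lack test images.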


Updated: 2017-06-09
