Inspector Gadget: A Data Programming-based Labeling System for Industrial Images,arXiv - CS - Computer Vision and Pattern Recognition


Inspector Gadget: A Data Programming-based Labeling System for Industrial Images
arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2020-04-07, DOI: arxiv-2004.03264
Geon Heo, Yuji Roh, Seonghyeon Hwang, Dayun Lee, Steven Euijong Whang

As machine learning for images becomes democratized in the Software 2.0 era, one of the serious bottlenecks is securing enough labeled data for training. This problem is especially critical in a manufacturing setting where smart factories rely on machine learning for product quality control by analyzing industrial images. Such images are typically large and may only need to be analyzed in part, since only a small portion is problematic (e.g., identifying defects on a surface). Since manually labeling these images is expensive, weak supervision is an attractive alternative where the idea is to generate weak labels that are not perfect, but can be produced at scale. Data programming is a recent paradigm in this category that uses human knowledge in the form of labeling functions and combines them into a generative model. Data programming has been successful in applications based on text or structured data and can usually also be applied to images if one can find a way to convert them into structured data. In this work, we expand the horizon of data programming by directly applying it to images without this conversion, which is a common scenario for industrial applications. We propose Inspector Gadget, an image labeling system that combines crowdsourcing, data augmentation, and data programming to produce weak labels at scale for image classification. We perform experiments on real industrial image datasets and show that Inspector Gadget obtains better performance than other weak-labeling techniques: Snuba, GOGGLES, and self-learning baselines using convolutional neural networks (CNNs) without pre-training.
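The general idea of data programming mentioned in the abstract (several noisy labeling functions whose votes are combined into one weak label) can be illustrated with a small sketch. This is not Inspector Gadget's pipeline: the labeling functions, thresholds, and the tiny grayscale "images" below are all hypothetical, and a simple majority vote stands in for the generative model that data programming would normally fit.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Label values; ABSTAIN means a labeling function has no opinion.
ABSTAIN, OK, DEFECT = -1, 0, 1

# Assumption: an image is a 2D list of grayscale intensities in [0, 1].
Image = List[List[float]]
LabelingFunction = Callable[[Image], int]


def lf_dark_spot(img: Image) -> int:
    """Vote DEFECT if any pixel is very dark (hypothetical threshold)."""
    darkest = min(min(row) for row in img)
    return DEFECT if darkest < 0.2 else ABSTAIN


def lf_uniform(img: Image) -> int:
    """Vote OK if the intensity range is small (hypothetical threshold)."""
    flat = [p for row in img for p in row]
    return OK if max(flat) - min(flat) < 0.1 else ABSTAIN


def lf_mean_brightness(img: Image) -> int:
    """Vote DEFECT for dark images overall, OK otherwise (hypothetical rule)."""
    flat = [p for row in img for p in row]
    return DEFECT if sum(flat) / len(flat) < 0.5 else OK


def weak_label(img: Image, lfs: Sequence[LabelingFunction]) -> int:
    """Combine labeling-function votes by majority; abstain on ties.

    Majority voting is a stand-in here for a learned generative model.
    """
    votes = [v for v in (lf(img) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting votes with no clear winner
    return ranked[0][0]


LFS = [lf_dark_spot, lf_uniform, lf_mean_brightness]

clean = [[0.80, 0.82], [0.81, 0.80]]      # uniform, bright surface
scratched = [[0.80, 0.10], [0.80, 0.80]]  # one dark pixel on a bright surface
damaged = [[0.30, 0.10], [0.40, 0.30]]    # dark overall with a very dark pixel

labels = [weak_label(img, LFS) for img in (clean, scratched, damaged)]
```

In this sketch the scratched image receives conflicting votes (one DEFECT, one OK) and the combiner abstains, which shows why weak labels are imperfect and why a learned model that weighs labeling functions by estimated accuracy is preferable to a plain vote.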


Updated: 2020-08-24
