Constrained Labeling for Weakly Supervised Learning
arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-15. DOI: arxiv-2009.07360
Chidubem Arachie, Bert Huang

Curation of large fully supervised datasets has become one of the major roadblocks for machine learning. Weak supervision provides an alternative to supervised learning by training with cheap, noisy, and possibly correlated labeling functions from varying sources. The key challenge in weakly supervised learning is combining the different weak supervision signals while navigating misleading correlations in their errors. In this paper, we propose a simple data-free approach for combining weak supervision signals by defining a constrained space for the possible labels of the weak signals and training with a random labeling within this constrained space. Our method is efficient and stable, converging after a few iterations of gradient descent. We prove theoretical conditions under which the worst-case error of the randomized label decreases with the rank of the linear constraints. We show experimentally that our method outperforms other weak supervision methods on various text- and image-classification tasks.
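The approach described in the abstract can be sketched in a few lines. The following is an illustrative reconstruction, not the authors' code: the weak signals `signals[i]` are treated as (here, binary) label vectors with assumed error bounds `bounds[i]`, the constrained space is the set of soft labelings whose expected error against each weak signal stays within its bound, and a random labeling is pushed into that space by gradient descent on the squared constraint violation. The error-bound values, learning rate, and synthetic noisy signals are all assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200                                     # number of unlabeled examples
true_y = rng.integers(0, 2, n).astype(float)

def noisy_signal(y, flip_prob):
    """A hypothetical weak signal: the true labels with random flips."""
    flips = rng.random(y.size) < flip_prob
    return np.abs(y - flips.astype(float))

# Three weak signals of varying quality, with assumed error bounds b_i.
signals = np.stack([noisy_signal(true_y, p) for p in (0.1, 0.2, 0.3)])
bounds = np.array([0.15, 0.25, 0.35])

def violations(y):
    # Expected error of weak signal i under soft labels y:
    #   err_i(y) = (1/n) * sum_j [ y_j (1 - q_ij) + (1 - y_j) q_ij ]
    # The constraint err_i(y) <= b_i defines the feasible label space;
    # we return how far each constraint is violated (0 if satisfied).
    err = (y * (1 - signals) + (1 - y) * signals).mean(axis=1)
    return np.maximum(err - bounds, 0.0)

# Start from a random labeling and run projected gradient descent on
# f(y) = sum_i max(err_i(y) - b_i, 0)^2, clipping y back into [0, 1].
y = rng.random(n)
lr = 5.0
for _ in range(2000):
    v = violations(y)
    grad = (2.0 * v[:, None] * (1 - 2 * signals) / n).sum(axis=0)
    y = np.clip(y - lr * grad, 0.0, 1.0)

labels = (y >= 0.5).astype(float)
accuracy = (labels == true_y).mean()
```

Because the objective is convex and zero on the feasible set, the descent stops moving once every error bound is satisfied; the randomness of the starting point is what the paper's worst-case analysis reasons about.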

Updated: 2020-09-17