Cold-Start Active Sampling via γ-Tube
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2021-04-20, DOI: 10.1109/tcyb.2021.3069956
Xiaofeng Cao, Ivor W. Tsang, Jianliang Xu

Active learning (AL) improves the generalization performance of the current classification hypothesis by querying labels from a pool of unlabeled data. The sampling process is typically guided by an evaluation policy based on informativeness, representativeness, or diversity. However, such a policy needs an initial labeled set to start, so its performance may degrade in a cold-start setting. In this article, we first show that typical AL sampling can be equivalently formulated as geometric sampling over the minimum enclosing balls (MEBs) of clusters. Following the γ-tube structure in geometric clustering, we then divide the MEB covering a cluster into two parts: 1) a γ-tube and 2) a γ-ball. By estimating the error disagreement between sampling in the MEB and in the γ-ball, our theoretical analysis reveals that the γ-tube can effectively measure the disagreement between hypotheses in the original space over the MEB and in the sampling space over the γ-ball. To tighten this insight, we present a generalization analysis, and the results show that sampling in the γ-tube yields a higher probability bound for achieving a nearly zero generalization error. With these analyses, we finally apply the informative sampling policy of AL over the γ-tube to obtain a tube AL (TAL) algorithm that addresses the cold-start sampling issue. As a result, the dependency between the querying process and the evaluation policy of active sampling is alleviated. Experimental results show that, by using the γ-tube structure to deal with cold-start sampling, TAL achieves substantial accuracy improvements over standard AL evaluation baselines. An application to image edge recognition extends our theoretical results.
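The split of an enclosing ball into an inner ball and an outer tube can be illustrated with a short sketch. This is not the authors' TAL implementation: the abstract does not give the exact construction, so the code below assumes a centroid-based approximation of the enclosing ball, an inner ball of radius (1 - gamma) * R, and uniform random sampling inside the tube as a stand-in for the informative policy. The function names (`enclosing_ball`, `split_tube_ball`, `cold_start_queries`) and the `gamma` parameterization are hypothetical.

```python
import math
import random


def enclosing_ball(points):
    """Approximate enclosing ball of a cluster.

    Assumption: the centroid is the center and the farthest point sets the
    radius. This is a simple stand-in, not an exact minimum enclosing ball.
    """
    dim = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius


def split_tube_ball(points, gamma):
    """Return (tube, ball) as lists of point indices.

    Assumption: the inner ball is concentric with radius (1 - gamma) * R,
    and the tube is the remaining outer shell of the enclosing ball.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    center, radius = enclosing_ball(points)
    inner_radius = (1.0 - gamma) * radius
    tube, ball = [], []
    for i, p in enumerate(points):
        if math.dist(p, center) > inner_radius:
            tube.append(i)
        else:
            ball.append(i)
    return tube, ball


def cold_start_queries(points, gamma, budget, seed=0):
    """Pick up to `budget` tube indices to send for labeling.

    Assumption: with no labels yet, uniform random sampling inside the tube
    replaces the informative policy described in the abstract.
    """
    tube, _ = split_tube_ball(points, gamma)
    rng = random.Random(seed)
    return sorted(rng.sample(tube, min(budget, len(tube))))
```

In this sketch a larger `gamma` widens the tube, so more points near the cluster boundary become query candidates before any label is available.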

Updated: 2021-04-20