Learning for Open-World Calibration with Graph Neural Networks,arXiv - CS - Computer Vision and Pattern Recognition



Learning for Open-World Calibration with Graph Neural Networks
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2023-05-19, DOI: arxiv-2305.12039
Qin Zhang, Dongsheng An, Tianjun Xiao, Tong He, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing

We tackle the problem of threshold calibration for open-world recognition by incorporating representation compactness measures into clustering. Unlike open-set recognition, which focuses on discovering and rejecting the unknown, open-world recognition learns robust representations that generalize to disjoint unknown classes at test time. Our proposed method is based on two key observations: (i) representation structures among neighbouring images in high-dimensional visual embedding spaces have strong self-similarity, which can be leveraged to encourage transferability to the open world, and (ii) intra-class embedding structures can be modeled with the marginalized von Mises-Fisher (vMF) probability, whose correlation with the true positive rate is dataset-invariant. Motivated by these observations, we design a unified framework centered around a graph neural network (GNN) to jointly predict the pseudo-labels and the vMF concentrations, which indicate the representation compactness. These predictions can be converted into statistical estimates of recognition accuracy, allowing more robust calibration of the distance threshold to achieve a target utility for the open-world classes. Results on a variety of visual recognition benchmarks demonstrate the superiority of our method over traditional post-hoc calibration methods for the open world, especially under distribution shift.
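To make the two quantities in the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes labeled embeddings are available, estimates a per-class vMF concentration with a widely used closed-form approximation, κ ≈ r̄(d − r̄²)/(1 − r̄²), where r̄ is the mean resultant length, and picks a cosine-distance threshold from same-class pair distances so that a target true positive rate is met. The paper instead predicts pseudo-labels and concentrations with a GNN; the function names, the synthetic data, and the 0.9 target below are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit sphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def vmf_concentration(embeddings):
    """Approximate vMF concentration for one class of embeddings.

    Uses kappa ~= r_bar * (d - r_bar**2) / (1 - r_bar**2), where r_bar is
    the norm of the mean of the unit-normalized embeddings. Larger kappa
    means a more compact class. This is an illustrative estimator, not
    the GNN-based prediction described in the abstract.
    """
    x = l2_normalize(np.asarray(embeddings, dtype=float))
    d = x.shape[1]
    r_bar = min(np.linalg.norm(x.mean(axis=0)), 1.0 - 1e-9)
    return r_bar * (d - r_bar**2) / (1.0 - r_bar**2)


def same_class_distances(embeddings):
    """Cosine distances between all distinct pairs within one class."""
    x = l2_normalize(np.asarray(embeddings, dtype=float))
    sims = x @ x.T
    upper = np.triu_indices(x.shape[0], k=1)
    return 1.0 - sims[upper]


def threshold_for_target_tpr(pos_distances, target_tpr):
    """Smallest observed distance t such that the fraction of same-class
    pairs with distance <= t is at least target_tpr."""
    d = np.sort(np.asarray(pos_distances, dtype=float))
    k = int(np.ceil(target_tpr * d.size)) - 1
    k = min(max(k, 0), d.size - 1)
    return float(d[k])


# Hypothetical demo data: a compact class and a loose class in 16 dimensions.
rng = np.random.default_rng(0)
center = np.zeros(16)
center[0] = 1.0
tight = center + 0.05 * rng.standard_normal((200, 16))
loose = center + 0.50 * rng.standard_normal((200, 16))

kappa_tight = vmf_concentration(tight)
kappa_loose = vmf_concentration(loose)
threshold = threshold_for_target_tpr(same_class_distances(tight), 0.9)
print(kappa_tight > kappa_loose, threshold)
```

In this sketch the threshold is calibrated directly on labeled same-class pairs, which is the kind of post-hoc calibration the abstract compares against. The abstract's proposal is to replace the labels with predicted pseudo-labels and concentrations so that calibration can transfer to unseen classes.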


Updated: 2023-05-23
