Noise label learning through label confidence statistical inference
Knowledge-Based Systems (IF 8.8), Pub Date: 2021-06-15, DOI: 10.1016/j.knosys.2021.107234
Min Wang, Hong-Tian Yu, Fan Min

Noisy labels are widespread in real-world data and degrade classification performance. Popular methods require a known noise distribution or additional cleaning supervision, which is usually unavailable in practical scenarios. This paper presents a theoretical statistical method and designs a label confidence inference (LISR) algorithm to address this issue. For data distribution, we define a statistical function for label inconsistency and analyze its relationship with the neighbor radius. For data representation, we define the trusted neighbor, the nearest trusted neighbor, and the untrusted neighbor. For noisy label recognition, we present three inference methods that predict labels together with their confidence. The LISR algorithm establishes a practical statistical model, queries the initial trusted instances, and then iteratively searches for further trusted instances and corrects labels. We conducted experiments on synthetic, UCI, and classic image datasets. Significance tests verified the effectiveness of LISR and its superiority over state-of-the-art noisy-label learning algorithms.
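The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the paper's definitions. It assumes a label inconsistency statistic defined as the fraction of instance pairs within a given neighbor radius whose labels disagree, and a simple neighbor vote that returns a predicted label with a confidence equal to the vote share. The function names `label_inconsistency` and `neighbor_vote`, the Euclidean distance, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import dist


def label_inconsistency(points, labels, radius):
    """Assumed statistic: share of pairs within `radius` whose labels differ."""
    close = 0   # pairs whose distance is at most `radius`
    differ = 0  # close pairs that carry different labels
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= radius:
                close += 1
                if labels[i] != labels[j]:
                    differ += 1
    return differ / close if close else 0.0


def neighbor_vote(points, labels, index, radius):
    """Assumed inference rule: majority label of neighbors within `radius`.

    Returns (label, confidence), where confidence is the winning vote share.
    With no neighbors in range, the observed label is kept with confidence 0.
    """
    votes = Counter(
        labels[j]
        for j in range(len(points))
        if j != index and dist(points[index], points[j]) <= radius
    )
    if not votes:
        return labels[index], 0.0
    label, count = votes.most_common(1)[0]
    return label, count / sum(votes.values())


# Toy data: two well-separated clusters; the label at index 2 is flipped.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 1, 1, 1, 1]

inconsistency = label_inconsistency(points, labels, radius=1.5)
corrected = neighbor_vote(points, labels, index=2, radius=1.5)
```

In this toy setup the flipped label raises the inconsistency statistic at small radii, and the neighbor vote assigns the surrounding cluster's label to the flipped instance. How LISR selects trusted instances and combines its three inference methods is not stated in the abstract and is not modeled here.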




Updated: 2021-06-18