From Appearance to Essence
ACM Transactions on Intelligent Systems and Technology (IF 7.2), Pub Date: 2020-09-12, DOI: 10.1145/3411749
Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Wei Emma Zhang, Anne H. H. Ngu, Jian Yang

Truth discovery has been widely studied in recent years as a fundamental means of resolving conflicts in multi-source data. Although many truth discovery methods have been proposed based on different considerations and intuitions, investigations show that no single method consistently outperforms the others. To select the right truth discovery method for a specific application scenario, it is essential to evaluate and compare the performance of different methods. A drawback of current research efforts is that they commonly assume that some ground truth is available for evaluating the methods. However, the ground truth may be very limited or even impossible to obtain, rendering the evaluation biased. In this article, we present CompTruthHyp, a generic approach for comparing the performance of truth discovery methods without using ground truth. In particular, our approach calculates the probability of the observations in a dataset based on the output of each method. These probabilities are then ranked to reflect the performance of the methods. We review and compare 12 representative truth discovery methods and consider both single-valued and multi-valued objects. Empirical studies on both real-world and synthetic datasets demonstrate the effectiveness of our approach for comparing truth discovery methods.
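
The idea of scoring each method by the probability of the observed claims under its output, then ranking the scores, can be illustrated with a minimal Python sketch. This is not the CompTruthHyp model itself, which the abstract does not specify. The function names `log_likelihood` and `rank_methods`, the smoothed per-source accuracy estimate, and the uniform spread of probability over false values are all assumptions made purely for illustration, and only single-valued objects are handled.

```python
import math
from collections import defaultdict


def log_likelihood(claims, truths):
    """Log-probability of the observed claims given one method's output.

    claims: list of (source, object, value) triples.
    truths: dict mapping each object to the value the method outputs.
    Toy model (an assumption, not the article's): each source has one
    accuracy, estimated against the method's own output.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    values = defaultdict(set)
    for source, obj, value in claims:
        total[source] += 1
        correct[source] += int(truths[obj] == value)
        values[obj].add(value)

    ll = 0.0
    for source, obj, value in claims:
        # Laplace-smoothed accuracy keeps probabilities away from 0 and 1.
        acc = (correct[source] + 1) / (total[source] + 2)
        # Number of candidate false values for this object (at least 1).
        n_false = max(len(values[obj] | {truths[obj]}) - 1, 1)
        p = acc if value == truths[obj] else (1 - acc) / n_false
        ll += math.log(p)
    return ll


def rank_methods(claims, outputs):
    """Return method names ordered from highest to lowest log-likelihood."""
    scores = {name: log_likelihood(claims, t) for name, t in outputs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: three sources, two objects, two candidate outputs.
claims = [
    ("s1", "o1", "a"), ("s1", "o2", "x"),
    ("s2", "o1", "a"), ("s2", "o2", "x"),
    ("s3", "o1", "b"), ("s3", "o2", "y"),
]
outputs = {
    "consistent": {"o1": "a", "o2": "x"},
    "mixed": {"o1": "a", "o2": "y"},
}
ranking = rank_methods(claims, outputs)
```

In this toy model, an output under which every source looks either reliably right or reliably wrong scores higher than one under which sources look like coin flips. A consequence is that, with only two candidate values per object, it cannot tell an output from its mirror image, so it should be read only as an illustration of the ranking-by-likelihood idea.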

Updated: 2020-09-12