r-HUMO: A Risk-aware Human-Machine Cooperation Framework for Entity Resolution with Quality Guarantees
IEEE Transactions on Knowledge and Data Engineering (IF 8.9), Pub Date: 2020-02-01, DOI: 10.1109/tkde.2018.2883532
Boyi Hou, Qun Chen, Zhaoqiang Chen, Youcef Nafa, Zhanhuai Li

Even though many approaches have been proposed for entity resolution (ER), it remains very challenging to enforce quality guarantees. To this end, we propose a risk-aware HUman-Machine cOoperation framework for ER, denoted by r-HUMO. Built on the existing HUMO framework, r-HUMO similarly enforces both precision and recall guarantees by partitioning an ER workload between the human and the machine. However, r-HUMO is the first solution that optimizes the process of human workload selection from a risk perspective. It iteratively selects human workload by real-time risk analysis based on the human-labeled results as well as the pre-specified machine metric. In this paper, we first introduce the r-HUMO framework and then present the risk model to prioritize the instances for manual inspection. Finally, we empirically evaluate r-HUMO's performance on real data. Our extensive experiments show that r-HUMO is effective in enforcing quality guarantees, and compared with the state-of-the-art alternatives, it can achieve desired quality control with reduced human cost.
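
To make the idea of iterative, risk-driven workload selection concrete, here is a minimal Python sketch. It is not the risk model from the paper: it assumes, purely for illustration, that each machine similarity score is a calibrated match probability, that "risk" is closeness of the score to 0.5, and that human labels are always correct. The names `estimate_quality`, `select_human_workload`, `oracle`, and `batch_size` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def estimate_quality(scores: List[float],
                     human_labels: Dict[int, bool]) -> Tuple[float, float]:
    """Expected precision and recall of the combined human + machine labeling.

    Assumption (not from the paper): each score is a calibrated probability
    that the pair is a true match, and human labels are always correct.
    """
    human_matches = sum(1 for is_match in human_labels.values() if is_match)
    true_pos = float(human_matches)   # expected correctly labeled matches
    predicted = float(human_matches)  # pairs labeled "match" by anyone
    actual = float(human_matches)     # expected number of true matches
    for i, p in enumerate(scores):
        if i in human_labels:
            continue
        actual += p
        if p >= 0.5:                  # machine labels this pair as a match
            predicted += 1
            true_pos += p
    precision = true_pos / predicted if predicted else 1.0
    recall = true_pos / actual if actual else 1.0
    return precision, recall


def select_human_workload(scores: List[float],
                          oracle: Callable[[int], bool],
                          target_precision: float,
                          target_recall: float,
                          batch_size: int = 1):
    """Send the riskiest pairs to the human until both targets are met."""
    human_labels: Dict[int, bool] = {}
    while True:
        precision, recall = estimate_quality(scores, human_labels)
        if precision >= target_precision and recall >= target_recall:
            break
        remaining = [i for i in range(len(scores)) if i not in human_labels]
        if not remaining:
            break
        # Toy risk proxy: pairs whose score is closest to 0.5 come first.
        remaining.sort(key=lambda i: abs(scores[i] - 0.5))
        for i in remaining[:batch_size]:
            human_labels[i] = oracle(i)
    return human_labels, estimate_quality(scores, human_labels)


# Usage with made-up scores and ground truth.
scores = [0.97, 0.9, 0.6, 0.52, 0.45, 0.15, 0.02]
truth = [True, True, True, False, False, False, False]
labels, (precision, recall) = select_human_workload(
    scores, lambda i: truth[i], target_precision=0.9, target_recall=0.9)
print(sorted(labels), round(precision, 3), round(recall, 3))
```

In this toy run only the three most ambiguous pairs are sent for manual inspection before both estimated targets are reached, which mirrors the abstract's claim of quality control at reduced human cost without reproducing the paper's actual risk analysis.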

Updated: 2020-02-01