Improving Confidence in the Estimation of Values and Norms
arXiv - CS - Artificial Intelligence · Pub Date: 2020-04-02 · DOI: arxiv-2004.01056
Luciano Cavalcante Siebert, Rijk Mercuur, Virginia Dignum, Jeroen van den Hoven, Catholijn Jonker
Autonomous agents (AA) will increasingly be interacting with us in our daily
lives. While we want the benefits attached to AAs, it is essential that their
behavior is aligned with our values and norms. Hence, an AA will need to
estimate the values and norms of the humans it interacts with, which is not
straightforward when only the agent's behavior can be observed. This paper
analyses to what extent an AA is able to estimate the values and norms of a
simulated human agent (SHA) based on its actions in the ultimatum game. We
present two methods to reduce ambiguity in profiling the SHAs: one based on
search space exploration and another based on counterfactual analysis. We found
that both methods are able to increase the confidence in estimating human
values and norms, but differ in their applicability, the latter being more
efficient when the number of interactions with the agent is to be minimized.
These insights are useful to improve the alignment of AAs with human values and
norms.
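As a rough illustration of the ambiguity problem the abstract describes, consider profiling a responder in the ultimatum game: several candidate value profiles can be consistent with the same observed behavior, and an actively chosen next interaction (a counterfactual-style query) can discriminate between them. The sketch below is our own toy model, not the paper's: the profile names, acceptance thresholds, and probe-scoring rule are all illustrative assumptions.

```python
# Toy sketch (hypothetical model, not the paper's): a responder in the
# ultimatum game accepts an offer iff its share meets a profile-specific
# minimum acceptable share.

PIE = 10  # total amount to split

# Candidate value profiles: minimum share the responder will accept.
candidates = {"selfless": 1, "fair": 4, "greedy": 7}

def responds(threshold, offer):
    """Responder accepts iff the offered share meets its threshold."""
    return offer >= threshold

def consistent(observations, threshold):
    """A profile is viable if it reproduces every observed decision."""
    return all(responds(threshold, offer) == accepted
               for offer, accepted in observations)

# Passive observation: a single accepted offer leaves ambiguity.
observations = [(5, True)]  # an offer of 5 was accepted
remaining = [name for name, t in candidates.items()
             if consistent(observations, t)]
# "selfless" and "fair" both survive; only "greedy" is ruled out.

def best_probe(remaining):
    """Pick the offer whose predicted accept/reject outcome splits the
    remaining hypotheses as evenly as possible (most informative query)."""
    def split_score(offer):
        accepts = sum(responds(candidates[n], offer) for n in remaining)
        return abs(accepts - len(remaining) / 2)  # lower = better split
    return min(range(PIE + 1), key=split_score)

probe = best_probe(remaining)  # the next offer the AA should try
```

In this toy setup, asking about a low offer separates the two surviving profiles in a single extra interaction, whereas passively waiting for more observations may never disambiguate them; this mirrors, very loosely, why the counterfactual-based method is more efficient when interactions must be minimized.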
Updated: 2020-04-03
自治代理 (AA) 将越来越多地在我们的日常生活中与我们互动。虽然我们想要 AA 附带的好处,但他们的行为必须符合我们的价值观和规范。因此,AA 将需要估计与之交互的人类的价值观和规范,这在仅观察代理的行为时并不是一项简单的任务。本文分析了 AA 能够根据其在最后通牒游戏中的行为来估计模拟人类代理 (SHA) 的值和规范的程度。我们提出了两种方法来减少分析 SHA 时的歧义:一种基于搜索空间探索,另一种基于反事实分析。我们发现这两种方法都能够增加估计人类价值观和规范的信心,但它们的适用性不同,当与代理的交互次数最少时,后者更有效。这些见解有助于改善 AA 与人类价值观和规范的一致性。