A Socially Aware Reinforcement Learning Agent for The Single Track Road Problem
arXiv - CS - Human-Computer Interaction, Pub Date: 2021-09-12, DOI: arxiv-2109.05486, Authors: Ido Shapira, Amos Azaria
We present the single track road problem. In this problem, two agents face
each other from opposite ends of a road that only one agent can pass at a
time. We focus on the scenario in which one agent is human and the other is
an autonomous agent. We run experiments with human subjects in a simple grid
domain that simulates the single track road problem. We show that when data
is limited, building an accurate human model is very challenging, and that a
reinforcement learning agent based on this data does not perform well in
practice. However, we show that an agent that tries to maximize a linear
combination of the human's utility and its own utility achieves a high score
and significantly outperforms other baselines, including an agent that tries
to maximize only its own utility.
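
The "linear combination" objective can be illustrated with a short sketch. The abstract does not give the exact form or the weights, so everything below is an assumption for illustration only: the convex weighting `(1 - weight) * agent + weight * human`, the names `social_utility` and `best_action`, the action labels, and the utility numbers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def social_utility(agent_utility: float, human_utility: float, weight: float) -> float:
    """Blend the two utilities linearly.

    `weight` is a hypothetical parameter in [0, 1]: 0 means the agent cares
    only about itself, 1 means it cares only about the human.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * agent_utility + weight * human_utility


def best_action(estimates: Dict[str, Tuple[float, float]], weight: float) -> str:
    """Pick the action whose blended utility is highest.

    `estimates` maps an action name to (agent utility, human utility).
    """
    return max(estimates, key=lambda a: social_utility(*estimates[a], weight))


# Made-up utility estimates for one decision point on the road.
EXAMPLE = {
    "advance": (10.0, 0.0),  # agent passes first, human has to wait
    "yield": (6.0, 8.0),     # agent pulls aside, human passes first
}

if __name__ == "__main__":
    print(best_action(EXAMPLE, weight=0.0))  # purely self-interested agent
    print(best_action(EXAMPLE, weight=0.5))  # equal weight on both utilities
```

With these made-up numbers, a weight of 0 reduces to the self-interested baseline and selects "advance", while a weight of 0.5 selects "yield" because the blended score rises from 5 to 7. How the actual agent estimates the two utilities and sets the weights is not stated in the abstract.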
Updated: 2021-09-14