To Monitor Or Not: Observing Robot's Behavior based on a Game-Theoretic Model of Trust
arXiv - CS - Computer Science and Game Theory. Pub Date: 2019-03-01, DOI: arxiv-1903.00111
Sailik Sengupta, Zahra Zahedi, Subbarao Kambhampati

In scenarios where a robot generates and executes a plan, the generated plan may be less costly for the robot to execute but incomprehensible to the human. When the human acts as a supervisor and is held accountable for the robot's plan, the human is at higher risk if the incomprehensible behavior turns out to be infeasible or unsafe. In such cases, the robot, which may be unaware of the human's exact expectations, may choose to (1) execute the most constrained plan (i.e., one preferred by all possible supervisors), incurring the added cost of highly sub-optimal behavior while the human is monitoring it, and (2) deviate to a lower-cost plan when the human looks away. While robots do not have human-like ulterior motives (such as being lazy), such behavior may arise because the robot has to cater to the needs of different human supervisors; in such settings, the robot, being a rational agent, should take any chance it gets to deviate to a lower-cost plan. On the other hand, continuously monitoring the robot's behavior is often difficult for humans because it consumes valuable resources (e.g., time, cognitive effort). Thus, to optimize the cost of monitoring while ensuring the robot follows safe behavior, we model this problem in the game-theoretic framework of trust. In settings where the human does not initially trust the robot, the pure-strategy Nash equilibrium provides a useful policy for the human.
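To make the monitor-or-not decision concrete, the sketch below casts it as a 2x2 normal-form game and enumerates pure-strategy Nash equilibria by checking mutual best responses. The strategy names and payoff numbers are hypothetical illustrations, not the paper's model or values; they are chosen so that the supervisor's accountability risk and the robot's penalty for being caught deviating outweigh the monitoring cost, making (monitor, safe_plan) the unique pure equilibrium under these assumptions.

# Illustrative sketch only: a 2x2 game with hypothetical payoffs, not the
# paper's actual model. The human (row player) chooses whether to monitor;
# the robot (column player) chooses the constrained "safe" plan or its
# cheaper, possibly incomprehensible plan.

HUMAN = ["monitor", "not_monitor"]
ROBOT = ["safe_plan", "cheap_plan"]

# payoff[(h, r)] = (human_utility, robot_utility). Hypothetical values:
#   monitoring costs the human 2 but yields verification value 3 when the
#   robot complies (net +1); catching a deviation costs the human 2; an
#   unobserved deviation costs the accountable human 10; the safe plan costs
#   the robot 4 more than the cheap plan; being caught deviating costs the
#   robot a penalty of 6.
payoff = {
    ("monitor",     "safe_plan"):  (  1, -4),
    ("monitor",     "cheap_plan"): ( -2, -6),
    ("not_monitor", "safe_plan"):  (  0, -4),
    ("not_monitor", "cheap_plan"): (-10,  0),
}

def pure_nash_equilibria():
    """Return all strategy profiles where neither player gains by a unilateral deviation."""
    equilibria = []
    for h in HUMAN:
        for r in ROBOT:
            u_h, u_r = payoff[(h, r)]
            human_best = all(u_h >= payoff[(h_alt, r)][0] for h_alt in HUMAN)
            robot_best = all(u_r >= payoff[(h, r_alt)][1] for r_alt in ROBOT)
            if human_best and robot_best:
                equilibria.append((h, r))
    return equilibria

if __name__ == "__main__":
    print(pure_nash_equilibria())   # with these numbers: [('monitor', 'safe_plan')]

With other payoff choices (for instance, if monitoring yields the supervisor no value once the robot complies), no pure-strategy equilibrium may exist, which is one reason the trust-based formulation and the human's initial trust level matter.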

Updated: 2020-01-27