Representing, learning, and controlling complex object interactions.
Autonomous Robots (IF 3.7), Pub Date: 2018-04-30, DOI: 10.1007/s10514-018-9740-7
Yilun Zhou, Benjamin Burchfiel, George Konidaris

We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive a car can only change the car’s pose indirectly via the steering wheel, and must represent and reason about the relationship between its own grippers and the steering wheel, and the relationship between the steering wheel and the car. We formalize these interactions as chains and graphs of Markov decision processes (MDPs) and show how such models can be learned from data. We also consider how they can be controlled given known or learned dynamics. We show that our complex model can be collapsed into a single MDP and solved to find an optimal policy for the combined system. Since the resulting MDP may be very large, we also introduce a planning algorithm that efficiently produces a potentially suboptimal policy. We apply these models to two systems in which a robot uses learning from demonstration to achieve indirect control: playing a computer game using a joystick, and using a hot water dispenser to heat a cup of water.
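
The idea of collapsing a chain of MDPs into a single MDP can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-link chain (a joystick driving a cursor), the rule that the joystick's new position selects the cursor's action, the state sizes, the reward, and the use of plain value iteration as the solver. The abstract's separate, more efficient planning algorithm is not shown.

```python
import itertools

# Hypothetical two-link chain: robot action -> joystick state -> cursor state.
JOYSTICK = range(3)          # 0 = left, 1 = center, 2 = right
CURSOR = range(5)            # cursor cells 0..4
ROBOT_ACTIONS = (-1, 0, 1)   # push joystick left, hold, push right
GOAL = 4                     # assumed goal: cursor in the rightmost cell
GAMMA = 0.95


def clip(x, lo, hi):
    return max(lo, min(hi, x))


def step(state, action):
    """Collapsed dynamics over the joint state (joystick, cursor)."""
    j, c = state
    if c == GOAL:                      # goal is absorbing with zero reward
        return state, 0.0
    j2 = clip(j + action, 0, 2)        # first link: robot moves the joystick
    c2 = clip(c + (j2 - 1), 0, 4)      # second link: joystick moves the cursor
    return (j2, c2), -1.0              # assumed cost of one per step


def value_iteration(tol=1e-9):
    """Solve the collapsed MDP; returns (values, greedy policy)."""
    states = list(itertools.product(JOYSTICK, CURSOR))
    V = {s: 0.0 for s in states}

    def q(s, a):
        s2, r = step(s, a)
        return r + GAMMA * V[s2]

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in ROBOT_ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(ROBOT_ACTIONS, key=lambda a: q(s, a)) for s in states}
    return V, policy


def rollout(policy, state, max_steps=20):
    """Follow the policy until the cursor reaches the goal."""
    steps = 0
    while state[1] != GOAL and steps < max_steps:
        state, _ = step(state, policy[state])
        steps += 1
    return state, steps
```

The joint state space here is the product of the per-link state spaces (3 × 5 = 15 states), which hints at why a collapsed MDP can become very large as links are added.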

Updated: 2018-04-30