Learning From Demonstrations Using Signal Temporal Logic in Stochastic and Continuous Domains
IEEE Robotics and Automation Letters (IF 4.6), Pub Date: 2021-06-28, DOI: 10.1109/lra.2021.3092676
Aniruddh Gopinath Puranic, Jyotirmoy Deshmukh, Stefanos Nikolaidis

Learning control policies that are safe, robust, and interpretable is a prominent challenge in developing robotic systems. Learning from demonstrations with formal logic is an emerging paradigm in reinforcement learning for estimating rewards and extracting robot control policies that address this challenge. In this approach, we assume that mission-level specifications for the robotic system are expressed in a suitable temporal logic such as Signal Temporal Logic (STL). The main idea is to automatically infer rewards from user demonstrations (which may be suboptimal or incomplete) by evaluating and ranking them with respect to the given STL specifications. In contrast to existing work that focuses on deterministic environments and discrete state spaces, in this letter we propose significant extensions that tackle stochastic environments and continuous state spaces.
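
The idea of evaluating and ranking demonstrations against an STL specification can be illustrated with a small sketch. It uses the standard quantitative (robustness) semantics of STL, in which "always" takes a minimum over time, "eventually" takes a maximum, and conjunction takes a minimum. This is not the authors' algorithm: the specification, the thresholds `obstacle` and `goal`, the one-dimensional signals, and the demonstration names are all hypothetical, and the stochastic and continuous-domain machinery of the letter is not modeled here.

```python
from typing import Callable, Dict, List, Sequence

Signal = Sequence[float]  # a 1-D trajectory sampled at discrete time steps


def always(pred: Callable[[float], float], signal: Signal) -> float:
    """Robustness of G(pred): the worst-case margin over the whole signal."""
    return min(pred(x) for x in signal)


def eventually(pred: Callable[[float], float], signal: Signal) -> float:
    """Robustness of F(pred): the best-case margin over the whole signal."""
    return max(pred(x) for x in signal)


def robustness(signal: Signal, obstacle: float = 0.0, goal: float = 5.0) -> float:
    """Robustness of the hypothetical spec G(x > obstacle) AND F(x >= goal).

    A positive value means the signal satisfies the spec with that margin;
    a negative value means it violates the spec by that margin.
    """
    safe = always(lambda x: x - obstacle, signal)
    reach = eventually(lambda x: x - goal, signal)
    return min(safe, reach)  # conjunction is the minimum of the two margins


def rank_demonstrations(demos: Dict[str, Signal]) -> List[str]:
    """Order demonstration names from most to least robust."""
    return sorted(demos, key=lambda name: robustness(demos[name]), reverse=True)


# Hypothetical demonstrations (illustrative data only).
demos = {
    "good": [1.0, 2.0, 4.0, 6.0],     # stays safe and reaches the goal
    "short": [1.0, 2.0, 3.0, 4.0],    # stays safe but never reaches the goal
    "unsafe": [1.0, -2.0, 3.0, 6.0],  # reaches the goal but violates safety
}
ranking = rank_demonstrations(demos)
print(ranking)
```

In a sketch like this, the robustness margin acts as a scalar score per demonstration; how such scores are turned into rewards is specific to the letter and is not reproduced here.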

Updated: 2021-06-28