Synthesizing safe policies under probabilistic constraints with reinforcement learning and Bayesian model checking
Science of Computer Programming (IF 1.5), Pub Date: 2021-02-04, DOI: 10.1016/j.scico.2021.102620
Lenz Belzner , Martin Wirsing

We propose to leverage a reinforcement learner's epistemic uncertainty about constraint satisfaction in safety-critical domains. We introduce a framework for specifying requirements for reinforcement learners in constrained settings, including the confidence required in the results. We show that an agent's confidence in constraint satisfaction provides a useful signal for balancing optimization and safety in the learning process.
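
To make "confidence in constraint satisfaction" concrete, here is a minimal sketch and not the authors' implementation. It assumes each episode either satisfies or violates the constraint, places a uniform Beta(1, 1) prior on the unknown satisfaction probability, and computes the posterior probability that this probability is at least a required bound. The names `satisfaction_confidence`, `safety_weight`, `p_req`, and `c_req` are hypothetical, and the linear weighting rule is only one illustrative way such a signal might trade off reward against safety.

```python
from math import comb


def satisfaction_confidence(successes: int, failures: int, p_req: float) -> float:
    """Posterior probability that the true satisfaction rate is >= p_req.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(successes + 1, failures + 1). For integer parameters the Beta tail
    equals a binomial CDF: P(p >= x) = P(Binomial(n + 1, x) <= successes).
    """
    n = successes + failures + 1
    return sum(
        comb(n, j) * p_req**j * (1.0 - p_req) ** (n - j)
        for j in range(successes + 1)
    )


def safety_weight(confidence: float, c_req: float) -> float:
    """Hypothetical weighting rule (an assumption, not from the paper).

    Returns 1.0 when confidence is 0 (focus on the constraint) and
    0.0 once confidence reaches c_req (focus on reward).
    """
    return max(0.0, 1.0 - confidence / c_req)


if __name__ == "__main__":
    # 48 satisfying episodes out of 50, required satisfaction probability 0.9
    conf = satisfaction_confidence(48, 2, p_req=0.9)
    print(conf, safety_weight(conf, c_req=0.95))
```

Under these assumptions, more observed satisfying episodes raise the confidence and lower the weight on the constraint signal, so the learner shifts toward optimizing reward only once it is sufficiently sure the requirement holds.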




Updated: 2021-02-19