Synthesizing safe policies under probabilistic constraints with reinforcement learning and Bayesian model checking, Science of Computer Programming



Science of Computer Programming (IF 1.5), Pub Date: 2021-02-04, DOI: 10.1016/j.scico.2021.102620
Lenz Belzner, Martin Wirsing

We propose to leverage a reinforcement learner's epistemic uncertainty about constraint satisfaction in safety-critical domains. We introduce a framework for specifying requirements for reinforcement learners in constrained settings, including the confidence required in the results. We show that an agent's confidence in constraint satisfaction provides a useful signal for balancing optimization and safety in the learning process.
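To make the idea of "confidence in constraint satisfaction" concrete, the following is a minimal sketch and not the authors' implementation. It assumes each episode is judged as either satisfying or violating the constraint, places a uniform Beta(1, 1) prior on the unknown satisfaction probability, and computes the posterior probability that this value is at least a required threshold. The names `satisfaction_confidence`, `choose_focus`, `p_req`, and `c_req` are hypothetical and do not come from the paper.

```python
from math import comb


def satisfaction_confidence(successes: int, failures: int, p_req: float) -> float:
    """Posterior probability that the satisfaction rate is >= p_req.

    Assumption: uniform Beta(1, 1) prior, so the posterior is
    Beta(successes + 1, failures + 1). For integer parameters the Beta
    tail probability equals a binomial CDF:
    P(p >= p_req) = P(Binomial(n + 1, p_req) <= successes).
    """
    n = successes + failures
    return sum(
        comb(n + 1, j) * p_req**j * (1 - p_req) ** (n + 1 - j)
        for j in range(successes + 1)
    )


def choose_focus(confidence: float, c_req: float) -> str:
    """Hypothetical rule: optimize return only when confident enough."""
    return "optimize_return" if confidence >= c_req else "improve_safety"


if __name__ == "__main__":
    # 48 episodes satisfied the constraint, 2 violated it.
    conf = satisfaction_confidence(48, 2, p_req=0.9)
    print(f"confidence = {conf:.3f}, focus = {choose_focus(conf, c_req=0.95)}")
```

In a sketch like this, the confidence value rises as more satisfying episodes are observed, so a learner could use it to decide when it has enough evidence of safety to shift effort toward optimizing return.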
