Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models
Artificial Intelligence (IF 14.4), Pub Date: 2020-08-01, DOI: 10.1016/j.artint.2020.103291
Buser Say, Scott Sanner

In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces. In order to directly exploit this transition structure for planning, we present two novel compilations of the learned factored planning problem with BNNs based on reductions to Weighted Partial Maximum Boolean Satisfiability (FD-SAT-Plan+) as well as Binary Linear Programming (FD-BLP-Plan+). Theoretically, we show that our SAT-based Bi-Directional Neuron Activation Encoding is asymptotically the most compact encoding relative to the current literature and supports Unit Propagation (UP) -- an important property that facilitates efficiency in SAT solvers. Experimentally, we validate the computational efficiency of our Bi-Directional Neuron Activation Encoding in comparison to an existing neuron activation encoding and demonstrate the ability to learn complex transition models with BNNs. We test the runtime efficiency of both FD-SAT-Plan+ and FD-BLP-Plan+ on the learned factored planning problem showing that FD-SAT-Plan+ scales better with increasing BNN size and complexity. Finally, we present a finite-time incremental constraint generation algorithm based on generalized landmark constraints to improve the planning accuracy of our encodings through simulated or real-world interaction.
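
As a rough illustration of why a binarized neuron lends itself to a SAT encoding, the sketch below checks by brute force that a neuron's activation is equivalent to a cardinality constraint over its inputs. It is not the paper's Bi-Directional Neuron Activation Encoding, whose details the abstract does not give. The function names, the +1/-1 weights, the integer bias, and the convention that the neuron is active when its pre-activation is non-negative are all assumptions made for this example.

```python
from itertools import product


def neuron_activation(weights, bias, inputs):
    """Assumed BNN neuron: inputs are 0/1 bits read as -1/+1, weights are -1/+1.

    The neuron is taken to be active when the pre-activation is >= 0.
    """
    pre_activation = sum(w * (2 * x - 1) for w, x in zip(weights, inputs)) + bias
    return pre_activation >= 0


def cardinality_threshold(weights, bias):
    """Smallest number of agreeing inputs that makes the neuron active.

    With n inputs and a agreements, the pre-activation is 2*a - n + bias,
    so the neuron is active iff a >= ceil((n - bias) / 2).
    """
    n = len(weights)
    return -((bias - n) // 2)  # integer ceiling of (n - bias) / 2


def agreeing_inputs(weights, inputs):
    """Count inputs whose bit matches the sign of the corresponding weight."""
    return sum(1 for w, x in zip(weights, inputs) if (w == 1) == (x == 1))


def check_equivalence(weights, bias):
    """Brute-force check: activation <=> at least k inputs agree with the weights."""
    k = cardinality_threshold(weights, bias)
    return all(
        neuron_activation(weights, bias, xs) == (agreeing_inputs(weights, xs) >= k)
        for xs in product((0, 1), repeat=len(weights))
    )
```

Under these assumptions, `check_equivalence([1, -1, 1], 0)` holds with a threshold of 2: the neuron is active exactly when at least two of its three inputs agree with their weights. An "at least k of n literals" condition of this kind is what a SAT or binary linear programming compilation would then have to express in clauses or linear constraints.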

Updated: 2020-08-01