Exact Converging Bounds for Stochastic Dual Dynamic Programming via Fenchel Duality, SIAM Journal on Optimization


SIAM Journal on Optimization (IF 3.1), Pub Date: 2020-04-28, DOI: 10.1137/19m1258876
Vincent Leclère, Pierre Carpentier, Jean-Philippe Chancelier, Arnaud Lenoir, François Pacaud

SIAM Journal on Optimization, Volume 30, Issue 2, Pages 1223-1250, January 2020.
The stochastic dual dynamic programming (SDDP) algorithm has become one of the main tools used to address convex multistage stochastic optimal control problems. Recently a large amount of work has been devoted to improving the convergence speed of the algorithm through cut selection and regularization, and to extending the field of applications to nonlinear, integer, or risk-averse problems. However, one of the main downsides of the algorithm remains the difficulty in giving an upper bound of the optimal value, usually estimated through Monte Carlo methods and therefore difficult to use in the stopping criterion of the algorithm. In this paper we present a dual SDDP algorithm that yields a converging exact upper bound for the optimal value of the optimization problem. As an easy consequence of our approach, we show how to compute an alternative control policy based on an inner approximation of Bellman value functions instead of the outer approximation given by the standard SDDP algorithm. We illustrate the approach on an energy production problem involving zones of production and transportation links between the zones. The numerical experiments we carry out on this example show the effectiveness of the method.
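To make the distinction between outer and inner approximations concrete, here is a minimal one-dimensional sketch. It is not the dual SDDP algorithm of the paper: the function `f(x) = x**2` merely stands in for a convex Bellman value function, and the names `f`, `df`, `outer_approx`, and `inner_approx` are hypothetical, chosen only for illustration. The outer approximation is a maximum of tangent cuts and bounds the function from below; the inner approximation interpolates between sampled values and bounds it from above on the sampled interval.

```python
from bisect import bisect_right


def f(x):
    """Toy convex function standing in for a value function (assumption)."""
    return x * x


def df(x):
    """Derivative of the toy function, used to build tangent cuts."""
    return 2.0 * x


def outer_approx(x, points):
    """Lower bound: maximum of the tangent cuts taken at the sampled points."""
    return max(f(p) + df(p) * (x - p) for p in points)


def inner_approx(x, points):
    """Upper bound: linear interpolation between neighbouring sampled values.

    For a convex function in one dimension this equals the smallest convex
    combination of sampled values whose weights reproduce x.
    """
    pts = sorted(points)
    if not pts[0] <= x <= pts[-1]:
        raise ValueError("x must lie within the range of the sampled points")
    j = bisect_right(pts, x)
    if j == len(pts):  # x coincides with the largest sampled point
        return f(pts[-1])
    lo, hi = pts[j - 1], pts[j]
    t = (x - lo) / (hi - lo)
    return (1.0 - t) * f(lo) + t * f(hi)


def gap(x, points):
    """Distance between the upper and lower bounds at x."""
    return inner_approx(x, points) - outer_approx(x, points)


if __name__ == "__main__":
    coarse = [-2.0, 0.0, 2.0]
    fine = [-2.0, -1.0, 0.0, 1.0, 2.0]
    for label, pts in (("coarse", coarse), ("fine", fine)):
        x = 0.5
        print(label, outer_approx(x, pts), f(x), inner_approx(x, pts))
```

In this toy setting both bounds are deterministic and sandwich the true value, and the gap between them shrinks as sample points are added. That is the property that makes an exact upper bound usable in a stopping criterion, in contrast to a Monte Carlo estimate.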


Updated: 2020-04-28
