


An extended ϵ-constraint method for a multiobjective finite-horizon Markov decision process
International Transactions in Operational Research (IF 3.1), Pub Date: 2021-05-04, DOI: 10.1111/itor.12989
Maryam Eghbali‐Zarch ¹, Reza Tavakkoli‐Moghaddam ¹, Amir Azaron ²,³, Kazem Dehghan‐Sanej ⁴


A Markov decision process (MDP) is an appropriate mathematical framework for analyzing and modeling a large class of sequential decision-making problems. Real-world applications require evaluating a decision against several conflicting objectives. This paper presents an extended ϵ-constraint method for a multiobjective finite-horizon MDP. The method integrates the ϵ-constraint method with the K-best policies algorithm to find the nondominated deterministic Markovian policies on the Pareto-optimal frontier. The proposed algorithm is evaluated on biobjective maintenance scheduling and machine running speed selection problems, and its performance is compared with a classic approach from the literature, the weighted-sum (WS) method. The results show that the proposed algorithm obtains a good-quality Pareto frontier and has advantages over the WS method.
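The idea of sweeping an ϵ bound on one objective while minimizing the other can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it replaces the K-best policies step with brute-force enumeration of every deterministic Markovian policy, which is only feasible for very small instances. All data (the transition matrices `P`, the cost tables `C1` and `C2`, the horizon, and the start state) are made-up assumptions, not values from the paper.

```python
from itertools import product

# Hypothetical toy data (not from the paper): 2 states, 2 actions, horizon 2.
# P[a][s][s2] is the probability of moving from s to s2 under action a.
P = {
    0: [[0.9, 0.1], [0.0, 1.0]],   # action 0: keep running
    1: [[1.0, 0.0], [0.8, 0.2]],   # action 1: maintain
}
C1 = [[1.0, 4.0], [6.0, 4.0]]      # objective 1 per-stage cost, indexed [s][a]
C2 = [[0.0, 2.0], [1.0, 3.0]]      # objective 2 per-stage cost, indexed [s][a]
STATES, ACTIONS, HORIZON, START = (0, 1), (0, 1), 2, 0


def evaluate(policy, cost):
    """Expected total cost of a deterministic Markovian policy from START."""
    value = [0.0 for _ in STATES]  # terminal values are assumed to be zero
    for t in reversed(range(HORIZON)):
        # Backward recursion: the right-hand side reads the old `value` list.
        value = [
            cost[s][policy[t][s]]
            + sum(P[policy[t][s]][s][s2] * value[s2] for s2 in STATES)
            for s in STATES
        ]
    return value[START]


def all_policies():
    """Enumerate every policy, i.e. every mapping (t, s) -> a."""
    rules = list(product(ACTIONS, repeat=len(STATES)))
    return list(product(rules, repeat=HORIZON))


def epsilon_constraint_front():
    """Minimize objective 1 subject to objective 2 <= eps, for each eps."""
    scored = [(evaluate(p, C1), evaluate(p, C2), p) for p in all_policies()]
    front = {}
    # Candidate eps values: every attainable value of objective 2.
    for eps in sorted({f2 for _, f2, _ in scored}):
        feasible = [x for x in scored if x[1] <= eps]
        # Lexicographic tie-break on objective 2 avoids weakly dominated picks.
        f1, f2, p = min(feasible, key=lambda x: (x[0], x[1]))
        front[(f1, f2)] = p
    return front
```

Calling `epsilon_constraint_front()` returns a dictionary that maps each nondominated objective pair to one policy attaining it, where a policy is a tuple of per-stage decision rules. The lexicographic tie-break is a design choice of this sketch: without it, a plain minimization of objective 1 could return a policy that is only weakly nondominated.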


Updated: 2021-05-04
