Achieving Diverse Objectives with AI-driven Prices in Deep Reinforcement Learning Multi-agent Markets
arXiv - CS - Computer Science and Game Theory, Pub Date: 2021-06-10, DOI: arxiv-2106.06060
Panayiotis Danassis, Aris Filos-Ratsikas, Boi Faltings

We propose a practical approach to computing market prices and allocations via a deep reinforcement learning policymaker agent, operating in an environment of other learning agents. Compared to the idealized market equilibrium outcome, which we use as a benchmark, our policymaker is much more flexible, allowing us to tune the prices with regard to diverse objectives such as sustainability and resource wastefulness, fairness, and buyers' and sellers' welfare. To evaluate our approach, we design a realistic market with multiple and diverse buyers and sellers. Additionally, the sellers, which are deep learning agents themselves, compete for resources in a common-pool appropriation environment based on bio-economic models of commercial fisheries. We demonstrate that: (a) the introduced policymaker is able to achieve comparable performance to the market equilibrium, showcasing the potential of such approaches in markets where the equilibrium prices cannot be efficiently computed; (b) our policymaker can notably outperform the equilibrium solution on certain metrics, while at the same time maintaining comparable performance for the remaining ones; and (c) as a highlight of our findings, our policymaker is significantly more successful in maintaining resource sustainability, compared to the market outcome, in scarce resource environments.

Updated: 2021-06-14