A Multiarmed Bandit Approach to Adaptive Water Quality Management. Integrated Environmental Assessment and Management



Integrated Environmental Assessment and Management (IF 3.1), Pub Date: 2020-06-25, DOI: 10.1002/ieam.4302
David M Martin, Fred A Johnson


Nonpoint source water quality management is challenged with allocating management actions whose outcomes are uncertain and monitoring their performance in the absence of state‐dependent decision making. This adaptive management context can be expressed as a multiarmed bandit problem. Multiarmed bandit strategies attempt to balance the exploitation of actions that appear to maximize performance with the exploration of uncertain, but potentially better, actions. We tested multiarmed bandit strategies to inform adaptive water quality management in Massachusetts, USA. Conservation and restoration practitioners were tasked with allocating household wastewater treatments to minimize nitrogen (N) inputs to impaired waters. We obtained time series of N monitoring data from 3 wastewater treatment types and organized them both chronologically and randomly. The chronological data set represented nonstationary performance based on recent monitoring data, whereas the random data set represented stationary performance. We tested 2 multiarmed bandit strategies in hypothetical experiments that sampled from the treatment data through 20 sequential decisions. A deterministic probability‐matching strategy allocated treatments with the highest probability of success regarding their performance at each decision. A randomized probability‐matching strategy randomly allocated treatments according to their probability of success at each decision. The strategies were compared with a nonadaptive strategy that equally allocated treatments at each decision. Results indicated that equal allocation was useful for learning in nonstationary situations but tended to overexplore inferior treatments and thus did not maximize performance when compared with the other strategies. Deterministic probability matching maximized performance in many stationary situations, but the strategy did not adequately explore treatments and converged on inferior treatments in nonstationary situations. Randomized probability matching balanced performance and learning in stationary situations, but the strategy could converge on inferior treatments in nonstationary situations. These findings provide evidence that probability‐matching strategies are useful for adaptive management.

Integr Environ Assess Manag 2020;16:841–852. © 2020 The Authors. Integrated Environmental Assessment and Management published by Wiley Periodicals LLC on behalf of Society of Environmental Toxicology & Chemistry (SETAC).
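
The three allocation strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simplified Beta-Bernoulli model in which each decision allocates a single treatment and observes a binary "success" (for example, monitored N falling below some target), whereas the study sampled from real N monitoring time series. The function names, the uniform Beta(1, 1) prior, the round-robin reading of equal allocation, and the `true_rates` values are all hypothetical, and fixed success rates correspond only to the stationary case.

```python
import random


def prob_best(successes, failures, rng, draws=2000):
    """Monte Carlo estimate of P(arm has the highest success rate).

    Assumes independent Beta(1 + s, 1 + f) posteriors (uniform prior).
    """
    wins = [0] * len(successes)
    for _ in range(draws):
        samples = [rng.betavariate(1 + s, 1 + f)
                   for s, f in zip(successes, failures)]
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


def choose(strategy, t, successes, failures, rng):
    """Pick one treatment (arm index) for decision number t."""
    n_arms = len(successes)
    if strategy == "equal":
        # Nonadaptive baseline: cycle through treatments in turn.
        return t % n_arms
    if strategy == "deterministic":
        # Deterministic probability matching: always take the arm
        # currently most likely to be the best.
        p = prob_best(successes, failures, rng)
        return p.index(max(p))
    if strategy == "randomized":
        # Randomized probability matching: one posterior draw per arm,
        # so each arm is chosen with probability P(arm is best).
        samples = [rng.betavariate(1 + s, 1 + f)
                   for s, f in zip(successes, failures)]
        return samples.index(max(samples))
    raise ValueError(f"unknown strategy: {strategy}")


def run(strategy, true_rates, n_decisions=20, seed=0):
    """Simulate sequential decisions; return (choices, total successes)."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    choices = []
    for t in range(n_decisions):
        arm = choose(strategy, t, successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        choices.append(arm)
    return choices, sum(successes)


if __name__ == "__main__":
    rates = [0.3, 0.5, 0.8]  # hypothetical success rates for 3 treatments
    for name in ("equal", "deterministic", "randomized"):
        choices, total = run(name, rates)
        print(name, choices, total)
```

Handling the nonstationary case would require letting `true_rates` change over the 20 decisions, or discounting older observations. The sketch does neither.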


Updated: 2020-06-25
