Markov decision processes with dynamic transition probabilities: An analysis of shooting strategies in basketball
Annals of Applied Statistics (IF 1.3), Pub Date: 2020-09-18, DOI: 10.1214/20-aoas1348
Nathan Sandholtz, Luke Bornn

In this paper we model basketball plays as episodes from team-specific nonstationary Markov decision processes (MDPs) with shot clock dependent transition probabilities. Bayesian hierarchical models are employed in the modeling and parametrization of the transition probabilities to borrow strength across players and through time. To enable computational feasibility, we combine lineup-specific MDPs into team-average MDPs using a novel transition weighting scheme. Specifically, we derive the dynamics of the team-average process such that the expected transition count for an arbitrary state-pair is equal to the weighted sum of the expected counts of the separate lineup-specific MDPs. We then utilize these nonstationary MDPs in the creation of a basketball play simulator with uncertainty propagated via posterior samples of the model components. After calibration, we simulate seasons both on-policy and under altered policies and explore the net changes in efficiency and production under the alternate policies. Additionally, we discuss the game-theoretic ramifications of testing alternative decision policies.
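
The count-matching property behind the team-average MDP can be illustrated with a deliberately simplified sketch. It is not the authors' derivation. It assumes a stationary absorbing Markov chain with no actions and no shot clock, a shared initial state distribution, and lineup weights that sum to one. The function names, the toy matrices, and the weights are all hypothetical. Under those assumptions, averaging the expected transition counts and renormalizing each row by the weighted expected visits gives a team-average chain whose expected counts equal the weighted sum of the lineup-level counts.

```python
import numpy as np

def expected_visits(Q, mu):
    """Expected visits to each transient state of an absorbing chain.

    Q  : (k, k) transient-to-transient transition matrix (rows sum to < 1).
    mu : (k,) initial distribution over transient states.
    Solves n^T = mu^T (I - Q)^{-1}.
    """
    k = Q.shape[0]
    return np.linalg.solve((np.eye(k) - Q).T, mu)

def team_average(Qs, weights, mu):
    """Combine lineup-specific chains into one team-average chain.

    Assumes the weights sum to one and every state has positive expected
    visits (otherwise the row normalization below divides by zero).
    Returns the averaged transition matrix and the weighted expected counts.
    """
    weights = np.asarray(weights, dtype=float)
    Qs = np.asarray(Qs, dtype=float)                            # (L, k, k)
    visits = np.array([expected_visits(Q, mu) for Q in Qs])     # (L, k)
    counts = visits[:, :, None] * Qs                            # (L, k, k)
    avg_counts = np.tensordot(weights, counts, axes=1)          # (k, k)
    avg_visits = weights @ visits                               # (k,)
    return avg_counts / avg_visits[:, None], avg_counts

# Hypothetical two-state example with two lineups.
mu = np.array([1.0, 0.0])
Q_a = np.array([[0.2, 0.5],
                [0.3, 0.1]])
Q_b = np.array([[0.1, 0.3],
                [0.4, 0.2]])
weights = [0.7, 0.3]

Q_bar, target_counts = team_average([Q_a, Q_b], weights, mu)

# Expected transition counts produced by the team-average chain itself.
realized_counts = expected_visits(Q_bar, mu)[:, None] * Q_bar
print(np.allclose(realized_counts, target_counts))
```

Note that the averaged matrix is not a plain weighted mean of the lineup matrices: each row is reweighted by how often each lineup is expected to visit that state, which is what makes the counts line up.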

Updated: 2020-11-18