Strength Adjustment and Assessment for MCTS-Based Programs [Research Frontier]
IEEE Computational Intelligence Magazine (IF 9) Pub Date: 2020-08-01, DOI: 10.1109/mci.2020.2998315
An-Jen Liu, Ti-Rong Wu, I-Chen Wu, Hung Guei, Ting-Han Wei

This paper proposes an approach to strength adjustment and assessment for game-playing programs based on Monte-Carlo tree search (MCTS). We modify an existing softmax policy with a strength index to choose moves. The most important modification is a mechanism that filters out low-quality moves by excluding those whose simulation count is lower than a pre-defined threshold ratio of the maximum simulation count. Through theoretical analysis, we show that, by using a threshold ratio, the adjusted policy is guaranteed to choose moves exceeding a lower bound in strength. Experimental results show that the strength index is highly correlated with the empirical strength. With an index value between ±2, we can cover a strength range of about 800 Elo ratings. The strength adjustment and assessment methods were also tested in real-world scenarios with human players, ranging from professionals (strongest) to kyu-rank amateurs (weakest). For amateur levels, we tested our mechanism on two popular online Go platforms, Fox Weiqi and Tygem. The results show that our method can stably adjust program strength to different ranks. In terms of strength assessment, we propose a new dynamic strength adjustment method and use it to evaluate human professionals, reliably predicting their playing strengths within 15 games. Lastly, we collected survey responses asking players about strength perception, entertainment, and general comments for different aspects of analysis. To the best of our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
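
The move-selection idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the policy formula, so the sketch assumes the softmax weight of a move is its simulation count raised to the power of the strength index. The names `adjusted_policy`, `choose_move`, `strength_index`, and `threshold_ratio` are hypothetical.

```python
import random


def adjusted_policy(sim_counts, strength_index, threshold_ratio):
    """Return a dict mapping each surviving move to its selection probability.

    sim_counts: dict of move -> MCTS simulation count at the root.
    strength_index: assumed exponent on the counts (higher = stronger play).
    threshold_ratio: moves with fewer simulations than this fraction of the
        maximum simulation count are excluded.
    """
    if not sim_counts:
        raise ValueError("sim_counts must not be empty")
    if not 0.0 <= threshold_ratio <= 1.0:
        raise ValueError("threshold_ratio must lie in [0, 1]")
    max_count = max(sim_counts.values())
    if max_count <= 0:
        raise ValueError("at least one move needs a positive simulation count")

    # Filter step: drop moves below the threshold ratio of the maximum count.
    cutoff = threshold_ratio * max_count
    candidates = {m: n for m, n in sim_counts.items() if n > 0 and n >= cutoff}

    # Assumed weighting: (count / max_count) ** strength_index.
    # Dividing by max_count does not change the normalized probabilities.
    weights = {m: (n / max_count) ** strength_index for m, n in candidates.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def choose_move(sim_counts, strength_index, threshold_ratio, rng=random):
    """Sample one move from the adjusted policy."""
    policy = adjusted_policy(sim_counts, strength_index, threshold_ratio)
    moves = list(policy)
    return rng.choices(moves, weights=[policy[m] for m in moves], k=1)[0]


if __name__ == "__main__":
    counts = {"D4": 100, "Q16": 50, "A1": 5}  # made-up simulation counts
    for z in (-2.0, 0.0, 2.0):
        print(z, adjusted_policy(counts, z, threshold_ratio=0.1))
```

Under this assumed form, an index of 0 gives a uniform choice among the moves that survive the filter, positive values favor the most-simulated moves, and negative values favor less-simulated ones. The filter keeps the weakest settings from selecting moves the search barely explored, which is the role the abstract assigns to the threshold ratio.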

Updated: 2020-08-01