Faster sorting algorithms discovered using deep reinforcement learning
Nature (IF 50.5) Pub Date: 2023-06-07, DOI: 10.1038/s41586-023-06004-9
Daniel J Mankowitz₁, Andrea Michi₁, Anton Zhernov₁, Marco Gelmi₁, Marco Selvi₁, Cosmin Paduraru₁, Edouard Leurent₁, Shariq Iqbal₁, Jean-Baptiste Lespiau₁, Alex Ahern₁, Thomas Köppe₁, Kevin Millikin₁, Stephen Gaffney₁, Sophie Elster₁, Jackson Broshear₁, Chris Gamble₁, Kieran Milan₁, Robert Tung₁, Minjae Hwang₂, Taylan Cemgil₁, Mohammadamin Barekatain₁, Yujia Li₁, Amol Mandhane₁, Thomas Hubert₁, Julian Schrittwieser₁, Demis Hassabis₁, Pushmeet Kohli₁, Martin Riedmiller₁, Oriol Vinyals₁, David Silver₁

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day¹. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past², making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library³. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
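To make "small sorting algorithms" concrete, here is a minimal Python sketch of a fixed-length sort for three values, together with an exhaustive correctness check of the kind one might use to judge a candidate routine. It is an illustration only: `sort3` is a textbook three-step compare-exchange sequence, not one of the routines AlphaDev discovered, and the names `sort3` and `is_correct` are hypothetical helpers introduced here, not taken from the paper.

```python
from itertools import permutations


def sort3(a, b, c):
    """Sort three values with a fixed sequence of three compare-exchange steps."""
    if a > b:          # order the first pair
        a, b = b, a
    if b > c:          # move the largest value to the last position
        b, c = c, b
    if a > b:          # re-order the first pair after the second step
        a, b = b, a
    return a, b, c


def is_correct(sort_fn, n):
    """Check a fixed-length sort on every ordering of n distinct values."""
    return all(
        list(sort_fn(*p)) == sorted(p)
        for p in permutations(range(n))
    )
```

Because the input length is fixed, the candidate can be checked on all orderings of the inputs rather than on a sample; for example, `is_correct(sort3, 3)` runs `sort3` on all six orderings of `0, 1, 2`.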

Updated: 2023-06-08