PSB2: The Second Program Synthesis Benchmark Suite,arXiv - CS - Software Engineering


PSB2: The Second Program Synthesis Benchmark Suite
arXiv - CS - Software Engineering, Pub Date: 2021-06-10, DOI: arxiv-2106.06086
Thomas Helmuth, Peter Kelly

For the past six years, researchers in genetic programming and other program synthesis disciplines have used the General Program Synthesis Benchmark Suite to benchmark many aspects of automatic program synthesis systems. These problems have been used to make notable progress toward the goal of general program synthesis: automatically creating the types of software that human programmers code. Many of the systems that have attempted the problems in the original benchmark suite have used it to demonstrate performance improvements granted through new techniques. Over time, the suite has gradually become outdated, hindering the accurate measurement of further improvements. The field needs a new set of more difficult benchmark problems to move beyond what was previously possible. In this paper, we describe the 25 new general program synthesis benchmark problems that make up PSB2, a new benchmark suite. These problems are curated from a variety of sources, including programming katas and college courses. We selected these problems to be more difficult than those in the original suite, and give results using PushGP showing this increase in difficulty. These new problems give plenty of room for improvement, pointing the way for the next six or more years of general program synthesis research.


Updated: 2021-06-14
