Applying genetic programming to PSB2: the next generation program synthesis benchmark suite
Genetic Programming and Evolvable Machines (IF 2.6), Pub Date: 2022-06-01, DOI: 10.1007/s10710-022-09434-y
Thomas Helmuth, Peter Kelly

For the past seven years, researchers in genetic programming and other program synthesis disciplines have used the General Program Synthesis Benchmark Suite (PSB1) to benchmark many aspects of systems that conduct programming by example, where the specifications of the desired program are given as input/output pairs. PSB1 has been used to make notable progress toward the goal of general program synthesis: automatically creating the types of software that human programmers code. Many of the systems that have attempted the problems in PSB1 have used it to demonstrate performance improvements granted through new techniques. Over time, the suite has gradually become outdated, hindering the accurate measurement of further improvements. The field needs a new set of more difficult benchmark problems to move beyond what was previously possible and ensure that systems do not overfit to one benchmark suite. In this paper, we describe the 25 new general program synthesis benchmark problems that make up PSB2, a new benchmark suite. These problems are curated from a variety of sources, including programming katas and college courses. We selected these problems to be more difficult than those in the original suite, and give results using PushGP showing this increase in difficulty. We additionally give an example of benchmarking using a state-of-the-art parent selection method, showing improved performance on PSB2 while still leaving plenty of room for improvement. These new problems will help guide program synthesis research for years to come.
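As a rough illustration of what "specifications given as input/output pairs" means in practice, the sketch below scores candidate programs against a small set of examples. It is a minimal sketch only: the toy task (doubling an integer), the names `EXAMPLES`, `error_vector`, `is_solution`, `candidate_a`, and `candidate_b`, and the 0/1 error measure are all assumptions made for illustration. They are not taken from PSB2, PushGP, or the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy specification (not a PSB2 problem): the desired
# program doubles its integer input. Each pair is (input, expected output).
Example = Tuple[int, int]
EXAMPLES: Sequence[Example] = [(0, 0), (1, 2), (5, 10), (-3, -6)]


def error_vector(program: Callable[[int], int],
                 examples: Sequence[Example]) -> List[int]:
    """Return one error per example: 0 if the output matches, 1 otherwise.

    A candidate that raises an exception on an example is counted as
    wrong on that example (an assumption of this sketch).
    """
    errors = []
    for given_input, expected_output in examples:
        try:
            actual = program(given_input)
        except Exception:
            errors.append(1)
            continue
        errors.append(0 if actual == expected_output else 1)
    return errors


def is_solution(program: Callable[[int], int],
                examples: Sequence[Example]) -> bool:
    """A candidate counts as a solution here if it passes every example."""
    return sum(error_vector(program, examples)) == 0


def candidate_a(x: int) -> int:
    # Satisfies every example in EXAMPLES.
    return x + x


def candidate_b(x: int) -> int:
    # Matches only the (0, 0) example.
    return x * x


if __name__ == "__main__":
    for candidate in (candidate_a, candidate_b):
        print(candidate.__name__,
              error_vector(candidate, EXAMPLES),
              is_solution(candidate, EXAMPLES))
```

Keeping one error per example, rather than a single aggregate score, is a design choice of this sketch. It preserves information about which examples a candidate handles, which a search procedure could use when choosing among candidates.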




Updated: 2022-06-02