Alleviating overfitting in transformation-interaction-rational symbolic regression with multi-objective optimization
Genetic Programming and Evolvable Machines (IF 2.6), Pub Date: 2023-10-20, DOI: 10.1007/s10710-023-09461-3
Fabrício Olivetti de França

Transformation-Interaction-Rational is a representation for symbolic regression that limits the search space of functions to the ratio of two nonlinear functions, each defined as a linear regression of transformed variables. The main objective of this representation is to bias the search toward simpler expressions while keeping the approximation power of standard approaches. The performance of Genetic Programming with this representation was substantially better than with its predecessor (Interaction-Transformation) and ranked close to the state of the art on a contemporary Symbolic Regression benchmark. On a closer look at these results, we observed that the performance could be further improved with an additional selective pressure for smaller expressions when the dataset contains just a few data points. The introduction of a penalization term applied to the fitness measure improved the results on these smaller datasets. One problem with this approach is that it introduces two additional hyperparameters: (i) a criterion for when the penalization should be activated and (ii) the amount of penalization applied to the fitness function. One possible solution to alleviate the additional burden of correctly setting these hyperparameters is to pose the search as a multi-objective optimization problem that minimizes both the approximation error and the expression size. The main idea is that the selective pressure of finding non-dominated solutions will return the simplest model for each particular approximation error in the Pareto front. In this paper, we extend Transformation-Interaction-Rational to support multi-objective optimization, specifically the NSGA-II algorithm, and apply it to the same benchmark. A detailed analysis of the results shows that the use of multi-objective optimization benefits the overall performance on a subset of the benchmarks while keeping the results similar to the single-objective approach on the remainder of the datasets. For the small datasets specifically, we observe a small (and statistically insignificant) improvement in the results, suggesting that further strategies must be explored.
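
The idea of keeping the simplest model for each level of approximation error can be illustrated with a minimal sketch of a non-dominated filter over two objectives (error and expression size), both minimized. This is not the paper's implementation and it is not the full NSGA-II algorithm, which additionally ranks the population into successive fronts and uses a crowding-distance measure. The `Model` tuple layout, the function names, and the candidate values below are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical record: (label, approximation error, expression size).
Model = Tuple[str, float, int]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep the models that no other model dominates, sorted by expression size."""
    front = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(front, key=lambda m: m[2])


# Made-up candidates, not results from the paper.
candidates = [
    ("A", 0.10, 12),  # same error as B but larger, so B dominates it
    ("B", 0.10, 7),
    ("C", 0.25, 3),
    ("D", 0.30, 5),   # worse than C on both objectives
    ("E", 0.05, 20),  # most accurate, also the largest
]

front = pareto_front(candidates)
for label, error, size in front:
    print(f"{label}: error={error:.2f}, size={size}")
```

In this toy setting, no penalization threshold or penalty weight has to be chosen: a larger expression survives only when it buys a lower error.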




Updated: 2023-10-20