Using Decomposed Error for Reproducing Implicit Understanding of Algorithms
Evolutionary Computation (IF 6.8), Pub Date: 2024-03-01, DOI: 10.1162/evco_a_00321
Caitlin A. Owen, Grant Dick, Peter A. Whigham

Reproducibility is important for having confidence in evolutionary machine learning algorithms. Although the focus of reproducibility is usually on recreating an aggregate prediction error score using fixed random seeds, this is not sufficient. First, multiple runs of an algorithm without a fixed random seed should ideally return statistically equivalent results. Second, it should be confirmed whether the expected behaviour of an algorithm matches its actual behaviour, in terms of how the algorithm targets a reduction in prediction error. Confirming the behaviour of an algorithm is not possible when using a single aggregate error score. Using an error decomposition framework as a methodology for improving the reproducibility of results in evolutionary computation addresses both of these factors. By estimating decomposed error using multiple runs of an algorithm and multiple training sets, the framework provides a greater degree of certainty about the prediction error. Also, decomposing error into bias, variance due to the algorithm (internal variance), and variance due to the training data (external variance) more fully characterises evolutionary algorithms, allowing the behaviour of an algorithm to be confirmed. Applying the framework to a number of evolutionary algorithms shows that their expected behaviour can differ from their actual behaviour. Identifying such a behaviour mismatch is important for understanding how to further refine an algorithm, as well as how to apply it effectively to a problem.
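As a concrete illustration of the decomposition the abstract describes, the sketch below estimates bias², external variance, and internal variance under squared loss from predictions collected over multiple training sets, with multiple independent runs per set. This is a minimal sketch based on the standard law-of-total-variance split, not the paper's exact estimator; the function name and array shapes are illustrative assumptions.

```python
import numpy as np

def decompose_error(preds, y_true):
    """Split mean squared error into bias^2, external variance
    (variance due to the training sample), and internal variance
    (variance due to the algorithm's own randomness).

    preds  : array of shape (n_datasets, n_runs, n_points), where
             preds[d, r] holds the predictions of run r trained on dataset d.
    y_true : array of shape (n_points,) with the noise-free targets.
    """
    per_dataset_mean = preds.mean(axis=1)         # average over runs, shape (D, P)
    overall_mean = per_dataset_mean.mean(axis=0)  # average over datasets too, shape (P,)

    # Squared deviation of the overall average prediction from the target.
    bias_sq = ((overall_mean - y_true) ** 2).mean()
    # Spread of run-averaged predictions across training sets.
    external_var = ((per_dataset_mean - overall_mean) ** 2).mean()
    # Spread of individual runs around their own dataset's average.
    internal_var = ((preds - per_dataset_mean[:, None, :]) ** 2).mean()

    return bias_sq, external_var, internal_var

# Quick check on synthetic predictions: the three terms sum to the
# total mean squared error (plus irreducible noise, zero here).
rng = np.random.default_rng(0)
preds = rng.normal(size=(10, 30, 50))  # 10 training sets, 30 runs each, 50 test points
y = np.zeros(50)
b2, ev, iv = decompose_error(preds, y)
assert np.isclose(b2 + ev + iv, ((preds - y) ** 2).mean())
```

Because the three terms sum to the expected total error under squared loss, a mismatch between an algorithm's intended effect (for example, reducing internal variance) and the term that actually shrinks becomes directly visible.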




Updated: 2024-03-02