Genetic Programming and Reinforcement Learning on Learning Heuristics for Dynamic Scheduling: A Preliminary Comparison, IEEE Computational Intelligence Magazine



IEEE Computational Intelligence Magazine (IF 9), Pub Date: 2024-04-05, DOI: 10.1109/mci.2024.3363970
Meng Xu, Yi Mei, Fangfang Zhang, Mengjie Zhang


Scheduling heuristics are commonly used to solve dynamic scheduling problems in real-world applications. However, designing effective heuristics can be time-consuming and often leads to suboptimal performance. Genetic programming has been widely used to automatically learn scheduling heuristics. In recent years, reinforcement learning has also gained attention in this field. Understanding their strengths and weaknesses is crucial for developing effective scheduling heuristics. This paper takes a typical genetic programming method and a typical reinforcement learning method in dynamic flexible job shop scheduling for investigation. The results show that the investigated genetic programming algorithm outperforms the studied reinforcement learning method in the examined scenarios. Also, the study reveals that the compared reinforcement learning method is more stable as the amount of training data changes, and the investigated genetic programming method can learn more effective scheduling heuristics as training data increases. Additionally, the study highlights the potential and value of genetic programming in real-world applications due to its good generalization ability and interpretability. Based on the results, this paper suggests using the investigated reinforcement learning method when training data is limited and stable results are required, and using the investigated genetic programming method when training data is sufficient and high interpretability is required.
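To make the notion of a scheduling heuristic concrete, the following is a minimal sketch of how a priority rule can be written as an expression tree, which is the kind of structure genetic programming typically evolves. It is not the method studied in the paper. The feature names (`PT` for processing time, `WKR` for work remaining), the function set, the example queue, and the "evolved" rule are all illustrative assumptions.

```python
import operator

# Hypothetical function set; a real GP setup usually includes more operators.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, features):
    """Recursively evaluate an expression tree against one operation's features.

    A tree is either a feature name (str), a numeric constant,
    or a tuple of the form (operator_symbol, left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return features[tree]
    if isinstance(tree, (int, float)):
        return tree
    symbol, left, right = tree
    return FUNCTIONS[symbol](evaluate(left, features), evaluate(right, features))


def select_next(queue, tree):
    """Return the queued operation with the lowest priority value."""
    return min(queue, key=lambda op: evaluate(tree, op))


# Illustrative queue of waiting operations (values are made up).
queue = [
    {"id": "A", "PT": 5.0, "WKR": 2.0},
    {"id": "B", "PT": 3.0, "WKR": 9.0},
    {"id": "C", "PT": 4.0, "WKR": 1.0},
]

# Manually designed rule: shortest processing time first.
spt_rule = "PT"

# Hypothetical evolved rule: PT + 0.5 * WKR.
evolved_rule = ("+", "PT", ("*", 0.5, "WKR"))

print(select_next(queue, spt_rule)["id"])
print(select_next(queue, evolved_rule)["id"])
```

Because the rule is an explicit arithmetic expression over named features, it can be read and inspected directly, which is the sense in which the abstract attributes interpretability to genetic programming.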


Updated: 2024-04-05
