Exploring the True Potential: Evaluating the Black-box Optimization Capability of Large Language Models,arXiv - CS - Neural and Evolutionary Computing

arXiv - CS - Neural and Evolutionary Computing, Pub Date: 2024-04-09, DOI: arxiv-2404.06290
Beichen Huang, Xingyu Wu, Yu Zhou, Jibin Wu, Liang Feng, Ran Cheng, Kay Chen Tan

Large language models (LLMs) have gained widespread popularity and demonstrated exceptional performance not only in natural language processing (NLP) tasks but also in non-linguistic domains. Their potential as artificial general intelligence extends beyond NLP, with promising capabilities shown in diverse optimization scenarios. Despite this rising trend, whether integrating LLMs into black-box optimization problems is genuinely beneficial remains unexplored. This paper addresses the question through a comprehensive evaluation covering both discrete and continuous optimization problems, assessing the efficacy and distinctive characteristics that LLMs bring to optimization. Our findings reveal both the limitations and advantages of LLMs in optimization. On one hand, despite the significant power consumed in running the model, LLMs exhibit subpar performance and lack desirable properties in pure numerical tasks, primarily due to a mismatch between the problem domain and their processing capabilities. On the other hand, although LLMs may not be ideal for traditional numerical optimization, their potential in broader optimization contexts remains promising: LLMs can solve problems in non-numerical domains and can leverage heuristics from the prompt to enhance their performance. To the best of our knowledge, this work presents the first systematic evaluation of LLMs for numerical optimization, offering a progressive, wide-coverage, and behavioral analysis. Our findings pave the way for a deeper understanding of LLMs' role in optimization and guide future applications of LLMs in diverse scenarios.

Updated: 2024-04-10
