Benchmarking AutoML solutions for concrete strength prediction: Reliability, uncertainty, and dilemma
Construction and Building Materials (IF 7.4), Pub Date: 2024-03-19, DOI: 10.1016/j.conbuildmat.2024.135782
Mohammad Amin Hariri-Ardebili, Parsa Mahdavi, Farhad Pourkamali-Anaraki

Building precise machine learning and deep learning models has traditionally required a combination of mathematical skill and hands-on experience to carefully tune the hyperparameters that significantly affect the learning process. As datasets continue to grow across engineering domains, researchers increasingly turn to machine learning methods to uncover insights that classic regression techniques may miss. This surge in adoption raises concerns about the adequacy of the resulting meta-models and the interpretation of their findings. In response to these challenges, automated machine learning (AutoML) has emerged as a promising solution that aims to construct machine learning models with minimal intervention or guidance from human experts. This paper benchmarks AutoML solutions by providing an overview of their principles and applying them to predict the most important mechanical property of concrete, compressive strength, across different datasets. Nine datasets covering various concrete types, sample sizes, and feature sets are used. The benchmark dataset, from high-performance concrete, is discussed in detail, and the resulting best practices are applied to the other eight datasets. For each case, the importance of hyperparameter tuning is discussed, along with ensemble and stacking models. Tree-based models are employed for each dataset to develop SHAP plots, interpret results, and understand the contribution of each mix-design component to the overall strength of the concrete. The paper further explores three aspects of benchmarking AutoML in material science: (1) "reliability", by contrasting the benchmark dataset's error metric with literature collected over the past 20 years; (2) "uncertainty", by quantifying the variability in the mean and standard deviation of the error metric across datasets and its correlation with the sample-to-feature ratio; and (3) "dilemma", by discussing the shortcomings of AutoML on specific concrete datasets.
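
The "uncertainty" aspect described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the dataset labels, and every number below are hypothetical placeholders chosen only to show the arithmetic, and the error metric is assumed here to be RMSE from repeated runs. The sketch computes the mean and standard deviation of the error metric per dataset, the sample-to-feature ratio, and a Pearson correlation between that ratio and the error spread.

```python
import math
import statistics


def sample_to_feature_ratio(n_samples: int, n_features: int) -> float:
    """Number of samples available per input feature."""
    return n_samples / n_features


def summarize_errors(errors: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation of an error metric over repeated runs."""
    return statistics.mean(errors), statistics.stdev(errors)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical placeholder inputs, NOT values from the paper.
datasets = {
    "dataset_A": {"n_samples": 200, "n_features": 10, "rmse_runs": [6.0, 7.5, 9.0]},
    "dataset_B": {"n_samples": 600, "n_features": 8, "rmse_runs": [5.0, 5.5, 6.0]},
    "dataset_C": {"n_samples": 1200, "n_features": 8, "rmse_runs": [4.0, 4.2, 4.4]},
}

ratios, spreads = [], []
for name, d in datasets.items():
    ratio = sample_to_feature_ratio(d["n_samples"], d["n_features"])
    mean_rmse, std_rmse = summarize_errors(d["rmse_runs"])
    ratios.append(ratio)
    spreads.append(std_rmse)
    print(f"{name}: ratio={ratio:.1f}, mean RMSE={mean_rmse:.2f}, std RMSE={std_rmse:.2f}")

# Correlation between sample-to-feature ratio and the spread of the error metric.
print(f"Pearson r (ratio vs. std RMSE): {pearson(ratios, spreads):.3f}")
```

With real inputs, `rmse_runs` would hold the error values from repeated AutoML runs or cross-validation folds on each dataset; a rank-based correlation could be substituted if the relationship is not expected to be linear.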

Updated: 2024-03-19