Genetic Programming Based Automated Machine Learning in Classifying ESG Performances
IEEE Access (IF 3.9), Pub Date: 2024-04-25, DOI: 10.1109/access.2024.3393511
Abdullah Sani Abd Rahman 1 , Suraya Masrom 2 , Rahayu Abdul Rahman 3 , Roslina Ibrahim 4 , Abdul Rehman Gilal 5

AutoML offers significant benefits in solving real-life problems because it accelerates the development of machine learning models. In real-world settings such as analyzing companies’ environmental, social and governance (ESG) performance, where the dataset presents some challenges, AutoML is a promising way to address these complexities. Although researchers have shown significant interest in Genetic Programming (GP) in AutoML for handling complex datasets, a critical unresolved issue is a comprehensive understanding of the GP hyper-parameters that influence machine learning performance. While GP-based AutoML excels at automating many aspects of modelling, little research provides insight into the significance of individual features and GP population size within GP-based AutoML models. This paper presents a comprehensive evaluation of model performance from multiple facets, including feature selection, GP population sizes, and different machine learning algorithms. Furthermore, this study provides insights into the association between Pearson correlations, machine learning performance, and feature importance. The findings demonstrate that incorporating all the determinants as features in GP-based AutoML, or relying solely on firm characteristics, led to superior performance with an excellent trade-off between True Positive Rate and False Positive Rate. The proposed models thus achieve Area Under the Curve (AUC) values exceeding 0.9. The novelty of this study lies in its empirical evaluation of different approaches to GP-based AutoML implementation. These findings give business investors an alternative means of identifying companies with strong sustainability practices.
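
The abstract summarizes the trade-off between True Positive Rate and False Positive Rate with a single AUC figure. As a minimal sketch of how such a figure can be computed, the snippet below sweeps a decision threshold over classifier scores, collects (FPR, TPR) points, and integrates them with the trapezoid rule. It is not the authors' pipeline: the function names `roc_points` and `auc` are hypothetical, and the labels and scores are made-up toy values, not ESG data.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points obtained by lowering the decision threshold.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier scores, higher means "more likely positive".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score so each step admits more predictions as positive.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Handle tied scores together so a tie yields one point, not several.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (an assumption for illustration only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, scores)
print("ROC points:", curve)
print("AUC:", auc(curve))
```

An AUC of 1.0 means every positive example is scored above every negative one, and 0.5 corresponds to random ranking, which is why values above 0.9 are read as a strong TPR/FPR trade-off.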

Updated: 2024-04-25