Boosting diversity in regression ensembles
Statistical Analysis and Data Mining (IF 1.3), Pub Date: 2023-12-30, DOI: 10.1002/sam.11654
Mathias Bourel, Jairo Cugliari, Yannig Goude, Jean-Michel Poggi

Ensemble methods, such as Bagging, Boosting, or Random Forests, often improve on the prediction performance of single learners in both classification and regression tasks. In the context of regression, we propose a gradient boosting-based algorithm that incorporates a diversity term, with the aim of constructing different learners that enrich the ensemble, trading some individual optimality for a better overall ensemble. By verifying the hypotheses of Biau and Cadre's theorem (2021, Advances in contemporary statistics and econometrics: Festschrift in honour of Christine Thomas-Agnan, Springer), we obtain a convergence result ensuring that the associated optimization strategy reaches the global optimum. In the experiments, we consider base learners of increasing complexity: stumps, regression trees, Purely Random Forests, and Breiman's Random Forests. Finally, we use simulated and benchmark datasets, together with a real-world electricity demand dataset, to show through numerical experiments the suitability of our procedure, examining the behavior not only of the final or aggregated predictor but also of the whole generated sequence.
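
To make the idea of a diversity term in gradient boosting concrete, here is a minimal sketch. It is not the algorithm from the paper, whose exact loss is not given in the abstract. It assumes squared loss, one-dimensional inputs, stump base learners, and a hypothetical per-step objective 1/2 (r - h)^2 - kappa/2 (h - h_bar)^2, where r is the current residual and h_bar is the average of the learners fitted so far. Minimizing that objective pointwise gives the regression target (r - kappa * h_bar) / (1 - kappa), and kappa = 0 recovers plain least-squares gradient boosting. The names fit_stump, predict_stump, boost_with_diversity, and kappa are illustrative only.

```python
import numpy as np


def fit_stump(x, target):
    """Least-squares stump on a 1-D feature; returns (threshold, left_value, right_value)."""
    # Fallback when no split is possible: predict the mean on both sides.
    best = (np.inf, x.min(), target.mean(), target.mean())
    for thr in np.unique(x)[:-1]:  # skip the maximum so the right side is never empty
        left = x <= thr
        lv, rv = target[left].mean(), target[~left].mean()
        sse = ((target[left] - lv) ** 2).sum() + ((target[~left] - rv) ** 2).sum()
        if sse < best[0]:
            best = (sse, thr, lv, rv)
    return best[1:]


def predict_stump(stump, x):
    thr, lv, rv = stump
    return np.where(x <= thr, lv, rv)


def boost_with_diversity(x, y, n_rounds=50, lr=0.1, kappa=0.0):
    """Boosting with an ASSUMED diversity penalty (requires 0 <= kappa < 1).

    Returns the initial constant, the list of stumps, and the training MSE
    after each round, so the whole generated sequence can be inspected.
    """
    f0 = y.mean()
    F = np.full_like(y, f0, dtype=float)  # current ensemble prediction
    h_sum = np.zeros_like(F)              # running sum of base-learner predictions
    stumps, history = [], []
    for t in range(n_rounds):
        residual = y - F                  # negative gradient of the squared loss
        h_bar = h_sum / t if t > 0 else np.zeros_like(F)
        # Pointwise minimizer of the assumed penalized objective.
        target = (residual - kappa * h_bar) / (1.0 - kappa)
        stump = fit_stump(x, target)
        h = predict_stump(stump, x)
        F = F + lr * h
        h_sum += h
        stumps.append(stump)
        history.append(float(np.mean((y - F) ** 2)))
    return f0, stumps, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-3, 3, 200)
    y = np.sin(x) + 0.1 * rng.normal(size=200)
    for kappa in (0.0, 0.2):
        _, _, hist = boost_with_diversity(x, y, kappa=kappa)
        print(f"kappa={kappa}: final training MSE = {hist[-1]:.4f}")
```

Returning the per-round training error mirrors the abstract's emphasis on studying the whole generated sequence rather than only the final predictor. Under this assumed penalty, a larger kappa pushes each new stump away from the average of its predecessors.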

Updated: 2023-12-30