Automatic piecewise linear regression
Computational Statistics (IF 1.3), Pub Date: 2024-03-01, DOI: 10.1007/s00180-024-01475-4
Mathias von Ottenbreit, Riccardo De Bin

Regression modelling often presents a trade-off between predictiveness and interpretability. Highly predictive and popular tree-based algorithms such as Random Forest and boosted trees predict the outcome of new observations very well, but the effect of the predictors on the result is hard to interpret. Highly interpretable algorithms like linear effect-based boosting and MARS, on the other hand, are typically less predictive. Here we propose a novel regression algorithm, automatic piecewise linear regression (APLR), that combines the predictiveness of a boosting algorithm with the interpretability of a MARS model. In addition, as a boosting algorithm it automatically handles variable selection, and as a MARS-based approach it accounts for non-linear relationships and possible interaction terms. On simulated and real-data examples we show that APLR's predictive performance is comparable to that of the top-performing approaches, while offering an easy way to interpret the results. APLR has been implemented in C++ and wrapped in a Python package as a Scikit-learn compatible estimator.
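
As a rough illustration of the Scikit-learn compatible interface mentioned in the abstract, the sketch below shows how such an estimator could be used on a regression task with non-linear effects and interactions. The package name aplr and the class name APLRRegressor are assumptions not confirmed by this page; only the fit/predict convention follows from the stated Scikit-learn compatibility.

# Minimal usage sketch, assuming the Python wrapper is installable as "aplr"
# and exposes an APLRRegressor class (assumed names, not confirmed here).
from sklearn.datasets import make_friedman1
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

from aplr import APLRRegressor  # assumed import path

# Simulated data with non-linear effects and an interaction term (Friedman #1)
X, y = make_friedman1(n_samples=1000, n_features=10, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = APLRRegressor()          # default settings; tuning options not shown
model.fit(X_train, y_train)      # boosting handles variable selection internally
y_pred = model.predict(X_test)

print("Test MSE:", mean_squared_error(y_test, y_pred))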




Updated: 2024-03-03