Using shrinkage for data-driven automated valuation model specification – a case study from Berlin
Journal of Property Research, Pub Date: 2021-04-13, DOI: 10.1080/09599916.2021.1905690
Nils Hinrichs, Jens Kolbe, Axel Werwatz

ABSTRACT

We study whether data-driven AVM specification that combines a flexible-yet-simple regression model with shrinkage estimators considerably improves upon the prediction accuracy of a conventional hedonic model. A rolling window prediction comparison based on all condominium sales in Berlin, Germany, between 1996 and 2013 delivered the following results. The highly parameterised model can result in extreme errors if the flexible model, which employs roughly 3,800 variables, is estimated by OLS and even if shrinkage is applied via Ridge regression. Once the most extreme errors are disregarded, Ridge regression appears as the clear winner of the prediction comparison. It is the only procedure that delivers a considerable reduction in the root mean squared prediction error relative to a parsimonious benchmark model (estimated via OLS). Of the two procedures with variable selection capability, Elastic Net delivers a slightly better prediction performance. Lasso, on the other hand, acts considerably more as a selector and typically sets the bulk of the several thousand coefficients to zero. Both procedures largely agree in terms of which characteristics they frequently select: core characteristics of hedonic pricing such as floor space, building age and location dummies.
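The contrast the abstract draws between Ridge as a pure shrinker and Lasso and Elastic Net as selectors can be illustrated with the textbook closed-form rules that hold when the regressors are orthonormal. The sketch below is only an illustration under that simplifying assumption. It is not the authors' code, the paper's roughly 3,800 variables are certainly not orthonormal, and the coefficient values and penalty levels are hypothetical.

```python
from typing import List


def soft_threshold(b: float, lam: float) -> float:
    """Lasso rule for an orthonormal design: move b toward zero by lam,
    and set it exactly to zero if |b| <= lam."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def ridge_shrink(b: float, lam: float) -> float:
    """Ridge rule for an orthonormal design: proportional shrinkage,
    never exactly zero unless b itself is zero."""
    return b / (1.0 + lam)


def elastic_net_shrink(b: float, lam1: float, lam2: float) -> float:
    """(Naive) Elastic Net rule for an orthonormal design:
    soft-threshold with the L1 penalty, then shrink with the L2 penalty."""
    return soft_threshold(b, lam1) / (1.0 + lam2)


def count_nonzero(coefs: List[float]) -> int:
    """Number of coefficients a procedure keeps in the model."""
    return sum(1 for c in coefs if c != 0.0)


# Hypothetical OLS coefficients (illustrative values, not from the paper).
ols = [2.0, -0.5, 0.25, -1.5, 0.1]

# Hypothetical penalty levels, chosen only to make the contrast visible.
ridge = [ridge_shrink(b, 1.0) for b in ols]
lasso = [soft_threshold(b, 0.5) for b in ols]
enet = [elastic_net_shrink(b, 0.25, 0.5) for b in ols]

print("ridge:", ridge, "kept:", count_nonzero(ridge))
print("lasso:", lasso, "kept:", count_nonzero(lasso))
print("enet: ", enet, "kept:", count_nonzero(enet))
```

With these illustrative penalties, Ridge keeps every coefficient but makes each one smaller, Lasso zeroes out the small ones, and Elastic Net sits in between. This is the same qualitative pattern the abstract reports, although the actual selection behaviour in the study depends on the data and on the tuning parameters the authors chose.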




Updated: 2021-04-13