Journal of Property Research. Pub Date: 2022-05-24. DOI: 10.1080/09599916.2022.2070525
Anders Hjort, Johan Pensar, Ida Scheel, Dag Einar Sommervoll
ABSTRACT
Many banks and credit institutions are required to assess the value of dwellings in their mortgage portfolios. This valuation often relies on an Automated Valuation Model (AVM). These institutions often report the model's accuracy with two numbers: the fractions of predictions within […] and […] of the true values. Until recently, AVMs tended to be hedonic regression models, but machine learning approaches such as random forests and gradient boosted trees are increasingly applied. Both the traditional and the machine learning approaches rely on minimising mean squared prediction error rather than maximising the number of predictions within the […] and […] ranges. We investigate whether introducing a loss function closer to the AVM's actual loss measure improves performance in machine learning approaches, specifically for a gradient boosted tree approach. This loss function yields an improvement from […] to […] of predictions within […] of the true value on a data set of transactions from the Norwegian housing market between 2013 and 2015, with the biggest improvements coming from the lower price segments. We also find that a weighted average of models with different loss functions improves performance further, yielding […] of the observations within […] of the true value.
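The gap between the training objective and the reported accuracy measure can be illustrated with a small sketch. It is not the authors' loss function. The function names, the toy prices, and the 10% tolerance are all assumptions chosen for illustration, since the thresholds are not given in this copy of the abstract. The sketch computes the share of predictions within a relative tolerance of the true value, compares it with mean squared error, and forms a simple weighted average of two prediction sets.

```python
def fraction_within(y_true, y_pred, tolerance):
    """Share of predictions whose relative error is at most `tolerance`."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred)
               if abs(p - t) <= tolerance * abs(t))
    return hits / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between predictions and true values."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_average(pred_a, pred_b, weight_a):
    """Blend two prediction lists; `weight_a` is the weight on `pred_a`."""
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(pred_a, pred_b)]


# Toy data (hypothetical): four dwellings, all with a true price of 100.
prices = [100.0, 100.0, 100.0, 100.0]
model_a = [101.0, 99.0, 102.0, 160.0]  # three close predictions, one large miss
model_b = [112.0, 88.0, 112.0, 88.0]   # every prediction 12% off

TOLERANCE = 0.10  # assumed threshold, for illustration only

for name, preds in (("A", model_a), ("B", model_b)):
    print(name,
          "MSE:", mean_squared_error(prices, preds),
          "within tolerance:", fraction_within(prices, preds, TOLERANCE))

# An equal-weight blend of the two models, scored with the same measure.
blend = weighted_average(model_a, model_b, 0.5)
print("blend within tolerance:", fraction_within(prices, blend, TOLERANCE))
```

In this toy case model B has the lower mean squared error (144.0 against 901.5), while model A places more predictions within the tolerance (0.75 against 0.0), so the two criteria rank the models differently.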
Title (translated from Chinese): House price prediction with gradient boosted trees under different loss functions